[This lecture was partially created with the support of Artificial Intelligence tools (ChatGPT)]
1. Introduction
Deterministic models play a central role in environmental science by providing a structured and theory-driven way to estimate, predict, and understand environmental variables. At their core, deterministic models are based on the assumption that the behavior of a system can be fully described by a set of known relationships, typically expressed through mathematical equations derived from physical, chemical, or biological principles. Unlike stochastic approaches, which explicitly incorporate randomness and uncertainty, deterministic models produce a single, reproducible output for a given set of inputs and parameters. This characteristic makes them particularly valuable for exploring cause–effect relationships and for testing hypotheses about how environmental systems respond to changes in external conditions.
In practice, deterministic models are widely used to estimate variables such as air pollutant concentrations, water quality indicators, soil moisture, and ecosystem productivity. For example, atmospheric dispersion models simulate how pollutants emitted from a source are transported and diluted in the atmosphere based on wind speed, turbulence, and chemical reactions. Similarly, hydrological models estimate river discharge or groundwater levels by representing processes such as precipitation, infiltration, evapotranspiration, and runoff. These models are typically built upon conservation laws, such as the conservation of mass or energy, and are implemented through differential or algebraic equations that describe how environmental variables evolve over space and time.
One of the key strengths of deterministic models lies in their interpretability. Because they are grounded in established scientific laws, their structure provides insight into the mechanisms driving environmental processes. This makes them especially useful for scenario analysis, where researchers and policymakers assess the potential impacts of interventions such as emission reductions, land-use changes, or climate variability. By systematically adjusting input variables, deterministic models can help identify critical thresholds, sensitivities, and feedbacks within environmental systems.
In recent years, deterministic models have increasingly been integrated with data-driven and probabilistic approaches to improve their robustness and predictive power. Hybrid frameworks, for instance, combine physically based equations with statistical or machine learning components to better capture complex dynamics and reduce biases. Despite these advances, deterministic modeling remains a foundational tool in environmental analysis, providing a coherent framework for synthesizing scientific knowledge and supporting evidence-based decision making.
Understanding deterministic models involves not only learning the mathematical formulations and computational techniques but also developing a critical awareness of their assumptions, limitations, and appropriate domains of application. When used thoughtfully, deterministic models offer powerful insights into environmental systems and serve as an essential bridge between theoretical understanding and real-world problem solving.
Definition of deterministic model
A deterministic model produces the same output from a given starting condition or initial state, because no randomness is involved.
The output of a deterministic model may consist of a single variable or of multiple variables, which are unknown. Additionally, the model may include further unknown dynamic variables that are essential for describing the state of the system. These are called "state variables" and may themselves be unknown.
A deterministic model is typically given by a system of equations relating input variables, state variables, and output variables. The number of equations is at least equal to the number of unknowns. Equations may be physically based, chemically based, or derived from other theories governing the evolution of the process. Alternatively, equations may be empirical.
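As a minimal illustration, the following Python sketch represents a hypothetical linear reservoir: rainfall is the input variable, the stored water is the state variable, and the outflow is the output variable. All numerical values are assumed purely for the example; given the same inputs and initial state, the model always returns the same output.

    import numpy as np

    # Minimal sketch of a deterministic model: a linear reservoir,
    # dS/dt = P(t) - k * S, solved with an explicit Euler scheme.
    # The rainfall series and the parameter k are hypothetical.

    def linear_reservoir(precip, k=0.2, s0=0.0, dt=1.0):
        """Return storage S and outflow Q = k * S at each time step."""
        storage = np.empty(len(precip))
        outflow = np.empty(len(precip))
        s = s0
        for t, p in enumerate(precip):
            q = k * s                # output depends only on the current state
            s = s + dt * (p - q)     # deterministic state update (no randomness)
            storage[t], outflow[t] = s, q
        return storage, outflow

    precip = np.array([0.0, 5.0, 12.0, 3.0, 0.0, 0.0, 8.0, 1.0])  # input variable
    S, Q = linear_reservoir(precip)  # same inputs always give the same outputs
    print(Q)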
2. Added value of deterministic models
Deterministic models do not account for randomness and are therefore computationally more efficient than approaches that explicitly represent randomness and uncertainty. For this reason, deterministic models are often preferred for the representation of complex systems. Being based on a theory-driven description of the system, deterministic models are transparent and appear rigorous in their representation.
Furthermore, being based on a rigorous representation of the system, deterministic models can in principle be applied without calibration and therefore offer a way to overcome the lack of observed data, which is instead essential for data-driven approaches. For instance, in the absence of observed river flows, these can be reconstructed by routing rainfall and other meteorological variables through a rainfall-runoff model.
3. Limitations of deterministic models
Deterministic models produce a single output from a given set of inputs based on fixed equations. Their applicability is constrained by several important limitations. First, they rely heavily on the completeness and accuracy of the underlying process representation. Environmental systems are inherently complex, often involving nonlinear interactions, spatial heterogeneity, and processes that are difficult to measure directly and only partially understood. Thus, deterministic formulations inevitably simplify or omit processes, leading to structural model error. Second, input data uncertainty propagates directly into outputs. Measurements of environmental variables (e.g., emissions, boundary conditions, land use) are often sparse, noisy, or biased, yet deterministic models typically do not explicitly account for this uncertainty, giving a false sense of precision.
As a result, deterministic models require simplifying assumptions and parameterizations, which can introduce uncertainty and limit their accuracy.
Third, parameter estimation poses a critical challenge. Many environmental models include parameters that cannot be directly measured and must be calibrated, often resulting in equifinality—different parameter sets producing similar outputs—thereby reducing interpretability and predictive robustness. Model calibration and validation are therefore essential steps, involving the adjustment of model parameters to match observed data and the evaluation of model performance against independent datasets. Even though deterministic models do not explicitly include randomness, uncertainty still arises from imperfect knowledge of inputs, parameters, and system structure.
Fourth, deterministic models struggle to capture stochastic variability and extreme events, which are intrinsic to environmental systems (e.g., sudden storms, ecological disturbances). This limits their reliability, particularly for risk assessment and decision-making under uncertainty.
Fifth, scale mismatches (temporal and spatial) between model structure and real-world processes can introduce systematic biases. Finally, deterministic models can be computationally expensive when representing high-resolution systems, restricting their use in large-scale or real-time applications. For these reasons, probabilistic, stochastic, or hybrid modeling approaches are often preferred to complement deterministic frameworks and better represent uncertainty and variability.
Box: non-linearity and chaos
Nonlinearity refers to systems in which outputs are not proportional to inputs, meaning small changes can produce disproportionately large or complex effects. In environmental and physical systems, this often arises from feedback loops, thresholds, and interactions among multiple components. Because of nonlinearity, such systems cannot be accurately described by simple additive relationships, and their behavior may change qualitatively across different conditions.
Chaos is a property of certain nonlinear dynamical systems characterized by extreme sensitivity to initial conditions, often summarized as the “butterfly effect.” Even infinitesimally small differences in starting states can lead to vastly different trajectories over time, making long-term prediction inherently difficult despite the system being deterministic. Chaotic systems are not random, but they appear irregular and unpredictable due to this sensitivity. Together, nonlinearity and chaos explain why many natural systems—such as climate, ecosystems, and fluid dynamics—exhibit complex, evolving patterns that challenge precise prediction.
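A classical illustration of chaos in a deterministic system is the logistic map. The short Python sketch below, with parameter values chosen only for illustration, shows how two trajectories starting from almost identical initial states diverge after a few iterations, even though no randomness is involved.

    # Sensitivity to initial conditions in a deterministic system:
    # the logistic map x_{n+1} = r * x_n * (1 - x_n) with r = 4 is chaotic.
    # The two trajectories below start from almost identical states.

    def logistic_trajectory(x0, r=4.0, steps=30):
        xs = [x0]
        for _ in range(steps):
            xs.append(r * xs[-1] * (1.0 - xs[-1]))
        return xs

    a = logistic_trajectory(0.200000)
    b = logistic_trajectory(0.200001)   # initial state perturbed by 1e-6

    for n in (0, 10, 20, 30):
        print(n, round(a[n], 4), round(b[n], 4))  # trajectories diverge after a few steps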
4. A first meaningful example: climate models
Climate models (see here for more details) and meteorological models are meaningful examples of deterministic models. They describe the dynamics of climate and meteorological processes through deterministic relationships, typically relying on conservation equations applied at the grid scale (see Figure 1 below). When fed with proper initial and boundary conditions, and with assigned scenarios of future emissions in the case of climate models, they deliver projections of future climate and meteorological predictions.

Figure 1. Grid scale representation of climate models. Source: https://www.nccs.nasa.gov/services/climate-data-services.
In a single run of a climate model there is no random component (see below for how randomness and uncertainty can be accounted for in deterministic simulations) and therefore no estimation of uncertainty.
The resolution of the grid of climate models has progressively increased over time, in order to refine the representation of the involved processes (Figure 2). Today, climate and meteorological models are computationally very intensive. Their demand for computational resources limits the possibility of running large numbers of simulations and is therefore the reason why these models do not account for randomness.

Figure 2. The grid resolution of climate models has increased over time. Source: IPCC.
5. A second meaningful example: the bucket model
The “bucket model” is a simple conceptual framework often used in environmental science, hydrology, and systems thinking to describe how inputs, outputs, and storage interact within a system. Imagine a bucket representing a system—such as a watershed, a soil column, or even the human body. Water (or another resource) flows into the bucket through inputs like rainfall, and leaves through outputs such as evaporation, runoff, or leakage. The level of water in the bucket at any given time represents the system’s storage.
In this model, the key idea is balance: if inputs exceed outputs, the water level rises; if outputs exceed inputs, the level falls. When the bucket becomes full, any additional input leads to overflow, symbolizing processes like flooding or excess runoff. Conversely, if inputs are too low, the bucket may empty, representing drought or depletion. The model helps visualize how systems respond dynamically to changing conditions over time.
The bucket model is especially useful because it simplifies complex processes into an intuitive analogy. For example, in hydrology, precipitation fills the “bucket” (soil or reservoir), while infiltration, evaporation, and streamflow act as outputs. In physiology, it can represent fluid balance in the body. Despite its simplicity, the model highlights important concepts such as equilibrium, capacity limits, and feedback mechanisms.
However, it is also important to recognize its limitations. Real-world systems are often more complex than a single bucket: they may involve multiple interconnected “buckets,” delayed flows, or nonlinear relationships. Still, as an introductory tool, the bucket model provides a clear and accessible way to understand how systems store and transfer resources over time.

Figure 3. Bucket model. Source: The Applied Science Lab at the University of Idaho.
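The following Python sketch implements a minimal single-bucket model. Capacity, evaporation rate, and the rainfall series are hypothetical values used only to illustrate the balance between inputs, outputs, storage, and overflow.

    # Minimal sketch of a bucket model: one storage with a capacity limit.
    # The rainfall series, capacity and evaporation rate are hypothetical.

    def bucket_model(rain, capacity=100.0, evap=2.0, s0=50.0):
        """Track storage and overflow of a single bucket over time."""
        storage, overflow = [], []
        s = s0
        for p in rain:
            s = s + p - evap                # inputs fill the bucket, outputs drain it
            s = max(s, 0.0)                 # the bucket cannot hold less than zero
            spill = max(s - capacity, 0.0)  # anything above capacity overflows
            s = s - spill
            storage.append(s)
            overflow.append(spill)
        return storage, overflow

    rain = [0, 30, 60, 10, 0, 0, 5]
    S, O = bucket_model(rain)
    print("storage:", S)
    print("overflow:", O)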
6. Structure of deterministic models
Deterministic models can be distinguished into two main categories:
- Process-based models;
- Empirical models, or black-box models.
A process-based model describes a system by explicitly representing the mechanisms that drive it. It uses mathematical equations based on scientific theory to simulate how components interact over time. The model attempts to replicate real-world processes rather than only fitting observed data. This approach allows researchers to explore scenarios and understand system dynamics. However, such models often require detailed knowledge, both in terms of understanding of the underlying processes and of description of the system at the microscale.
An empirical model is built by analyzing experimental or observational data to identify relationships between variables. Instead of representing underlying processes, it focuses on empirical relationships in the data. Methods such as regression, machine learning, or statistical fitting are commonly used. These models can provide accurate predictions for known conditions. Their limitation is that they may lack explanatory power about why the system behaves as it does.
Notably, an interesting grey zone exists between process-based and empirical models. An example is given by physics-informed statistical representations. These combine physical knowledge with statistical methods to model complex systems. They incorporate known physical laws, constraints, or equations into statistical modeling frameworks. This helps ensure that predictions remain consistent with established scientific principles. Such models can improve accuracy when data are limited or noisy by guiding learning with physics. They are increasingly used in fields like environmental science, engineering, and climate modeling.
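As a minimal example of an empirical (black-box) model, the Python sketch below fits a linear regression to synthetic data. The variables and the generating relation are hypothetical and serve only to show that the fitted relation describes the data without representing the underlying process.

    import numpy as np

    # Minimal sketch of an empirical (black-box) model: a linear regression
    # fitted to synthetic data, with no representation of the underlying process.

    rng = np.random.default_rng(0)
    temperature = rng.uniform(5, 30, size=50)                   # hypothetical predictor
    evaporation = 0.3 * temperature + rng.normal(0, 0.5, 50)    # synthetic observations

    slope, intercept = np.polyfit(temperature, evaporation, deg=1)
    print(f"fitted relation: E = {slope:.2f} * T + {intercept:.2f}")

    # The fitted relation may predict well within the observed range,
    # but it says nothing about why evaporation depends on temperature.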
7. An interesting option: machine learning
Machine learning models are computational methods that allow computers to learn patterns and relationships from data without being explicitly programmed for every task. They are a central part of the field of Machine Learning, which focuses on developing algorithms that improve their performance through experience.
These models analyze datasets to identify structures, trends, and correlations that can be used for prediction or decision-making.
Instead of relying solely on predefined equations, they adapt their internal parameters based on training data. Machine learning models are commonly trained using large datasets to capture complex relationships between input variables and outputs. Once trained, the model can make predictions or classify new, unseen data. There are several main types of machine learning approaches, including supervised learning, unsupervised learning, and reinforcement learning.
In supervised learning, models are trained with labeled examples, where the correct output is already known. In unsupervised learning, the model tries to discover hidden patterns or groupings within unlabeled data. Reinforcement learning focuses on learning through interaction with an environment, using rewards and penalties to guide improvement.
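The following Python sketch, assuming the scikit-learn library is available, contrasts the two most common settings: a supervised regression fitted on labeled examples and an unsupervised clustering that discovers groups in unlabeled data. The data are synthetic and purely illustrative.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.cluster import KMeans

    # Supervised learning: labeled examples (X, y) are used to fit a predictor.
    X = np.array([[1.0], [2.0], [3.0], [4.0]])
    y = np.array([2.1, 3.9, 6.2, 8.1])
    model = LinearRegression().fit(X, y)
    print(model.predict([[5.0]]))        # prediction for a new, unseen input

    # Unsupervised learning: only inputs are given, and structure is discovered.
    data = np.array([[0.1, 0.2], [0.2, 0.1], [5.0, 5.1], [5.2, 4.9]])
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)
    print(clusters)                      # two groups found without any labels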
Machine learning models can take many forms, such as decision trees, neural networks, and support vector machines.
They are widely used in applications like image recognition, recommendation systems, and predictive analytics.
The performance of these models depends on factors such as data quality, model complexity, and training methods.
Proper evaluation and validation are important to ensure that models generalize well to new data. As computing power and data availability increase, machine learning models continue to play a growing role in science, technology, and industry.
7.1 A meaningful example of machine learning model: Random Forest
A Random Forest is a machine learning algorithm that can be used for both classification and regression tasks. It is part of the ensemble learning family, where multiple models are combined to improve overall predictive performance.
The algorithm performs classification and regression through decision trees, namely, decision structures that use a tree-like model of decisions and their possible consequences, including random outcomes, resource costs, and utility. Decision trees are frequently used in operations research to help identify the strategy most likely to reach a goal. They are also a popular tool in machine learning. For an example, see Figure 4. In this case, splits in the tree are in the form of yes/no. In general, splits may be regulated by probabilities, such as 30% yes and 70% no, and a cost/utility may be associated with given steps of the decision.

Figure 4. Example of a decision tree. By Stefan Lew, based on work of WolfgangW, public domain, via Wikimedia Commons.
Instead of relying on a single decision tree, a Random Forest constructs a large number of decision trees during the training phase. Each tree is built using a different random subset of the training data, a process known as bootstrapping or bagging. Additionally, at each split in a tree, the algorithm considers only a random subset of the available features to determine the best split.
This randomness introduces diversity among the trees and helps reduce correlation between them. As a result, the model becomes more robust and less prone to overfitting compared to a single decision tree. When predictions are made, each tree in the forest produces its own output independently. For classification problems, the final prediction is determined by majority voting among all trees. For regression problems, the predictions from all trees are averaged to obtain the final result.
Random Forest models are known for their strong predictive accuracy and their ability to handle large datasets with many features. They are also relatively robust to noise and missing data in the training set. Another advantage of Random Forests is that they can estimate feature importance, allowing researchers to understand which variables contribute most to the model’s predictions. Due to these strengths, Random Forests are widely applied in fields such as healthcare, finance, environmental science, and bioinformatics. Overall, the algorithm provides a powerful and flexible approach for building reliable predictive models while maintaining good generalization performance on unseen data.
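As a minimal illustration, the Python sketch below, assuming the scikit-learn library is available, trains a Random Forest regressor on synthetic data and extracts the estimated feature importances. Variable names and the data-generating relation are hypothetical.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Minimal sketch: Random Forest regression on synthetic data.
    # Feature names and the generating relation are hypothetical.

    rng = np.random.default_rng(42)
    n = 200
    rainfall = rng.uniform(0, 50, n)
    temperature = rng.uniform(5, 30, n)
    slope = rng.uniform(0, 20, n)
    runoff = 0.6 * rainfall - 0.2 * temperature + rng.normal(0, 1.0, n)  # synthetic target

    X = np.column_stack([rainfall, temperature, slope])
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, runoff)

    print(rf.predict([[20.0, 15.0, 10.0]]))   # prediction for new inputs
    print(rf.feature_importances_)            # relative importance of the three features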
In the context of environmental sciences, Random Forest models are frequently used for predicting unknown variables depending on known inputs. A schematic of the random forest algorithm is given in Figure 5.

Figure 5. Schematic of a random forest algorithm. By TseKiChun, CC BY-SA 4.0, via Wikimedia Commons.
To be completed.
Last revised on March 31st, 2026