What is Data Assimilation?

My new postdoctoral job involves investigating ways to use observations from a special class of instruments to make better numerical weather forecasts. This work sits inside the field called “data assimilation” (DA), and I thought a good first post on my postdoctoral work would be a no-frills discussion of what DA is, from a broad perspective.

In meteorology (and many other scientific and engineering disciplines), forecasts are needed on a regular basis, and producing them requires numerical models. The real atmosphere is complicated, with processes occurring across a huge range of spatial and temporal scales, so it is very hard to model well, especially given the limits on computational resources. One way to improve the quality of a numerical weather forecast, as judged against real observations, is to use observations to choose better initial conditions, boundary conditions, or model parameters. The methods for doing this belong to the field called data assimilation.

In meteorology, observations are operationally used to better constrain the model forecast by adjusting the model output at a particular time (called the analysis time), and then using that adjusted output as the initial condition for the next forecast. The basic idea behind the adjustment is the same as the most basic model covered in every elementary statistics course: the least-squares regression line. Of course, it’s much harder computationally and algorithmically, because an operational weather prediction model has millions of variables (instead of two), but the mathematics is very similar. Essentially, if we assume our model and our observations both have normally distributed errors, then the analysis state also has normally distributed errors, with the minimum variance over all possible estimates, so that it is “optimal”.
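To make that concrete, here is a minimal one-variable sketch in Python. All the numbers are made up for illustration; the point is just that the analysis is a precision-weighted average of the background and the observation, exactly as in weighted least squares:

```python
# Toy one-variable example (hypothetical numbers): combine a model
# background x_b with an observation y of the same quantity, weighting
# each by the inverse of its error variance.
x_b = 10.0  # background (model) estimate
y = 12.0    # observation
B = 1.0     # background error variance
R = 0.5     # observation error variance

# The minimum-variance ("optimal") analysis is a precision-weighted mean:
K = B / (B + R)            # gain: how much to trust the observation
x_a = x_b + K * (y - x_b)  # analysis

# The analysis error variance is smaller than both B and R.
P_a = (1 - K) * B
print(f"analysis = {x_a:.3f}, analysis error variance = {P_a:.3f}")
# analysis = 11.333, analysis error variance = 0.333
```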

For the more mathematically inclined, the simplest case of variational data assimilation used in meteorology is called three-dimensional variational (3DVar) data assimilation, and it amounts to minimizing

$J(x) = (x-x_b)^\mathsf{T}B^{-1}(x-x_b) + (y-h(x))^\mathsf{T}R^{-1}(y-h(x))$

over all possible vectors $x$, where $x_b$ is the “background”, “first guess”, or model output, and $y$ denotes the observations. The matrix $B$ is the covariance matrix associated with the Gaussian error distribution of the model forecast $x_b$, and $R$ is the covariance matrix associated with the observational errors. Note that we assume the model and observation errors are independent. Lastly, the observations and the model state might not lie in the same space; they are connected by the operator $h$, which maps model output into the space of observations, so that we can calculate the departures.
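For a concrete picture of the minimization, below is a small Python sketch using NumPy and SciPy. The three-variable state, the linear observation operator, and all the covariances are invented for illustration; operational systems minimize the same functional, but with far more sophisticated machinery:

```python
import numpy as np
from scipy.optimize import minimize

# Made-up 3-variable state, with a linear observation operator
# h(x) = H x that observes only the first two components.
x_b = np.array([1.0, 2.0, 3.0])      # background ("first guess")
B = np.diag([1.0, 1.0, 1.0])         # background error covariance
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])      # observation operator
y = np.array([1.5, 1.8])             # observations
R = np.diag([0.25, 0.25])            # observation error covariance

B_inv, R_inv = np.linalg.inv(B), np.linalg.inv(R)

def J(x):
    """The 3DVar cost function from the text."""
    db = x - x_b      # departure from the background
    do = y - H @ x    # departure from the observations
    return db @ B_inv @ db + do @ R_inv @ do

# Minimize J over all states x, starting from the background.
res = minimize(J, x_b)
print("analysis:", res.x)  # approximately [1.4, 1.84, 3.0]
```

Notice that the unobserved third component stays at its background value. In this linear-Gaussian case the minimizer also has the closed form $x_a = x_b + BH^\mathsf{T}(HBH^\mathsf{T}+R)^{-1}(y-Hx_b)$; iterative minimization is what you do when $h$ is nonlinear or the matrices are too large to invert directly.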

After staring at the functional $J$ above for a while, you should see that it penalizes the misfit of the candidate state with the background and with the observations, each weighted by the inverse of the corresponding covariance matrix, which we can think of as a confidence. In a later post, I’ll talk about other DA frameworks, and which questions seem interesting to me personally.
