Craig Bishop: The Secrets of Model/Background Error Covariance Revealed

This is my first content post for the satellite data assimilation summer school, and it covers the first big insights I've gotten by being here. These come courtesy of Dr. Craig Bishop of the Naval Research Lab in Monterey, California. In my previous post, I talked about the difficulty I had envisioning model error as a random process with zero mean and some specified covariance \mathbf{B}. Craig gave a pair of illuminating lectures on this topic.

His first point was that we can actually envision model error using basic probability notions. Imagine an infinite collection of earths, each of which has a weather forecast office with a forecast model. Each of these earths has a true state, a forecast, and an observational history. We can collect all of the earths that share the same current true state and examine the distribution of their different forecasts. The resulting density, which we can denote p\left(x^f|x^t\right), gives the “fixed truth” error distribution. The important idea here is that the error is being described statistically, and it’s a mistake to put too much physical emphasis on individual correlations.
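In this picture, the background error covariance is just the second moment of that fixed-truth error distribution. Writing it out (the equation is my own notation, not one of Craig’s slides):

\mathbf{B} = E\left[\left(x^f - x^t\right)\left(x^f - x^t\right)^T \,\middle|\, x^t\right]

That is, \mathbf{B} is the expected outer product of the forecast error, taken over all of the imagined earths that share the same true state.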

Given this basic statistical description of model error, the practical issue becomes actually estimating the “true” model error covariance. We discussed a few approaches. One classic method was proposed by another speaker at the summer school, John Derber, along with his coauthor Parrish, in a paper from the early 1990s. The basic idea is to make pairs of forecasts, one for 48 hours and the other for 24 hours, but both valid at the same end time. By taking differences and averaging over all of the pairs, we get a sense of the variability in the model when it starts from different initial conditions and integrates for two different lengths of time. This is usually called the “NMC method”. Another method is to use perturbed initial conditions to generate an ensemble of forecasts, and then calculate the sample covariance of these different forecasts, much as you would in an ensemble data assimilation method. Averaged over a long history of cases, both approaches yield a static background error covariance, one that doesn’t change with time.
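To make the bookkeeping concrete, here is a minimal NumPy sketch of both estimators. The function names and array shapes are my own, and real NMC implementations typically rescale the result, since 48 h minus 24 h forecast differences tend to underestimate the actual background error variance:

```python
import numpy as np

def nmc_background_covariance(forecasts_48h, forecasts_24h):
    """NMC method: covariance of 48 h minus 24 h forecast differences.

    Both arrays have shape (n_pairs, n_state); row k holds the two
    forecasts valid at the same end time.
    """
    diffs = forecasts_48h - forecasts_24h      # same valid time, different lead
    diffs = diffs - diffs.mean(axis=0)         # center so we get a covariance
    return diffs.T @ diffs / (diffs.shape[0] - 1)

def ensemble_background_covariance(ensemble):
    """Sample covariance of an ensemble of forecasts, shape (n_members, n_state)."""
    perts = ensemble - ensemble.mean(axis=0)   # deviations from the ensemble mean
    return perts.T @ perts / (ensemble.shape[0] - 1)
```

Averaged over a long enough history of forecast pairs, either estimate settles into the static \mathbf{B} described above.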

In his second talk, Craig highlighted some of the issues with using flow-dependent error covariances from ensemble methods. The raw sample error covariance can contain noisy features far from the physical location of interest, which are understood to be spurious artifacts of the finite ensemble size. Localization is the name for the family of methods that do away with these unwanted features, and Craig spent the remainder of his talk explaining how different people do this, and then presenting a newer method that he’s been working on. To me, localization is one of the unpleasant things about ensemble methods. It’s a fix for a shortcoming of the method, and there isn’t a really satisfying way to do it.
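Craig’s newer method is beyond what I can reproduce here, but the standard trick he surveyed is Schur-product localization: multiply the sample covariance, element by element, by a compactly supported correlation function so that covariances taper to zero with distance. Here’s a minimal sketch on a one-dimensional grid using the common Gaspari-Cohn taper (the function names and the grid are my own illustration):

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Gaspari-Cohn (1999) fifth-order taper; identically zero beyond 2*c."""
    z = np.abs(dist) / c
    taper = np.zeros_like(z, dtype=float)
    inner = z <= 1.0
    outer = (z > 1.0) & (z < 2.0)
    zi, zo = z[inner], z[outer]
    taper[inner] = (-0.25 * zi**5 + 0.5 * zi**4 + 0.625 * zi**3
                    - (5.0 / 3.0) * zi**2 + 1.0)
    taper[outer] = (zo**5 / 12.0 - 0.5 * zo**4 + 0.625 * zo**3
                    + (5.0 / 3.0) * zo**2 - 5.0 * zo + 4.0 - 2.0 / (3.0 * zo))
    return taper

def localize(sample_cov, grid_positions, c):
    """Schur (elementwise) product of the sample covariance with the taper."""
    dist = np.abs(grid_positions[:, None] - grid_positions[None, :])
    return sample_cov * gaspari_cohn(dist, c)
```

The taper cuts everything off beyond a distance of 2c, and the choice of c is essentially a tuning knob, which is exactly the kind of ad hoc fix that leaves me unsatisfied.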

The real contribution of these talks, for me, was a much more intuitive sense of how to think about background error covariances. The frequentist picture I described above is a really useful interpretation. Probably this isn’t a sticking point for anyone else, but it really was for me.
