Overview of NCEP's Data Assimilation System
                  ------------------------------------------


    The climate data assimilation system (CDAS) used for Reanalysis is
very similar to the global data assimilation system (GDAS) that is
run operationally.  They are both run on 6-hour cycles.



    |                     |
    |-------------------->|                             6 hour forecast is made
    |                     |                             from 00Z analysis. This
   00Z                   06Z                            becomes the first guess.
 analysis              forecast
                          |
                          |
            03Z obs       |       09Z obs               Data from the 6 hour window
                \         |         /                   (i.e., 03Z-09Z) is used.
                 \        |        /                    These are the 'obs'.
                  \       |       /                     Quality control will 
                   \      |      /                      reject some of the data.
                    \     |     /
                     \    |    /
                      \   |   /
                       \  |  /
                        \ | /                           Assimilation sys. takes
                         \|/                            1st guess and the obs. and
                          |                             produces a 06Z analysis.
                          |                             Within the 6 hr window,
                          |                             all obs. are used.
                          |
                          V

                          |                     |
                          |-------------------->|       6 hour forecast is made
                          |                     |       (cycle is repeated)
                         06Z                   12Z
                       analysis              forecast

                                                |
                                                |
                                  09Z obs       |       15Z obs
                                      \         |         /
                                       \        |        /
                                        \       |       /
                                         \      |      /
                                          \     |     /
                                           \    |    /
                                            \   |   /
                                             \  |  /
                                              \ | /
                                               \|/
                                                |
                                                |
                                                V

                                                |                     |
                                                |-------------------->|
                                                |                     |
                                               12Z                   18Z
                                             analysis              forecast
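
The timing in the diagram can be sketched in a few lines of Python.  This is
an illustration only, not CDAS or GDAS code: the function name cycle_schedule
and the dictionary keys are made up for this example, and the starting date is
arbitrary.  It assumes, as in the diagram, a 6 hour cycle with an observation
window of 3 hours on either side of each analysis time.

```python
from datetime import datetime, timedelta

CYCLE = timedelta(hours=6)        # length of one assimilation cycle
HALF_WINDOW = timedelta(hours=3)  # obs window is +/- 3 hours around the analysis


def cycle_schedule(first_analysis, n_cycles):
    """List the timing of each cycle that follows `first_analysis`.

    Hypothetical helper for illustration; the names are not from CDAS/GDAS.
    """
    schedule = []
    for i in range(1, n_cycles + 1):
        analysis_time = first_analysis + i * CYCLE
        schedule.append({
            # previous analysis, from which the 6 hour forecast is started
            "guess_from": analysis_time - CYCLE,
            # valid time of the first guess and of the new analysis
            "analysis": analysis_time,
            # observations inside this window are candidates for the analysis
            "obs_start": analysis_time - HALF_WINDOW,
            "obs_end": analysis_time + HALF_WINDOW,
        })
    return schedule


if __name__ == "__main__":
    # Arbitrary start date; only the hours matter here.
    for c in cycle_schedule(datetime(1994, 9, 8, 0), 3):
        print(f"{c['guess_from']:%HZ} analysis -> {c['analysis']:%HZ} first guess, "
              f"obs {c['obs_start']:%HZ}-{c['obs_end']:%HZ} "
              f"-> {c['analysis']:%HZ} analysis")
```

Starting from a 00Z analysis, the three cycles listed are the 06Z, 12Z and 18Z
analyses, with observation windows 03Z-09Z, 09Z-15Z and 15Z-21Z.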





To make an analysis, a 6 hour forecast and observations are merged together.
Such a procedure is necessary because the number of degrees of freedom in the
atmosphere is much greater than the number of observations made, so the
system is underdetermined.  The merging consists of finding an atmospheric
state (A) that is closest to both the first guess and the observations.
Symbolically, you want to find the A that minimizes

        W1((first_guess - A)^2) + W2((observations - A)^2)

where W1 and W2 are weighting functions.  The above equation is symbolic, and 
in matrix notation would be written as

       J = (F - A)^T W1 (F - A) + (O - L A)^T W2 (O - L A)

where
     A is the column vector of the atmospheric state
     F is the column vector of the first guess (6 hour forecast)
     W1 and W2 are square matrices describing the weights
     O is a column vector of the observations
     L is the matrix that converts the atmospheric state vector to a
       synthetic observations vector
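
As a worked illustration of the minimization, consider a toy problem small
enough to solve directly: a state A with two elements and a single
observation.  Assuming W1 and W2 are symmetric, setting the derivative of J
with respect to A to zero gives (W1 + L^T W2 L) A = W1 F + L^T W2 O.  In the
scalar case with L = 1 this reduces to the weighted average
A = (W1 F + W2 O) / (W1 + W2).  The sketch below is not the operational
solver; the function names and all numbers are made up for the example.

```python
def analysis_2x1(F, O, W1, W2, L):
    """Minimize J = (F-A)^T W1 (F-A) + (O-LA)^T W2 (O-LA) for a toy problem.

    F  : first guess, list of 2 floats
    O  : the single observation, float
    W1 : symmetric 2x2 weight matrix, list of lists
    W2 : weight of the observation, float (a 1x1 matrix)
    L  : 1x2 matrix mapping the state to a synthetic observation
    """
    # Normal equations from dJ/dA = 0:  (W1 + L^T W2 L) A = W1 F + L^T W2 O
    M = [[W1[i][j] + L[i] * W2 * L[j] for j in range(2)] for i in range(2)]
    b = [W1[i][0] * F[0] + W1[i][1] * F[1] + L[i] * W2 * O for i in range(2)]
    # Solve the 2x2 system with Cramer's rule.
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]


def cost(A, F, O, W1, W2, L):
    """Evaluate J for the same toy problem."""
    d = [F[0] - A[0], F[1] - A[1]]            # first guess minus state
    r = O - (L[0] * A[0] + L[1] * A[1])       # observation minus synthetic obs
    jb = sum(d[i] * W1[i][j] * d[j] for i in range(2) for j in range(2))
    return jb + r * W2 * r


if __name__ == "__main__":
    F = [10.0, 20.0]                  # made-up first guess
    L = [0.5, 0.5]                    # the observation sees the mean of the state
    O = 17.0                          # made-up observation
    W1 = [[1.0, 0.0], [0.0, 1.0]]     # made-up weights
    W2 = 4.0
    A = analysis_2x1(F, O, W1, W2, L)
    print("analysis:", A)
    print("J at analysis:   ", cost(A, F, O, W1, W2, L))
    print("J at first guess:", cost(F, F, O, W1, W2, L))
```

With these numbers the first guess implies an observed value of 15 against an
observation of 17.  The analysis raises both state elements by 4/3, so the
synthetic observation becomes about 16.33, between the guess and the
observation as the weights dictate.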


--------------------------------------------------------------------------

The following is by John Derber.  It applies to the operational
system.  The differences are 1) Reanalysis is run at T62, and 2) Reanalysis
uses neither interactive retrievals nor SSM/I winds.

--------------------------------------------------------------------------



SSI analysis system documentation as of Sep. 8, 1994

Documentation
The initial version of the SSI analysis system is presented in
Parrish and Derber (1992).  Some initial updates are contained in
Derber et al. (1991) and a more complete description of the latest
version is being prepared by Rizvi and Parrish (1994).

Numerical/Computational Properties

Horizontal Representation
The analysis variables are defined spectrally.  For comparison to
the observations, the variables are transformed to a Gaussian grid
and then linearly interpolated to the observation location.
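
The interpolation step can be sketched as follows.  This assumes that
"linearly interpolated" means bilinear interpolation within the grid box
surrounding the observation; the text does not give more detail, and the
function name and arguments are made up for the example.

```python
def bilinear(lon, lat, lon0, lon1, lat0, lat1, f00, f10, f01, f11):
    """Interpolate a gridded field to the point (lon, lat).

    The point must lie inside the box [lon0, lon1] x [lat0, lat1].
    f00 is the grid value at (lon0, lat0), f10 at (lon1, lat0),
    f01 at (lon0, lat1) and f11 at (lon1, lat1).
    """
    tx = (lon - lon0) / (lon1 - lon0)     # fractional position in longitude
    ty = (lat - lat0) / (lat1 - lat0)     # fractional position in latitude
    south = (1.0 - tx) * f00 + tx * f10   # interpolate along the lat0 row
    north = (1.0 - tx) * f01 + tx * f11   # interpolate along the lat1 row
    return (1.0 - ty) * south + ty * north


if __name__ == "__main__":
    # Made-up corner values; the observation sits at the centre of the box.
    print(bilinear(0.5, 0.5, 0.0, 1.0, 0.0, 1.0, 1.0, 2.0, 3.0, 4.0))
```

Because the corner latitudes are passed in explicitly, the same formula
applies whether or not the rows of the grid are equally spaced.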

Horizontal Resolution
Same as the forecast model: spectral triangular truncation at wavenumber
126 (T126).  The Gaussian grid of 384x192 contains two more rows than the
grid used in the model (the north and south pole points).  This resolution
is essentially equivalent to 1x1 degree latitude/longitude.
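
The arithmetic behind these numbers can be checked with a short sketch.  The
truncation and grid size come from the text; the function names are made up,
and the mean row spacing assumes the 192 rows span pole to pole.

```python
TRUNCATION = 126        # T126, from the text
NLON, NLAT = 384, 192   # analysis Gaussian grid, from the text


def n_spectral_coeffs(trunc):
    """Number of (m, n) wavenumber pairs with 0 <= m <= n <= trunc.

    This is the count of complex coefficients per field and level for a
    triangular truncation.
    """
    return (trunc + 1) * (trunc + 2) // 2


def lon_spacing_deg(nlon):
    """Longitude spacing of a grid with nlon equally spaced points."""
    return 360.0 / nlon


def mean_row_spacing_deg(nlat):
    """Average latitude spacing, assuming the rows include both poles."""
    return 180.0 / (nlat - 1)


if __name__ == "__main__":
    print("coefficients per field and level:", n_spectral_coeffs(TRUNCATION))
    print("rows in the model grid:", NLAT - 2)
    print("longitude spacing (deg):", lon_spacing_deg(NLON))
    print("mean row spacing (deg):", round(mean_row_spacing_deg(NLAT), 4))
```

For T126 this gives 8128 coefficients and a longitude spacing of 0.9375
degrees, which is the sense in which the grid is close to 1x1 degree.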

Vertical Representation and domain
Same as the forecast model: sigma coordinate.  For a surface pressure
of 1000 hPa, there are twenty-eight levels from 995 hPa to 2.7 hPa.
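
Sigma is pressure divided by surface pressure, so the pressure of a given
level changes with the surface pressure beneath it.  The sketch below uses
only the two end levels quoted above; the intermediate levels are not given
in the text, and the names and the 850 hPa example are made up.

```python
REFERENCE_PS_HPA = 1000.0

# Sigma values implied by the quoted pressures at a 1000 hPa surface pressure.
SIGMA_LOWEST = 995.0 / REFERENCE_PS_HPA   # level nearest the surface
SIGMA_HIGHEST = 2.7 / REFERENCE_PS_HPA    # top level


def sigma_to_pressure(sigma, surface_pressure_hpa):
    """Pressure (hPa) of a sigma level, using sigma = p / p_surface."""
    return sigma * surface_pressure_hpa


if __name__ == "__main__":
    for ps in (1000.0, 850.0):   # 850 hPa: a made-up elevated surface
        print(ps, sigma_to_pressure(SIGMA_LOWEST, ps),
              sigma_to_pressure(SIGMA_HIGHEST, ps))
```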

Computer/Operating System
Currently optimized for Cray computers.  It has run on Y/MP, C90,
and EL at various resolutions with up to 14 processors.

Computational performance
On C90 at full resolution, the wall clock time using 14 processors
is about 5 minutes.

Analysis Components and basic properties

Basic Problem
The problem being solved is to minimize the weighted fit of the
analysis to the guess plus the weighted fit of the analysis to the
observations plus the weighted fit of the divergence tendency to
the guess divergence tendency.  The weights are given by the
statistics described below.

Analysis variables
The analysis variables can be uniquely transformed into the model
variables of vorticity, divergence, temperature, ln(surface
pressure) and specific humidity.  The analysis variables are
normalized vorticity, non-balanced divergence, non-balanced
temperature, surface streamfunction and specific humidity.  Each of
these is a deviation from the guess, is decomposed in the vertical
based on the vertical error covariance, and is normalized with the
standard deviation of the error.  The balanced parts of the
divergence and the temperature are implied from the streamfunction
using a linear balance equation with empirical friction.

Observation types

Currently, the analysis system uses the following data:

1. Rawindsondes -- winds, temperatures, specific humidity, surface pressure
2. Conventional aircraft reports -- winds
3. ACARS aircraft reports (above 700mb) -- winds, temperatures
4. Cloud tracked winds from GOES, Japanese and European satellites
5. Surface marine observations -- winds, temperatures, specific
     humidity, surface pressure
6. Surface land observations -- specific humidity, surface pressure
7. SSM/I wind speeds (with assigned direction)
8. TOVS temperature retrievals in Southern Hemisphere (over land 
     above 100mb only).
9. Interactive temperature retrievals in Northern Hemisphere (over 
     land above 100mb only). 
10. Dropwindsondes -- winds, temperatures, specific humidity
11. Australian sea level pressure boguses

Observational error statistics

The observational error statistics can vary with each observation
location.  They are currently input from the quality control
routines.

Background error statistics

The background error statistics are used to weight the background
(first guess) field.  They are defined spectrally and are currently
nearly homogeneous around a latitude band.  The statistics are
calculated by scaling the statistics from a sequence of differences
between 24 and 48 hour forecasts valid at the same time.
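
The idea of deriving statistics from forecast differences can be sketched
for a single scalar quantity.  This is an assumption-laden illustration, not
the operational procedure: the function name, the removal of the mean
difference, the scale factor and the sample values are all made up, and the
operational statistics are defined spectrally.

```python
def background_error_variance(fcst_24h, fcst_48h, scale):
    """Estimate a background error variance from paired forecasts.

    fcst_24h, fcst_48h : forecast values valid at the same times,
                         one pair per case in the sequence
    scale              : factor applied to the variance of the differences
                         (assumed here to be a tuning value)
    """
    diffs = [f48 - f24 for f24, f48 in zip(fcst_24h, fcst_48h)]
    mean = sum(diffs) / len(diffs)
    # Sample variance of the differences about their mean.
    variance = sum((d - mean) ** 2 for d in diffs) / (len(diffs) - 1)
    return scale * variance


if __name__ == "__main__":
    # Made-up forecast values for four verification times.
    f24 = [280.0, 281.5, 279.0, 282.0]
    f48 = [281.0, 280.5, 280.0, 281.0]
    print(background_error_variance(f24, f48, scale=0.5))
```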

Balance constraint

A nonlinear balance constraint is currently being used in the
analysis system.  The nonlinear balance constraint is linearized
around the guess in the analysis system.  Vertical advection,
surface friction and diabatic heating are not yet incorporated. 
The analysis system penalizes for differences from the guess
divergence tendency.  The penalty is defined in spectral space and
the weights are defined as for the background error statistics.