g2subset v1.3.4

g2subset is a program that runs on an http server and lets users download subsets of grib2 files.

Characteristics 
               cgi-bin script: program that generates web pages
               perl wrapper for wgrib2:  a perl script that calls wgrib2
               makes regional subsets:  makes cookie cutter sections of a grid
               can interpolate: can convert to custom grids
               extract point values: can make grib files for selected lat-lon points
                      using nearest neighbor interpolation
               select fields: can extract based on time, field and level

Requirements:
               web server
               grib2 files that you want to distribute
               perl
               wgrib2

Why run g2subset:
               Bandwidth is a resource. Downloading subsets saves time and money.
               Of course, this is not true in Bizarro world where subsetting means
               fewer TB are downloaded and consequently your funding is cut.

Components of g2subset
	g2subset.pl
		This is a small perl script that normally runs in the "cgi-bin" directory
		of the web server.  At minimum, this script specifies the directory that
		g2subset will serve and then calls g2sub_main.  Here is a sample
		g2subset.pl

		#!/usr/bin/perl -w -I/home/wd23ja/bin
		require ("g2subsetmod_grid.pl");
		$dir="/var/ftp/pub/";
		&g2sub_main($dir);
		exit;

	g2subsetmod_grib.pl
		This is the main perl script.  It creates the web pages, reads the user
		responses and runs wgrib2.  If you are using the above g2subset.pl,
                this perl script will reside in /home/wd23ja/bin/.

	.g2subrc
		This is a "resource" file that is placed in each data directory.  It provides
		some customization/initialization for the data.  See the subroutine
		read_g2subrc in the g2subset source code for details.  Here is a simple .g2subrc

		title=rotating GFS forecasts
		ncol=4
		gribfilter=yes
		files_pat='grib2$'
                     : the files_pat is a regex for the grib2 files
		vars=4LFTX 5WAVA 5WAVH ABSV CAPE CIN CLWMR CWAT GPA HGT HPBL ICEC LAND LFTX O3MR POT \
                     : vars is a list of grib2 variables that can be selected.  (wgrib2 names)
		levs=0-0.1_m_below_ground 0.1-0.4_m_below_ground 0.33-1_sigma_layer 500_mb 50_mb \
                550_mb 600_mb 650_mb 700_mb 
                     : levs is a list of levels that can be selected.  (wgrib2 levels)  Blanks
                       are replaced by underscores.
                      

		title = title for the web page
		ncol = number of columns for listing
		gribfilter = yes or no          (turns on grib2 filtering)	
		files_pat = 'match pattern'     regular expression for files to serve
		vars = list of grib variables   (spaces are replaced by underscores)
		levs = list of levels           (spaces are replaced by underscores)

		hours =				yes/no to hour filter
		days =				yes/no to day filter
		months = (x)			yes/no to month filter
		forecast = (x)			data are forecasts (use verification time)
		nosubregion = (x)		do not allow subregions
		nopoints = (x)			do not allow point output
		noregrid = (x)			do not allow regridding (interpolation to a new grid)
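
		The key=value layout above is simple enough to read from other scripts.  The
		following is a minimal sketch, not the read_g2subrc subroutine from g2subset.
		The helper g2subrc_get is hypothetical, and it assumes that a trailing backslash
		continues a value on the next line, as the sample suggests.

```shell
# Hypothetical helper: print the value of one key from a .g2subrc-style file.
# Usage: g2subrc_get KEY FILE
# Assumption: lines look like "key=value" and a trailing backslash
# continues the value on the next line.
g2subrc_get() {
    awk -v key="$1" '
        {
            # join lines that end in a backslash with the line that follows
            while (sub(/\\[ \t]*$/, "")) {
                if ((getline nxt) > 0) $0 = $0 nxt
                else break
            }
            sub(/^[ \t]+/, "")                 # ignore leading whitespace
        }
        # print everything after "key=" when the line starts with that key
        index($0, key "=") == 1 { print substr($0, length(key) + 2) }
    ' "$2"
}
```

		For a file that contains the line ncol=4, "g2subrc_get ncol FILE" prints 4.
		Quotes around a value (as in files_pat) are returned unchanged.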

Principles of Operation
               The cgi-bin part of g2subset provides a pointer to a directory.  This directory or a
               subdirectory holds the grib2 files.  Traditionally these grib2 files are stored in
               a directory that is visible to the outside world through the http server.  This is
               so that the "partial http downloading" technique will work.

               http://www.cpc.ncep.noaa.gov/products/wesley/fast_downloading_grib.html

               The g2subset perl scripts create the web pages.  When the user requests a "download", 
               all the selections are converted into a single wgrib2 command.  You can see the command 
               by selecting "Show the URL for scripting downloads" and then clicking on "Download".
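
               The record selection stage of such a command can be sketched as a filter on the
               wgrib2 inventory.  This is only an illustration: sample_inv and select_records
               are hypothetical names, the byte offsets in the sample inventory are made up,
               and "grep -E" is used as the modern spelling of egrep.

```shell
# Sample inventory lines in wgrib2's default short format
# (record number : byte offset : date : variable : level : forecast time :).
# The offsets are illustrative only.
sample_inv() {
    cat <<'EOF'
1:0:d=2012010100:HGT:500 mb:anl:
2:52012:d=2012010100:TMP:500 mb:anl:
3:99127:d=2012010100:HGT:700 mb:anl:
EOF
}

# Keep only inventory lines that match both the variable and the level pattern.
select_records() {
    grep -E "$1" | grep -E "$2"
}

# Only the filtering stage runs here.  With wgrib2 installed, the selected
# lines would be piped into it to extract the matching records:
#   select_records ':HGT:' ':500 mb:' < X.inv | wgrib2 -i X -grib out.grb2
sample_inv | select_records ':HGT:' ':500 mb:'
```

               Run as written, the last line prints only the first inventory record (HGT at 500 mb).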

Index files
               Grib files are flat files; they do not include indexing.  The g2subset
               script uses wgrib2 inventories as index files.  The default suffix of an
               index file is .inv, and it is created by

                   wgrib2 GRIBFILE > GRIBFILE.inv

               The nomads.ncep.noaa.gov site uses the .idx suffix for the index file.  They
               produce the .idx file by

                   wgrib2 GRIB2_FILE > GRIB2_FILE.idx

               G2subset will work without index files, but the system will be slow
               because the entire data file is read every time there is a data request.

               G2subset supports an index file that uses the verification time
               instead of the reference time for date code matching.  It has a
               suffix of .inv-verf.
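
               Keeping the .inv files up to date can be scripted.  The following is a minimal
               sketch under stated assumptions: make_indexes is a hypothetical helper, the
               data files are assumed to end in .grib2, and the WGRIB2 variable is an
               invented override for the location of the wgrib2 program.

```shell
# Hypothetical helper: build or refresh the .inv index for every *.grib2
# file in a directory.  Usage: make_indexes DIR
# WGRIB2 may name an alternate wgrib2 executable (assumption, not a g2subset setting).
make_indexes() {
    dir=$1
    for f in "$dir"/*.grib2; do
        [ -e "$f" ] || continue                # directory has no matching files
        # rebuild when the index is missing or older than the data file
        if [ ! -e "$f.inv" ] || [ "$f" -nt "$f.inv" ]; then
            # write to a temporary name so a failed run leaves no partial index
            if "${WGRIB2:-wgrib2}" "$f" > "$f.inv.tmp"; then
                mv "$f.inv.tmp" "$f.inv"
            else
                rm -f "$f.inv.tmp"
            fi
        fi
    done
}
```

               A call such as "make_indexes /var/ftp/pub" could be run after new data files arrive.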


Speed
               When making regional subsets, the server needs to decode the grib file, make
               the subset (cookie cutter, interpolation) and encode the grib file.  For speed,
               the input files should use simple, complex or AEC packing.  (Jpeg2000 is
               very slow.)  For output, simple packing (default) uses the least amount
               of CPU at the expense of bandwidth.  AEC packing uses more CPU but produces
               much smaller files.  However, one should be aware that some users may have
               difficulty decoding AEC compressed files.  Complex packing is another
               possibility when CPU is less of a concern.
               
               Another way to speed up the operation is to compile wgrib2 with OpenMP and set
               $OMP_NUM_THREADS to a small number, between 2 and 4.  This helps complex and
               simple packing.
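
               The packing and thread settings can be combined in one wgrib2 call.  This is a
               sketch, not the command that g2subset generates: regional_subset is a
               hypothetical wrapper, WGRIB2 is an invented override for the wgrib2 location,
               and the wgrib2 options used are -set_grib_type and -small_grib.

```shell
# Hypothetical wrapper: cut a lon/lat box out of a grib2 file and write it
# with the chosen packing.
# Usage: regional_subset IN OUT LONW:LONE LATS:LATN [PACKING]
# PACKING is passed to -set_grib_type (for example simple, complex3 or aec).
regional_subset() {
    src=$1 dst=$2 lons=$3 lats=$4 packing=${5:-simple}
    # use 2 threads unless the caller already set OMP_NUM_THREADS;
    # this only has an effect when wgrib2 was compiled with OpenMP
    OMP_NUM_THREADS=${OMP_NUM_THREADS:-2} \
        "${WGRIB2:-wgrib2}" "$src" -set_grib_type "$packing" \
        -small_grib "$lons" "$lats" "$dst"
}
```

               For example, "regional_subset in.grb2 out.grb2 230:300 20:55 aec" would write an
               AEC packed subset, trading CPU time for a smaller download.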

               In my tests (2012), it was possible to distribute continental USA 1-km grids using complex
               packing and the OpenMP version of wgrib2.  If you need even more speed, 
               wgrib2m-type processing can be used (untested).

Things to do
               Wgrib2 now supports the -fgrep/-egrep and -i_file options.  G2subset can be changed
               so that "cat X.inv | egrep A | egrep B | wgrib2 -i X ..." is replaced by
               "wgrib2 -i_file X.inv -egrep A -egrep B ..." which is cleaner.

               The interpolation options are "rough".  The parameters needed are not 
               explained by the web pages and you need to see the wgrib2 documentation.
               The interface could be improved.