Finding the database

First, find out where the databases are kept. Generally, go to the $SDDAS_DATA directory and follow the hierarchy down to the instrument level. At that point, a "Database" directory should be present that holds the database files.

For example, if your $SDDAS_DATA directory is "/data" and you wish to look at the HEPS (an instrument on the UARS satellite) database, go to /data/UARS/UARS-1/PEM/Database. There should be files called HEPS.HD.DBF, HEPS.HD.NDX, HEPS.I.DBF, and HEPS.I.NDX.

These are the database files. The header/data file pairs are cataloged in the .HD.DBF and .HD.NDX files. The VIDF files are cataloged in the .I.DBF and .I.NDX files.
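
As a rough illustration of the layout described above, the following Python sketch builds the four expected file names for one instrument and reports which of them exist. The function name database_files and the fallback root "/data" are assumptions made for this example only; they are not part of SDDAS.

```python
import os
from pathlib import Path

def database_files(data_root, instrument_path, name):
    """Return the four database file paths expected for one instrument."""
    db_dir = Path(data_root) / instrument_path / "Database"
    return [db_dir / f"{name}.{suffix}"
            for suffix in ("HD.DBF", "HD.NDX", "I.DBF", "I.NDX")]

# Fall back to "/data" (the example root used above) if SDDAS_DATA is unset.
root = os.environ.get("SDDAS_DATA", "/data")
for path in database_files(root, "UARS/UARS-1/PEM", "HEPS"):
    print(path, "found" if path.exists() else "MISSING")
```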

The databases are in a binary format and are not human readable. To view a database, use the db_list program, giving the full name of the database (the DBF file) on the command line.

For example: "db_list HEPS.HD.DBF" or "db_list HEPS.I.DBF".

This will produce the following output (for the HEPS.HD.DBF file):

Record#  V_INST   B_YR B_D   B_MSEC E_YR E_D   E_MSEC       SIZE P P ST ARC DATE     
------- +-------- ---- --- -------- ---- --- -------- ---------- - - -- --- -------- 
      1  HPDA     1970   2 36480000 1995 179    10000      22352 0 2 00  36 19980115 
      2  HPDA     1991 258  3164000 1991 258 10373000       8528 0 2 00   1 19980115 
      3  HPDA     1991 258 10373000 1991 258 17582000       8848 0 2 00   1 19980115 
      4  HPDA     1991 258 17582000 1991 258 24791000       8260 0 2 00   1 19980115

...

An in-depth look at a single database entry

Given this database entry generated by db_list:


Record#  V_INST   B_YR B_D   B_MSEC E_YR E_D   E_MSEC       SIZE P P ST ARC DATE     
------- +-------- ---- --- -------- ---- --- -------- ---------- - - -- --- -------- 
     10  HPDA     1991 258 60836000 1991 258 68045000       9064 0 2 00   1 19980115

Let us take a look at this entry completely:

Record# - a counter of where the entry lies in the database
V_INST - the name of the virtual instrument for this entry
B_YR - beginning year of the data file
B_D - beginning day of the data file
B_MSEC - beginning millisecond of the data file
E_YR - ending year of the data file
E_D - ending day of the data file
E_MSEC - ending millisecond of the data file
SIZE - size of the data file(s) in the entry (described below)
P - the first P is the pre-processing flag - the only acceptable value is 0
P - the second P is the post-processing flag. (described below)
ST - must be 00 (currently unused, but may be used in the future; keep 00)
ARC - this is the archive label (more on this later)
DATE - date that the entry was made. This is in the format YYYYMMDD.
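
To make the field list concrete, here is a small Python sketch that splits the db_list row shown above into named fields and converts the begin and end times to calendar dates. It assumes that B_D and E_D are 1-based day-of-year values, which the text above does not state explicitly, and that every column (including SIZE) is filled in. The names FIELDS, parse_db_list_row, and to_datetime are invented for this example.

```python
from datetime import datetime, timedelta

FIELDS = ("record", "v_inst", "b_yr", "b_d", "b_msec", "e_yr", "e_d", "e_msec",
          "size", "pre", "post", "st", "arc", "date")

def parse_db_list_row(row):
    """Split one whitespace-separated db_list row into a dict of strings."""
    values = row.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))

def to_datetime(year, day, msec):
    """Assumption: 'day' is a 1-based day of year, 'msec' counts from midnight."""
    return datetime(int(year), 1, 1) + timedelta(days=int(day) - 1,
                                                 milliseconds=int(msec))

row = "     10  HPDA     1991 258 60836000 1991 258 68045000       9064 0 2 00   1 19980115"
entry = parse_db_list_row(row)
begin = to_datetime(entry["b_yr"], entry["b_d"], entry["b_msec"])
end = to_datetime(entry["e_yr"], entry["e_d"], entry["e_msec"])
print(entry["v_inst"], begin, end)
```

Under the day-of-year assumption, this entry covers 1991-09-15 16:53:56 to 1991-09-15 18:54:05.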

The SIZE field is a numeric value which is *NOT* currently used by our software, although it may be put to use in the future. This value must be 8 digits or fewer. Several existing databases do not have this value filled in. Any future databases should include this value, and it should be correct.

The SIZE for the I file database should be equal to the size of the I file. The I file is made by running mk_idf on the VIDF file. For the HD database, SIZE is the sum of the sizes of the header file and the data file for that time segment.
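
The SIZE rule can be sketched in Python as follows. The sketch assumes that sizes are counted in bytes, which the text above does not say outright, and it uses two throwaway files in place of a real header/data pair. The names entry_size and MAX_SIZE are made up for this example.

```python
import os
import tempfile

MAX_SIZE = 99_999_999  # largest value that fits in 8 digits

def entry_size(*paths):
    """Sum the sizes (assumed to be in bytes) of the files behind one entry."""
    total = sum(os.path.getsize(p) for p in paths)
    if total > MAX_SIZE:
        raise ValueError(f"SIZE {total} does not fit in 8 digits")
    return total

# Demonstration with two temporary files standing in for a header/data pair.
with tempfile.TemporaryDirectory() as tmp:
    header = os.path.join(tmp, "header")
    data = os.path.join(tmp, "data")
    with open(header, "wb") as f:
        f.write(b"\0" * 528)
    with open(data, "wb") as f:
        f.write(b"\0" * 8000)
    hd_size = entry_size(header, data)
    print(hd_size)  # prints 8528
```

For an I file entry, the same function would be called with the single I file path.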

For consistency, archive sites are highly encouraged to compress all data with the GNU gzip program, even though data may also be compressed with the Unix compress program. For the database entries, this means that the pre-processing flag should be 0 and the post-processing flag should be 2 for the HD.DBF and 4 for the I.DBF.

The full set of values for the post-processing flag is:

0 - No post-processing.

1 - Use the Unix program "uncompress" to post process.

2 - Use the script "Gunzip", which is based on the GNU program "gunzip", to post process.

3 - This value has been deprecated and now means NOTHING! DO NOT USE!

4 - Use the script "gunzip_and_mk_idf" to post process. Use for ALL entries in the I file database.
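
The flag rules above can be collected into a small check, sketched here in Python. The table POST_FLAGS and the function check_flags are hypothetical helpers for this example; they only restate the rules listed above and are not SDDAS code.

```python
# Accepted post-processing flags and the program or script each one selects.
# Value 3 is deliberately absent because it is deprecated.
POST_FLAGS = {
    0: "no post-processing",
    1: "uncompress",
    2: "Gunzip",
    4: "gunzip_and_mk_idf",
}

def check_flags(kind, pre, post):
    """Return a list of problems for one entry; kind is "HD" or "I"."""
    problems = []
    if pre != 0:
        problems.append(f"pre-processing flag is {pre}, must be 0")
    if post not in POST_FLAGS:
        problems.append(f"post-processing flag {post} is not an accepted value")
    elif kind == "I" and post != 4:
        problems.append("I file entries should use post-processing flag 4")
    return problems

print(check_flags("HD", 0, 2))  # prints []
print(check_flags("I", 0, 3))
```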

The archive label is a holdover from earlier versions and is expected to be deprecated. The field is exactly three characters long and the default value is "xxx". The client sends this label to the server to tell the server where to find its data. This is being phased out in favor of requiring the server to find and keep track of its own data. Having the server locate the data is potentially slower; however, with server sites becoming more common and duplicating data, it is the best way to make sure data can be found REGARDLESS of where the data is.

Any new databases should have an "xxx" in the field.

The ST (or STATE) field is exactly two characters. This field is not currently being used, but may be used in the near future. Please keep this as "00".

Fixing database entries on archive sites

Database entries may at some point need to be corrected or updated. This is not an easy task, so take care to do it carefully. If the database is even slightly wrong, files may be incorrectly promoted, plotting ability will be severely limited, and other strange problems can occur on the client side.

Additionally, if there is an inconsistency in the promotion of data from an archive site, or any other problem with the promotion of data, the archive site may have an erroneous database.

The proper way to correct the database is to convert the database to ASCII and manually edit the ASCII database files. Use your favorite editor and make sure files are saved in ASCII format.

The first step is to convert the database(s) to a form that is human readable and computer parsable. This is done with the db_2asc program.

For example: "db_2asc HEPS.HD.DBF" or "db_2asc HEPS.I.DBF".

This will produce the following output (for the HEPS.I.DBF file):

HPDB    |1980|  1|       0|2020|  1|       0|      2018|0|4|00|901|19951213|
HPDA    |1980|  1|       0|2020|  1|       0|      2300|0|4|00|901|19951213|
HPSC    |1970|  1|       0|2020|  1|       0|      7483|0|4|00|901|19951213|
HPSB    |1970|  1|       0|2020|  1|       0|     14330|0|4|00|901|19951213|
HPSA    |1970|  1|       0|2020|  1|       0|     30436|0|4|00|901|19951213|

All spaces are ignored, so it does not matter whether they are present.

You will notice that this is the same information that db_list shows (without the record number), but with "|" (pipe symbols) separating each of the values. Take care that the pipe symbols remain, as these are what make the file easily parsable.
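
Because the pipe symbols carry the structure, a quick check of each line before rebuilding the database can catch a dropped or extra pipe. The Python sketch below is one way to do that. It assumes, based only on the sample output above, that every line holds 13 values and ends with a pipe symbol. The names ASC_FIELDS and parse_asc_line are invented for this example.

```python
ASC_FIELDS = ("v_inst", "b_yr", "b_d", "b_msec", "e_yr", "e_d", "e_msec",
              "size", "pre", "post", "st", "arc", "date")

def parse_asc_line(line):
    """Split one db_2asc line into a dict; raise ValueError if malformed."""
    line = line.strip()
    if not line.endswith("|"):
        raise ValueError("line does not end with a pipe symbol")
    # Drop the trailing pipe, split on the rest, and ignore padding spaces.
    values = [v.strip() for v in line[:-1].split("|")]
    if len(values) != len(ASC_FIELDS):
        raise ValueError(f"expected {len(ASC_FIELDS)} fields, got {len(values)}")
    return dict(zip(ASC_FIELDS, values))

sample = "HPDB    |1980|  1|       0|2020|  1|       0|      2018|0|4|00|901|19951213|"
asc_entry = parse_asc_line(sample)
print(asc_entry["v_inst"], asc_entry["size"], asc_entry["post"])  # prints HPDB 2018 4
```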

The db_2asc program writes all of this information to standard output. To edit the information, it must be stored in an editable file. Do this by redirecting the output to a file called hd.asc (for the HD.DBF file) or i.asc (for the I.DBF file). The files must be called hd.asc and i.asc or else the UpdateDb script, which is run later, will not work. Furthermore, you may update only one of the databases at a time.

For example: "db_2asc HEPS.HD.DBF > hd.asc" or "db_2asc HEPS.I.DBF > i.asc".

The files hd.asc and i.asc may be edited at will.

After both files have been edited and corrected completely, the information must be put back into the database.

Do this as follows:

Remove the old databases (rm HEPS.*)
Recreate the new ones (UpdateDb HEPS)

After you have done this, a new set of files will appear which will be named exactly as the old ones were, but with a later timestamp. Should this not be the case, something has gone wrong.
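
The timestamp check described above can be automated. The following Python sketch records the modification times before the rebuild and afterwards lists any file that is missing or not newer. The helper names snapshot_mtimes and not_rebuilt are hypothetical and exist only for this example.

```python
import os

def snapshot_mtimes(paths):
    """Record modification times; files that do not exist map to None."""
    return {p: os.path.getmtime(p) if os.path.exists(p) else None for p in paths}

def not_rebuilt(before, paths):
    """Return the files that are missing or not newer than in the snapshot."""
    stale = []
    for p in paths:
        if not os.path.exists(p):
            stale.append(p)
        elif before.get(p) is not None and os.path.getmtime(p) <= before[p]:
            stale.append(p)
    return stale

# Intended use (file names taken from the HEPS example):
#   names = ["HEPS.HD.DBF", "HEPS.HD.NDX", "HEPS.I.DBF", "HEPS.I.NDX"]
#   before = snapshot_mtimes(names)    # take this before "rm HEPS.*"
#   ... remove the old databases and run "UpdateDb HEPS" ...
#   print(not_rebuilt(before, names))  # an empty list means all were rebuilt
```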

Troubleshooting

Check consistency in syntax with the db_list command.
- "db_list <file>.HD.DBF"
- "db_list <file>.I.DBF"
Check UNIX file access rights. They should be "rw-rw-rw".
- "ls -l *.DBF"
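
The access-rights check can also be scripted. This Python sketch tests whether owner, group, and others all have read and write access, and shows the permission string in the same form that "ls -l" prints. The function names are made up for this example, and the check is meant for Unix systems.

```python
import os
import stat

def is_read_write_for_all(path):
    """True if owner, group, and others all have read and write access."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o666 == 0o666

def mode_string(path):
    """Return the permission string as "ls -l" shows it, e.g. "-rw-rw-rw-"."""
    return stat.filemode(os.stat(path).st_mode)

# Intended use, run from the Database directory:
#   for name in ("HEPS.HD.DBF", "HEPS.I.DBF"):
#       print(name, mode_string(name), is_read_write_for_all(name))
```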