exportIDFS is a program that extracts and exports data stored in the Instrument Description File System (IDFS) format. The IDFS format is a data storage format designed to be general enough to handle the majority of scientific data sets, including raw telemetry, processed data, simulation data and theoretical data. IDFS data sources are defined as either scalar instruments or vector instruments. A scalar instrument returns single data quantities that depend only upon time and position. A vector instrument returns one-dimensional data quantities that have a functional dependence on a single variable, which in IDFS terminology is called the scanning variable.
The exportIDFS program can be invoked in one of two modes:
(1) interactive mode or (2) batch mode. In interactive mode, the program
utilizes a GUI-based definition session to define the data items to be
exported. Once this definition session has been completed, the selected data
parameters can then be exported to the selected file format. To invoke the
program in interactive mode, type exportIDFS
at the command line.
In batch mode, the interactive GUI-based definition session is bypassed and
the data requested is immediately exported based upon information contained in
the named layout file. To invoke the program in batch mode, type
exportIDFS -FName filename
at the command line. The argument
filename is the name of the layout file that is to be
utilized during the current export session. Note that the name of the
layout file does not include the .EXP extension, which is appended to the
filename provided by the user during the GUI-based definition session. If the
named layout file does not exist, an error is displayed and processing terminates.
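
The batch invocation described above can be scripted. The following is a minimal Python sketch, assuming that exportIDFS is installed and on the PATH. The helper names `batch_command` and `run_export` and the layout name `my_layout` are hypothetical, and only the `-FName` option is used. Stripping a trailing .EXP is a convenience of the sketch, since the program expects the layout name without the extension.

```python
import subprocess


def batch_command(layout_name, program="exportIDFS"):
    """Build the argument list for a batch-mode export.

    The layout name is passed without the .EXP extension, so a trailing
    ".EXP" supplied by the caller is removed (a convenience of this sketch).
    """
    if layout_name.endswith(".EXP"):
        layout_name = layout_name[: -len(".EXP")]
    if not layout_name:
        raise ValueError("layout file name is empty")
    return [program, "-FName", layout_name]


def run_export(layout_name):
    """Run exportIDFS in batch mode and return the process exit status.

    Assumes exportIDFS is installed and on the PATH.
    """
    return subprocess.run(batch_command(layout_name)).returncode


if __name__ == "__main__":
    # Only builds and shows the command; nothing is executed here.
    print(batch_command("my_layout.EXP"))
```

Keeping the command construction separate from the call that runs it makes it possible to inspect the command before any export is started.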
For a complete list of arguments that can be utilized by SDDAS applications
that support batch mode processing, the user is referred to the
SDDAS Applications Batch Interface document.
The exportIDFS program utilizes only the layout filename, beginning time,
ending time and graphics device number command line options. If any other
command line option is specified, a message is displayed stating that the
option is not utilized for the current export session. Unlike the other SDDAS
applications, exportIDFS produces no graphics output; instead, it uses the
graphics device number command line option to select the file format to which
the data is to be exported (CDF, netCDF, IDFS or XML).
In order to export IDFS data, an IDFS source must first be selected. This is achieved by selecting the "Data Source" button. Once a valid IDFS source has been selected, the "Data Items" button becomes visible. At this point, the data items to be exported from the selected IDFS source must be defined. At least one data item must be selected; otherwise, an error message will be displayed when the "Export" action is selected.
In most cases, the data items to be exported are referred to as IDFS sensors. An IDFS sensor is defined as a primary data source returned by the virtual instrument in question. However, in some cases, the data items may be SCF output variables. The SCF (Science Computation Formulation) system provides for the creation of new data products from an existing primary data set (IDFS). In some cases, these derived products may be dependent upon values returned from a single instrument; in other cases, the derived products are dependent upon values taken from many instruments. For a more in-depth explanation of the SCF system, the user is referred to the paper entitled "The Science Computation Formulation System".
When the "Data Items" button is selected, the Data Sources GUI is displayed. On this GUI resides a list which indicates the data items to be exported. Initially, this list is empty. To add a data item to the list, the pull-down Insertion menu is utilized. Once the position for the data item to be added has been determined, the actual data item must be defined using the Data Attributes GUI. This GUI is automatically invoked when a new item is added to the list. The parameter selection is defaulted to the first data item defined for the IDFS source based upon information contained in the PIDF file. If changes to any of the attributes for a specific data item need to be made, the data item should be selected from the list and the "Attributes" button should be activated in order to invoke the Data Attributes GUI.
Once the data item has been selected, binning information must be provided for the data item. The binning information is defaulted based upon information contained in the PIDF file for the selected data item. For IDFS data items, all sensors are binned using the same binning scheme; therefore, the first data item definition on the list defines the binning scheme that will be used by all exported IDFS sensors. For SCF output variables, each data item is binned uniquely. Therefore, if three SCF output variables are selected for export, three unique binning schemes are utilized by the data acquisition software. The user need not concern themselves with this information unless a change in binning schemes is desired, which can be achieved by selecting the "Binning" button.
The exportIDFS program allows for the exportation of IDFS sensor or SCF output variable data, but not any combination of the two data sources. The first data item definition on the list determines if IDFS or SCF data products are to be exported. If both IDFS and SCF data items are needed, separate instances of the exportIDFS program must be run.
Once data items to be exported have been selected, the file format and averaging scheme for the data must be chosen. This is achieved by selecting the "Data Packaging" button. Currently, IDFS and SCF data can be exported to one of four file formats: CDF, netCDF, IDFS or XML.
In addition to the file format, the user must also specify the averaging scheme to utilize for each exported data sample. The user may specify either a sample average or a time average. With a sample average, the user specifies the number of data samples (sweeps) to average together for each exported data sample. With a time average, the user specifies the amount of time to be acquired for each exported data sample. When time average is selected, the time specified is converted to sweeps using the maximum temporal resolution allowed by the selected virtual instrument. If the result is a non-integer value, the number of sweeps acquired is determined by adding the integer component with the ceiling of the accumulation of the fractional component. If no averaging is required, that is, if a sweep by sweep dump is to be performed, a sample average with number of samples set to one should be defined. This is the default scenario for the exportIDFS program.
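
As a rough illustration of the time-to-sweeps conversion, the Python sketch below implements one plausible reading of the rule above: the integer part of the ratio, plus one further sweep whenever a fractional part remains. The function name, the use of integer milliseconds and the independent treatment of each sample are assumptions. The text does not say how the fractional component accumulates across samples, so this should not be taken as the program's actual algorithm.

```python
def sweeps_for_time_average(interval_ms, sweep_ms):
    """Sweeps per exported sample under one reading of the rounding rule.

    interval_ms: requested averaging time, in integer milliseconds
    sweep_ms:    duration of one sweep at the maximum temporal resolution,
                 in integer milliseconds (both units are assumptions)
    """
    if interval_ms <= 0 or sweep_ms <= 0:
        raise ValueError("durations must be positive")
    whole, remainder = divmod(interval_ms, sweep_ms)
    # The ceiling of a fractional part strictly between 0 and 1 is 1.
    extra = 1 if remainder else 0
    return whole + extra


if __name__ == "__main__":
    print(sweeps_for_time_average(10_000, 4_000))  # ratio 2.5
    print(sweeps_for_time_average(8_000, 4_000))   # ratio 2.0
```

Integer arithmetic is used so that an exact multiple of the sweep duration is never rounded up because of floating-point error.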
Before the exportIDFS program can export the data, one final piece of information must be defined. This information is the time range for which the data is to be acquired. This is achieved by selecting the "Time" button.
Once the IDFS source, data items, file format and time range have been defined, the selected data items can then be exported to the selected file format. To export the data, select the pull-down Action menu from the main menubar and select the Export option. Upon activation, the local database is checked to see if the requested data files are online. If data for the requested time range is not online, the missing data is promoted to the local disk. Once the data has been placed online, the data files are opened and the data is extracted and exported to the selected file format. Data will continue to be processed until the user-requested end time has been reached or until an error condition is raised. When an error condition is encountered, a message is displayed, the partially created file is purged and processing terminates. Upon completion of the export task, successful or unsuccessful, any promoted IDFS data files are removed from the local disk.
When the exportIDFS program is run in interactive mode, a check is made to see if the file to be generated already exists in the current working directory. If it does, the user will be asked if they wish to overwrite the data file. If the user answers yes, the file is removed and an attempt is made to create a new file. If the user answers no, the current request for data exportation is aborted. When run in batch mode, no query is made; the file is removed and an attempt is made to create a new file.
Since the exportIDFS program has the potential to generate large data files, a clean-up mechanism is utilized. Whether or not the clean-up mechanism is invoked depends upon the user running the exportIDFS program. If there exists a ".guest" file in the user's home directory, the data file will be scheduled for removal 30 minutes after the data file has been closed. The user will be informed of this situation. If a ".guest" file does not exist in the user's home directory, the generated data files will be left untouched. This scheme was designed for those sites that set up a public guest account through which outside users are given access to the named local system. The contents of the ".guest" file are not important; only the existence of the file is utilized.
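
The ".guest" rule can be checked ahead of time from a script. The following Python sketch only tests for the existence of the marker file, as described above. The function name `cleanup_delay_minutes` is hypothetical, and the sketch does not schedule or remove anything itself.

```python
import tempfile
from pathlib import Path

GUEST_CLEANUP_MINUTES = 30  # delay stated in the text above


def cleanup_delay_minutes(home_dir):
    """Return the removal delay in minutes, or None if files are kept.

    Only the existence of ".guest" matters; its contents are ignored.
    """
    marker = Path(home_dir) / ".guest"
    return GUEST_CLEANUP_MINUTES if marker.exists() else None


if __name__ == "__main__":
    # Demonstration in a throwaway directory standing in for a home directory.
    with tempfile.TemporaryDirectory() as home:
        print(cleanup_delay_minutes(home))
        (Path(home) / ".guest").touch()
        print(cleanup_delay_minutes(home))
```

For a real account, `Path.home()` could be passed as `home_dir`.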
For IDFS sensor data, the exportIDFS program will export the selected data items, data quality information, any secondary data sources selected for exportation and the time range associated with each data sample. If the IDFS source returns instrument status values, this information is also exported. Instrument status values utilize a separate time tag, which will be written to the file. All of this information is considered record-variant since the values change from data sample to data sample. If the selected IDFS source is a vector-instrument, the scan values which correspond to the returned data bins are written to the file. The center scan values and the band-width values for each data bin are written once to the file since these values remain constant.
For SCF output variables, the exportIDFS program will export the selected data items along with the time range associated with each data sample. This information is considered record-variant since the values change from data sample to data sample. If the SCF output variable returns non-scalar data, the scan values which correspond to the returned data bins are written to the file. The center scan values and the band-width values for each data bin are written once to the file since these values remain constant. Unlike IDFS sensors, each SCF output variable can bin data uniquely. Therefore, if three non-scalar SCF output variables are selected for export, three unique binning schemes are utilized by the data acquisition software and thus, three sets of center and band-width values are written to the file.
Once all the information has been defined, the information may be saved to a layout file for future retrieval. This is achieved by selecting the pull-down File menu and selecting the Save As option. The information defined is not saved by the program unless the user explicitly does so. When providing the name of the layout file, do not specify the .EXP extension; the exportIDFS program automatically appends the .EXP extension to the name of the layout file upon creation of the file.
Due to the limitations / restrictions of the various formats, the following conventions are followed:
Instrument status values, or MODE data, are pertinent to the instrument as a whole, not to any one sensor definition. For netCDF, the naming convention utilized for the instrument status values is MODEx, where x represents the mode definition number, starting with zero. This convention was selected since a mapping variable is provided for each instrument status value defined. This mapping variable is an array of ASCII strings that describe what the value for the mode represents. There should be one definition for each possible value for the mode (3 bits = 8 definitions). For example, MODE1 is a status value defined to have two states - 0 and 1. There is also a mapping variable called MODE1_key, which has 2 entries, "Low Bias" and "High Bias". Therefore, when MODE1 returns a value of 0, the instrument is in Low Bias mode. It was decided that it would be easier to match numbers than it would be to match names since the user would first have to determine what the names were for each of the instrument status values.
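
The MODEx lookup described above amounts to indexing the mapping array with the status value. The Python sketch below uses plain lists in place of variables read from a netCDF file. The helper names are hypothetical, and the `MODE1_key` entries are the ones from the example in the text.

```python
def key_variable_name(mode_number):
    """Name of the mapping variable for a given mode definition number."""
    return f"MODE{mode_number}_key"


def describe_mode(value, key):
    """Translate a numeric instrument status value via its mapping array."""
    if not 0 <= value < len(key):
        raise ValueError(f"status value {value} has no entry in the key")
    return key[value]


if __name__ == "__main__":
    # Stand-in for the MODE1_key variable from the example above.
    mode1_key = ["Low Bias", "High Bias"]
    print(key_variable_name(1), describe_mode(0, mode1_key))
```

The range check guards against a status value for which no definition was supplied in the mapping array.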
For CDF, the naming convention utilized for the instrument status values is MODEx_descriptive name, where x represents the mode definition number, starting with zero. This convention was selected since a mapping global attribute is provided for each instrument status value defined. This mapping variable is an array of ASCII strings that describe what the value for the mode represents. There should be one definition for each possible value for the mode (3 bits = 8 definitions). For example, "MODE1_Retard Sweep Range" is the second status value (MODE1) defined for the IDFS data source of interest. The name defined for this instrument status value is "Retard Sweep Range". This instrument status value has two defined states - 0 and 1. There is also a mapping variable called "MODE1_KEY", which has 2 entries, "Low Bias" and "High Bias". Therefore, when "MODE1_Retard Sweep Range" returns a value of 0, the instrument is in Low Bias mode. It was decided that it would be easier to match numbers (MODEx_) than it would be to match names since the user would first have to determine what the names were for each of the instrument status values.
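
Because the CDF names combine a number and a descriptive name, a reader of the file may want to split them again. The following Python sketch does so with a regular expression. The function name is hypothetical, and the sketch operates on the name string only, without opening a CDF file.

```python
import re

# MODE, a definition number, an underscore, then the descriptive name.
_MODE_NAME = re.compile(r"^MODE(\d+)_(.+)$")


def split_cdf_mode_name(variable_name):
    """Split "MODEx_descriptive name" into (x, "descriptive name")."""
    match = _MODE_NAME.match(variable_name)
    if match is None:
        raise ValueError(f"not a MODEx_ name: {variable_name!r}")
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(split_cdf_mode_name("MODE1_Retard Sweep Range"))
```

Note that a mapping entry such as "MODE1_KEY" matches the same pattern, so a caller would need to treat it separately from the status values themselves.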
Exporting data to IDFS results in the most descriptive names since the file is simply ASCII text - a dump of labels and values. The variable names are output as they are defined for the IDFS and SCF data products. For example, "Retard/Pk 3" is the variable name assigned to the first data item selected for exportation, "Data Quality" is the variable name for the data quality value for the first data item selected, and "Scan Voltage Steps (Cal.)" is the variable name for the calibration data associated with the first data item selected. The label (Cal.) is appended to better identify the source of the data product. The instrument status variable name "Retard Sweep Range" is output to the ASCII file. In addition, variables are reported that indicate the number of selected IDFS data items, the number of calibration sets defined and the number of instrument status values defined, so that the data can be processed in a self-describing way.
All of the data blocks identified above may or may not be contained within the XML file created, based upon the IDFS data source selected. For scalar IDFS data sources, there is no scan information; therefore, the Scan Block information is not pertinent and is not included in the XML file generated. If the IDFS data source does not define any data quality or instrument status values in the PIDF file, there is no Data Quality or Mode information to be written to the XML file. The Pitch Angle, Start Azimuthal Angle, Stop Azimuthal Angle and Calibration blocks pertain to secondary data sources and therefore are written to the XML file if the secondary data source is applicable for the selected IDFS data source and if the user selected the secondary data source for exportation.
When the user selects XML as the file format for SCF data items, the file generated is simply an ASCII file which contains the selected SCF data parameters, all identified using XML tags. The data is basically blocked or grouped together in the following manner:
All of the data blocks identified above may or may not be contained within the XML file created, based upon the SCF output variables selected. Unlike IDFS data sources which are uniform in rank, SCF output variables can be a mixture of scalar and multi-dimensional data (1-D up to 10-D). If the selected SCF output variable has a scan variable associated with it, the Scan Block information is included in the XML file generated and a Scan Index value is placed within the Data Item block to link the data with the scan information. This is done for each SCF output variable that has scan information defined; therefore, there may be multiple Scan Blocks contained within the XML file.
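
The link between a Data Item block and its Scan Block can be followed with a standard XML parser. The Python sketch below uses the Scan_Block_Index, Scan_Index, Center_Scan, Data_Item, Name and Values tags that exportIDFS writes. The nesting of the sample document, the whitespace-separated value lists and the variable names "flux" and "density" are assumptions made for illustration, not the program's verified output.

```python
import xml.etree.ElementTree as ET

# Hypothetical SCF export; the real layout may differ.
SAMPLE = """<Scf_Parameters>
  <Scan>
    <Scan_Block_Index>0</Scan_Block_Index>
    <Scan_Length>3</Scan_Length>
    <Center_Scan>10.0 20.0 30.0</Center_Scan>
  </Scan>
  <Data_Set>
    <Number>0</Number>
    <Data_Item>
      <Name>flux</Name>
      <Scan_Index>0</Scan_Index>
      <Values>1.5 2.5 3.5</Values>
    </Data_Item>
    <Data_Item>
      <Name>density</Name>
      <Values>4.0</Values>
    </Data_Item>
  </Data_Set>
</Scf_Parameters>"""


def scan_blocks(root):
    """Map each Scan_Block_Index to its list of center scan values."""
    blocks = {}
    for scan in root.iter("Scan"):
        index = int(scan.findtext("Scan_Block_Index"))
        blocks[index] = [float(v) for v in scan.findtext("Center_Scan").split()]
    return blocks


def items_with_scan(root):
    """Return (name, center scan values or None, data values) per Data_Item."""
    blocks = scan_blocks(root)
    result = []
    for item in root.iter("Data_Item"):
        values = [float(v) for v in item.findtext("Values").split()]
        link = item.findtext("Scan_Index")  # absent for scalar variables
        centers = blocks[int(link)] if link is not None else None
        result.append((item.findtext("Name"), centers, values))
    return result


if __name__ == "__main__":
    for row in items_with_scan(ET.fromstring(SAMPLE)):
        print(row)
```

Scalar output variables carry no Scan_Index in this sketch, so they are returned with `None` in place of scan values.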
The following table identifies the tags which are utilized by the exportIDFS program for the XML file format option:
XML Tag | Pertinent to IDFS or SCF | Meaning |
---|---|---|
Idfs_Parameters | IDFS | token which identifies the data as IDFS data item(s) |
Scf_Parameters | SCF | token which identifies the data as SCF output variable(s) |
Scan | IDFS and SCF | token which groups together information that is associated with the scan variable for the data item(s) being exported |
Scan_Unit | IDFS and SCF | token which describes the units that the values are expressed in for the scan variable |
Scan_Length | IDFS and SCF | token which defines the number of values returned for the scan variable |
Center_Scan | IDFS and SCF | token which identifies the center scan values associated with the data items being exported |
Scan_Low | IDFS and SCF | token which identifies the lower scan edge values for the scan range associated with the data items being exported |
Scan_High | IDFS and SCF | token which identifies the upper scan edge values for the scan range associated with the data items being exported |
Scan_Block_Index | SCF | token which represents a scan block identifier number. This number is used to link the exported SCF output variable(s) with any scan information pertinent to the data item in question. |
Data_Set | IDFS and SCF | token which groups together information that is associated with each exported data sample |
Number | IDFS and SCF | the exported data sample number, with numbering starting at zero (like a record counter) |
Start_Time | IDFS and SCF | token which defines the start time for the exported data sample |
Stop_Time | IDFS and SCF | token which defines the stop time for the exported data sample |
Data_Item | SCF | token which groups together information that pertains to each selected SCF output variable |
Scan_Index | SCF | token which represents an index value (link) to the scan block information that is pertinent to the SCF output variable named in the Data_Item block in which the token appears |
Sensor | IDFS | token which groups together information that pertains to each selected IDFS data item |
Data_Quality | IDFS | token which identifies the data as the data quality value associated with the Sensor named in the Sensor block in which the token appears |
Start_Azimuthal_Angle | IDFS | token which identifies the data as the start azimuthal angle data associated with the Sensor named in the Sensor block in which the token appears. The start azimuthal angle values are always returned as values between 0 and 360 degrees. |
Stop_Azimuthal_Angle | IDFS | token which identifies the data as the stop azimuthal angle data associated with the Sensor named in the Sensor block in which the token appears. The stop azimuthal angle values could be negative or could be greater than 360 degrees. The stop azimuthal angle values are computed by adding the degrees covered by the accumulation time of each sample to the start azimuthal angle values. |
Pitch_Angle | IDFS | token which identifies the data as the pitch angle data associated with the Sensor named in the Sensor block in which the token appears |
Calibration | IDFS | token which identifies the data as the calibration data associated with the Sensor named in the Sensor block in which the token appears. Unlike Data_Quality, Start_Azimuthal_Angle, Stop_Azimuthal_Angle, and Pitch_Angle, there will be one Calibration block defined for each calibration data set defined for the virtual instrument (IDFS data source) in question. |
Mode_Start_Time | IDFS | token which defines the start time for the instrument status data associated with the exported data sample |
Mode_Stop_Time | IDFS | token which defines the stop time for the instrument status data associated with the exported data sample |
Mode | IDFS | token which identifies the data as the instrument status or mode data. The instrument status data is defined for the virtual instrument (IDFS data source) in question; therefore, this data type is not associated with any particular sensor. |
Name | IDFS and SCF | token which identifies or gives a name to the data parameter being exported |
Unit | IDFS and SCF | token which describes the units that the data values are expressed in for the data parameter being exported |
Data_Length | IDFS and SCF | token which defines the number of data values returned for the data parameter being exported |
Values | IDFS and SCF | token which identifies the actual data values that are being returned for the data parameter being exported |
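
As an alternative to a stylesheet, the tags listed in the table can be read directly. The following Python sketch collects the sensor values of each exported data sample from a small, hypothetical IDFS export. The nesting of the sample document, the placeholder time strings, the unit "counts" and the whitespace-separated value lists are assumptions; only the tag names come from the table above.

```python
import xml.etree.ElementTree as ET

# Hypothetical IDFS export; the real layout and time format may differ.
SAMPLE = """<Idfs_Parameters>
  <Data_Set>
    <Number>0</Number>
    <Start_Time>t0</Start_Time>
    <Stop_Time>t1</Stop_Time>
    <Sensor>
      <Name>Retard/Pk 3</Name>
      <Unit>counts</Unit>
      <Data_Length>2</Data_Length>
      <Values>5 7</Values>
    </Sensor>
  </Data_Set>
</Idfs_Parameters>"""


def read_samples(xml_text):
    """Return one dictionary per Data_Set block."""
    root = ET.fromstring(xml_text)
    samples = []
    for data_set in root.iter("Data_Set"):
        sensors = {}
        for sensor in data_set.findall("Sensor"):
            values = [float(v) for v in sensor.findtext("Values").split()]
            declared = int(sensor.findtext("Data_Length"))
            if declared != len(values):
                raise ValueError("Data_Length does not match Values")
            sensors[sensor.findtext("Name")] = values
        samples.append({
            "number": int(data_set.findtext("Number")),
            "start": data_set.findtext("Start_Time"),
            "stop": data_set.findtext("Stop_Time"),
            "sensors": sensors,
        })
    return samples


if __name__ == "__main__":
    print(read_samples(SAMPLE))
```

Comparing Data_Length with the number of parsed values gives a cheap consistency check on each Sensor block.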
Two XSLT stylesheets have been developed as examples of extracting the data from the XML-formatted file. Both examples generate HTML code to display the data in tabular format. The first stylesheet, entitled IDFS.xsl, can be used to process exported IDFS sensor data. The second stylesheet, entitled SCF.xsl, can be used to process exported SCF output variables.
To test these two stylesheets, an XSLT processor was needed. The principal role of an XSLT processor is to apply an XSLT stylesheet to an XML source file and produce a result "document". The XSLT processor utilized for the testing of the stylesheets was Saxon. Saxon is an open source XSLT processor developed by Michael Kay. It is a Java application, and can be run directly from the command prompt; no web server or browser is required. The HTML source generated by the stylesheets is simply directed to standard output by Saxon. At the command line, standard output was redirected to a file and that file was viewed through a browser for validation. The user is referred to the write-up for Instant Saxon for more information on this XSLT processor.
The remainder of this document gives an in-depth explanation of the options that appear on the various GUIs utilized by the exportIDFS program.
The user must select a project, satellite, experiment, instrument and virtual instrument from which data is to be extracted and exported. To change any of the selected options, click on the buttons on the right hand side. Note that all lineage information under the branch being changed is no longer applicable and must be re-selected. When the IDFS data source is changed, any previous data item definitions are deleted from the list and must be re-defined.
To add a data item to the list, the pull-down Insertion menu is utilized. The menu options indicate the position within the list at which the current data item definition is to be inserted. These options include:
To delete a data item from the list, the pull-down Removal menu is utilized. Currently, this pull-down menu contains just one option
The "Attributes" button invokes the Data Attributes GUI. The "Binning" button invokes the Bins GUI.
In some cases, there may be only one data unit defined. In other cases, a list
of data units will be presented. In either case, the Data Units
option is defaulted to the last data unit defined for the selected data item.
Data quality flags and the variables which describe the instrument state
are automatically exported, if the PIDF defines these data parameters.
Based upon the IDFS source selected, the last three items listed may or
may not apply. If they do apply, the user can select any or all of these
items for exportation. The default is set so that none of these last three
secondary data sources are returned; in other words, the user must "check"
the box in order to include these secondary data sources in the current
export session.
The IDFS file format is simply an ASCII file which contains the selected data parameters. The exportIDFS program utilizes CDF version 2.6 and netCDF version 2.4 software. When a file is exported to the CDF file format, a file with a ".cdf" extension is created. When a file is exported to the netCDF file format, a file with a ".nc" extension is created. When a file is exported to the IDFS file format, a file with a ".idfs" extension is created. When a file is exported to the XML file format, a file with a ".xml" extension is created.
Data Acquisition - defines the averaging scheme used for each exported data sample.
Time - the user specifies the amount of time used to average the data for each exported data sample.
No. Of Samples - defines the number of data samples (sweeps) to average together for each exported data sample.
Input Variable Controller - defines which input variable to use when determining the amount of time to be processed for each iteration of the SCF algorithm.
Time Interval - the amount of time to be acquired for each exported data sample.
This widget is only displayed if the user selected Time as
the Data Acquisition option. The value can be expressed in one
of four time units:
When the CDF file format is selected, a CDF file is created which contains the requested data items and meta data. The meta data is comprised of global-scope attributes that provide information about the data set as an entity. Some of the required global attributes have been selected for potential modification by the user. The values for these global-scope attributes are defaulted by the exportIDFS program. The user need not concern themselves with this information unless a change in the meta data is desired. A brief explanation of the options is given below. In all cases where a list is utilized, the list of options that are selectable are defined according to CDF documentation.
In order to set the time values, enter the values in the boxes that appear next to the time component being set or use the increment / decrement arrows. The stop time must be greater than the start time. The time is initially set to the current time. By Julian convention, January 1 is day 1.
Exports the selected IDFS data items or SCF output variables for the selected time range.