Frequently Asked Questions about data terminology
What is a data set? Is it the same as a data file?
"Data set" is a catch-all term that covers anything related to data, including raw and processed data, grids, images, maps, data spreadsheets and tables, and so on. It is not necessarily the same as a data file: often, a data set comprises a suite of data files collected or generated by one instrument or device. For example, a multibeam bathymetry data set may contain hundreds of individual swath data files collected from a sonar device during one cruise. If that collection of swath files undergoes data processing, the result is classed as a separate, processed data set to distinguish it from the raw data.
What is the difference between raw and processed data?
Raw data are data that have not been changed since acquisition. Examples include the original swath files generated from a sonar system, a real-time GPS-encoded navigation file, and the initial time-series file of temperature values from a heat probe. Editing, cleaning or modifying the raw data results in processed data. For example, raw multibeam data files can be processed to remove outliers and to correct sound velocity errors. The resultant files are considered processed. Similarly, a bathymetric grid generated from either raw or processed swath files is a processed data product.
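As a rough illustration of the raw/processed distinction, the sketch below keeps a set of raw depth values untouched and derives a separate, processed copy with outliers removed. The depth values, the function name remove_outliers, and the median-based threshold are all hypothetical choices made for this example; this is not the procedure used by any particular sonar processing package or by IEDA:MGDS.

```python
from statistics import median


def remove_outliers(depths, max_dev=3.0):
    """Return a new list without values far from the median.

    Hypothetical filter: a value is dropped when its distance from the
    median exceeds max_dev times the median absolute deviation (MAD).
    """
    values = list(depths)
    med = median(values)
    mad = median([abs(d - med) for d in values])
    if mad == 0:
        # All values are (nearly) identical, so there is nothing to remove.
        return values
    return [d for d in values if abs(d - med) / mad <= max_dev]


# Raw data: stored as a tuple so it cannot be modified after "acquisition".
# These depths (in metres) are made up for illustration.
raw_depths = (2501.2, 2499.8, 2500.5, 9999.0, 2500.1, 2498.9)

# Processed data: a separate product derived from the raw values.
processed_depths = remove_outliers(raw_depths)

print(len(raw_depths), "raw values,", len(processed_depths), "processed values")
```

The raw values are never edited in place; the processed result is a new object, mirroring the way a processed data set is kept separate from the raw data it was derived from.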
What is the difference between data curated at IEDA and data curated at other repositories?
Data sets archived in the IEDA:MGDS facility generally fall into two categories: those that have undergone some level of processing, and those for which no national repository currently exists. Note that there may be rare instances of redundancy in data holdings. Usually, this happens when a data set was sent to IEDA:MGDS for archiving before being sent to a national repository. When such duplication is identified, it is eliminated, leaving a link to the data served by the national repository. In rarer cases still, IEDA:MGDS may serve a full data set where a national repository archives only a subset.
We welcome your feedback
We encourage input to help us improve these pages. Please contact us with suggestions.