IEDA: Marine Geoscience Data System

MGDS Data Model Description

Overview & Design Guidelines

The MGDS Relational Database backend was designed as a fully integrated system to handle a suite of marine geoscience data while also serving the specific needs of the individual science communities that it serves. It provides access to data files in a variety of formats and data types, and it enables data discovery through a comprehensive set of metadata: information about field operations and data acquisition, data provenance (linking derived data to the original raw data files), and scientific publications. Where available, information about funding source(s), related projects, and related data and information at external websites is also provided.


Hierarchical Data Model

The MGDS Relational Database follows a hierarchical data model with each data object belonging to a data set, which in turn belongs to an "entry" (Expedition or Compilation). This basic categorization of data is intended to provide full access to relevant information, to help ensure that data acquired by one set of investigators are sufficiently documented for re-use by investigators who did not participate in data acquisition. An Expedition can describe a variety of field operations, for example a Research Cruise, Ship Transit, or Land-based Campaign. Compilations, by contrast, are data synthesis efforts drawing on multiple field programs and/or laboratory or modeling efforts. Each Expedition/Compilation is tied to a single Chief Scientist, to whom download statistics are automatically sent on a bi-annual basis. Additional metadata cataloged to describe Expeditions/Compilations include: Geographic/Temporal Bounds, Technical Reports, Scientific References, Data Management Plans, Related Web Pages, Funding Information, and Participants and their Affiliations.

Data Sets are the basic organizational structure for collections of data objects, grouped generally by device and data type. Each data set is tied to specific investigators, references, and descriptive metadata, including data acquisition parameters such as survey lines, stations, and launches of daughter platforms. Data sets can include links to remote data centers.

Data Objects can be classified into three types: Data files (e.g. raw data generated by a sensor, images, edited files, spreadsheets, text files, and processed data including grids, shapefiles, and visualizations), Observations (e.g. temperature at a sampling location), and Samples. For each data object, we catalog basic metadata (file name, format, size, etc.), spatial/temporal information (optional, but recommended), and URLs to remote data centers (e.g. GenBank, SESAR). The database includes custom child tables to accommodate metadata for special file types (e.g. image, seismic image, grid, geotiff, seismic, swath).
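The entry / data set / data object hierarchy described above can be sketched in a few lines of Python. This is only an illustration of the containment relationships, not the actual MGDS schema: every class name, field name, and example value below (Entry, DataSet, DataObject, "EXAMPLE01", and so on) is a hypothetical stand-in.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataObject:
    """One cataloged item: a data file, an observation, or a sample."""
    name: str
    kind: str  # assumed values: "file", "observation", "sample"
    file_format: Optional[str] = None
    size_bytes: Optional[int] = None
    remote_url: Optional[str] = None  # link to a remote data center, if any


@dataclass
class DataSet:
    """A collection of data objects grouped by device and data type."""
    device_type: str
    data_type: str
    investigators: List[str] = field(default_factory=list)
    objects: List[DataObject] = field(default_factory=list)


@dataclass
class Entry:
    """An Expedition or Compilation, tied to a single Chief Scientist."""
    entry_id: str
    entry_type: str  # "Expedition" or "Compilation"
    chief_scientist: str
    data_sets: List[DataSet] = field(default_factory=list)

    def all_objects(self) -> List[DataObject]:
        # Walk the hierarchy: entry -> data sets -> data objects.
        return [obj for ds in self.data_sets for obj in ds.objects]


# Hypothetical example values, for illustration only.
cruise = Entry("EXAMPLE01", "Expedition", "A. Scientist")
swath = DataSet("multibeam sonar", "bathymetry", ["A. Scientist"])
swath.objects.append(DataObject("line001.raw", "file", "raw", 1024))
swath.objects.append(DataObject("grid001.grd", "file", "grid", 2048))
cruise.data_sets.append(swath)

print([obj.name for obj in cruise.all_objects()])
```

In a relational backend the same relationships would typically be expressed with foreign keys from data object to data set and from data set to entry; the nested lists here simply make the hierarchy easy to see.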




This diagram is a simplified version of the MGDS Database Schema. The hierarchical model described above is represented on the left side of the image, with each data object belonging to a data set, which in turn belongs to an "entry" (Expedition/Compilation). Events and Event Sets are used to describe additional relevant information, including Launches (information about daughter platforms such as dive vehicles, planes, and/or small boats), Stations (locations of multiple physical or digital sampling events over a space/time continuum), and Lines (to describe towed vehicle and/or ship operations that perform surveys along planned survey lines).
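As a rough illustration of the three event kinds named above, the sketch below models Launches, Stations, and Lines as a single event type and collects them into sets. The text does not define how Event Sets are formed, so grouping by entry and event kind is an assumption made here, and all names (EventType, Event, group_event_sets) and example values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class EventType(Enum):
    LAUNCH = "launch"    # daughter platform: dive vehicle, plane, small boat
    STATION = "station"  # location of multiple sampling events
    LINE = "line"        # survey along a planned survey line


@dataclass
class Event:
    name: str
    event_type: EventType
    entry_id: str  # the Expedition/Compilation the event belongs to


def group_event_sets(events: List[Event]) -> Dict[Tuple[str, EventType], List[str]]:
    """Collect event names into sets keyed by (entry, event kind).

    Assumption: an event set groups events of one kind within one entry.
    """
    sets: Dict[Tuple[str, EventType], List[str]] = defaultdict(list)
    for event in events:
        sets[(event.entry_id, event.event_type)].append(event.name)
    return dict(sets)


# Hypothetical example values, for illustration only.
events = [
    Event("Dive-01", EventType.LAUNCH, "EXAMPLE01"),
    Event("Dive-02", EventType.LAUNCH, "EXAMPLE01"),
    Event("STN-A", EventType.STATION, "EXAMPLE01"),
]
event_sets = group_event_sets(events)
print(event_sets[("EXAMPLE01", EventType.LAUNCH)])
```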



Where possible, geospatial and temporal information is recorded in the database to facilitate data discovery and time series analysis. This information is recorded for Expeditions (cruise track, bounding box, start/end dates), Compilations (bounding box, start/end dates), and data/event objects and sets. It supports search in two ways: the user can search within a geographic area, and data objects can be related to named features (e.g. focus sites, physiographic features, vent structures).
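A search over bounding boxes and start/end dates of the kind described above might look like the following minimal sketch. It is not the MGDS search implementation: the names (BoundingBox, EntryFootprint, search) and the example records are hypothetical, and the overlap test assumes boxes that do not cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class BoundingBox:
    west: float   # degrees longitude
    south: float  # degrees latitude
    east: float
    north: float

    def intersects(self, other: "BoundingBox") -> bool:
        # Two boxes overlap when they overlap in both longitude and latitude.
        # Assumption: neither box crosses the antimeridian (west <= east).
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class EntryFootprint:
    entry_id: str
    bbox: BoundingBox
    start: date
    end: date


def search(footprints: List[EntryFootprint], area: BoundingBox,
           start: date, end: date) -> List[str]:
    """Return ids of entries that overlap the area and the date range."""
    return [f.entry_id for f in footprints
            if f.bbox.intersects(area) and f.start <= end and start <= f.end]


# Hypothetical example records, for illustration only.
footprints = [
    EntryFootprint("EXAMPLE01", BoundingBox(-105.0, 8.0, -103.0, 10.0),
                   date(2005, 1, 10), date(2005, 2, 5)),
    EntryFootprint("EXAMPLE02", BoundingBox(-130.0, 44.0, -128.0, 46.0),
                   date(2005, 1, 15), date(2005, 3, 1)),
]
query_area = BoundingBox(-104.5, 9.0, -104.0, 9.9)
hits = search(footprints, query_area, date(2005, 1, 1), date(2005, 12, 31))
print(hits)
```

Relating data objects to named features could reuse the same overlap test, with each feature carrying its own bounding box.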


Data Services

The metadata contained within the MGDS Database enables both simple and complex search functionality. It also enables Web Services to facilitate data discovery and interoperability, KML services, and Data Compliance Reporting.