Metadata harvesters

Project-level Metadata

The project-level metadata is tied to the main CAGS website, where metadata are harvested through the online submission form. The form collects metadata derived and extended from Dublin Core, with a number of descriptive and resource-discovery elements that describe a scientific communication, a notebook, or a dataset, either independently or jointly.
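As a rough illustration of what a Dublin Core-derived record could look like, the sketch below builds a project-level record from the fifteen elements of the Dublin Core Metadata Element Set. The helper name `make_project_record`, the `resource_kinds` extension field, and all field values are hypothetical; they do not reflect the actual fields of the CAGS submission form.

```python
import json

# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_project_record(resource_kinds, **fields):
    """Build a record dict; reject keys that are not Dublin Core elements."""
    unknown = set(fields) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    # Elements that were not supplied are kept as explicit None values.
    record = {name: fields.get(name) for name in DUBLIN_CORE_ELEMENTS}
    # Hypothetical extension: which resources this record describes
    # (communication, notebook, dataset), alone or together.
    record["resource_kinds"] = sorted(resource_kinds)
    return record


record = make_project_record(
    {"dataset", "notebook"},
    title="Example ERT survey",   # illustrative values only
    creator="Jane Doe",
    date="2021-01-01",
)
print(json.dumps(record, indent=2))
```

Listing the resource kinds in one field is only one way to let a single record describe a dataset and a notebook jointly; separate linked records would work as well.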

File-level Metadata

The file-level metadata tool is a GUI designed to help with the initial preparation of a geophysical dataset. Starting from one or multiple input directories, it generates a cleanly structured output directory (without deleting any input files). It then generates suitable metadata from user input and writes this metadata into the directory structure, making the dataset ready for further distribution.
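The idea of building an output directory from several inputs without touching them can be sketched in a few lines. This is a minimal sketch and not the tool's implementation: the function name `prepare_dataset`, the `data/` subdirectory, the numbered folder prefix, and the file name `metadata.json` are all assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path


def prepare_dataset(input_dirs, output_dir, metadata):
    """Copy input directories into a new output directory and add metadata.

    The layout (data/NN_<name>/ plus metadata.json) is a hypothetical example.
    """
    output_dir = Path(output_dir)
    data_dir = output_dir / "data"
    # Refuse to reuse an existing output directory, so nothing is overwritten.
    data_dir.mkdir(parents=True, exist_ok=False)
    for index, src in enumerate(input_dirs):
        src = Path(src)
        # copytree copies the files; the input directories are left untouched.
        shutil.copytree(src, data_dir / f"{index:02d}_{src.name}")
    # Write the user-supplied metadata into the directory structure.
    (output_dir / "metadata.json").write_text(
        json.dumps(metadata, indent=2), encoding="utf-8"
    )
    return output_dir
```

A call such as `prepare_dataset(["raw/survey_a"], "package", {"title": "demo"})` would leave `raw/survey_a` as it was and create `package/data/00_survey_a/` next to `package/metadata.json`.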

Geophysical Metadata Management using a Jupyter Notebook

`Geophysical Metadata Management using a Jupyter Notebook <https://github.com/m-weigand/geometadp.git>`_

Fig. 1 Geophysical Metadata Management using a Jupyter Notebook <https://github.com/m-weigand/geometadp.git>

Data management

Requirements for publishing to CAGS

Data package linter

Workflow for preparing a dataset

  1. Sort data into subdirectories;
  2. Write an ASCII ‘Readme’ file which states the basic characteristics of the dataset (settings, procedures, etc.);
  3. Identify the data formats used and check whether they are documented/importable by available free software;
  4. Formulate a JSON-formatted metadata file following one of our metadata standards using geometadp;
  5. Upload the dataset to a data repository;
  6. Register with CAGS (using the form or by opening an issue in the issue tracker of the GitHub repository).
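Steps 1, 2 and 4 of the workflow above can be checked automatically before upload. The following is a minimal sketch of such a check, not the CAGS data package linter or a geometadp function: the name `lint_package` and the default metadata file name `metadata.json` are assumptions.

```python
import json
from pathlib import Path


def lint_package(package_dir, metadata_name="metadata.json"):
    """Return a list of problems found in a dataset directory (empty if none)."""
    package_dir = Path(package_dir)
    entries = list(package_dir.iterdir())
    problems = []

    # Step 1: data should be sorted into subdirectories.
    if not any(entry.is_dir() for entry in entries):
        problems.append("no data subdirectories found")

    # Step 2: an ASCII Readme file should be present.
    readmes = [
        entry for entry in entries
        if entry.is_file() and entry.name.lower().startswith("readme")
    ]
    if not readmes:
        problems.append("no Readme file found")
    for readme in readmes:
        try:
            readme.read_bytes().decode("ascii")
        except UnicodeDecodeError:
            problems.append(f"{readme.name} is not plain ASCII")

    # Step 4: the metadata file should exist and parse as JSON.
    metadata_path = package_dir / metadata_name
    if not metadata_path.is_file():
        problems.append(f"{metadata_name} is missing")
    else:
        try:
            json.loads(metadata_path.read_text(encoding="utf-8"))
        except ValueError as exc:  # covers invalid JSON and invalid UTF-8
            problems.append(f"{metadata_name} is not valid JSON: {exc}")

    return problems
```

Returning a list of messages rather than raising on the first failure lets all problems be reported in one pass. Checking the metadata against one of the metadata standards (step 4) would need the corresponding schema and is left out here.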