The Electronic Geophysical Year: 2007-2008

Working Group on Best Practices


Terms of Reference

Background:

The rapid growth in the volume of scientific and technical data has led to unprecedented challenges in data and information management. These data have also created unprecedented opportunities to gain new understanding of geophysical and related economic and social processes. Cross-disciplinary research is key to understanding the causes and consequences of environmental phenomena; it has therefore become necessary to integrate data and information across scientific disciplines, platforms, and instruments. Data and information are being used in new ways and by a variety of new users. Scientists who have expertise in one discipline are using data from other disciplines. In addition, society at large is finding greater value in the application of complex data and derived information for navigation, hazard mitigation, sustainability, and many other uses.

Data management systems and processes continually evolve to address these new applications and user requirements, yet significant challenges remain in adequately archiving, describing, and distributing the ever-increasing volume and complexity of data. Groups of scientists, data managers, and technologists are working together to address some of these challenges, and they recommend standards and best practices. These standards and practices focus primarily on data archiving, basic cataloging, distribution, and visualization. Less work has been done to describe best practices for managing environmental data in ways that facilitate data integration and greater scientific understanding. This Working Group focuses on best practices for the management of environmental data for all users, especially those requiring inter-disciplinary data.

Objectives:

The eGY Best Practices Working Group will document a set of best practices that support electronic data stewardship, facilitate electronic data access, promote cross-discipline use, and increase usability. A central goal will be to promote a set of best practices that support the extraction of information and knowledge from disparate data systems. The scope of the activity will cover electronic data stewardship, data access, metadata, cross-discipline exchanges, data mining, and knowledge extraction.

The working group membership is composed of scientists, technologists, and data managers who have created and deployed the recognized premier data systems in different disciplines. The group will distill from the system designs, developers' experiences, and users' experiences a set of common practices that best meet community needs and requirements. We will also use lessons learned from other groups to create a map of potential trouble spots or design limitations. We will describe proven practices and methodologies that help address the following data management functions:

  • Ease of use of environmental data and information,
  • Greater data interoperability for automated applications,
  • Required descriptions of data quality and uncertainties,
  • Increased data availability,
  • Integrated data visualization, intercomparison, and pattern recognition,
  • Consistent access, presentation, and support across multiple repositories,
  • Appropriate data attribution and accountability, and
  • Stewardship of electronic archives.

Since technology evolves quickly and often in unpredictable ways, the practices described should be independent of any one technology. In describing these best practices, we will define a specific vocabulary, consistent with that of other standards and best practices groups, for relevant terms such as access, availability, integration, data mining, and knowledge discovery.

We will publish initial results in appropriate journals and solicit broad public comment.

Caveats:

We will not describe practices for data collection. We will not describe specific standards; these are adequately addressed by other bodies, e.g. the International Organization for Standardization (ISO). We will not certify or evaluate existing data systems or centers.