Data dictionary
Welcome to the website for the data dictionary for the CEC’s Online Interactive, Informational Platform on Climate Change (the “Platform”).
The data dictionary seeks to define a framework to enhance the comparability of air pollutant inventories. The class definitions included in this dictionary are designed to apply to GHG, BC, and CAC inventories, and give a researcher the human- and machine-readable information she would need to make detailed comparisons between inventories conducted for different purposes, with different sectoral, geographic, and temporal boundaries, or using different source categories, estimation methodologies, or emission factors.
By publishing and soliciting feedback for the data dictionary online, we allow users to:
- view and respond to updates to the data dictionary in real time;
- give feedback in a format where officials can easily view and respond to each other’s questions, comments, and concerns; and
- interact with the data dictionary using the medium in which it will one day be published for public consumption.
We are also optimistic that users will find it easier to comment on the online data dictionary, and that we will more easily reach a larger group of stakeholders for feedback.
After reading this homepage, which explains the context of the data dictionary within the larger Platform project, you may wish to navigate to the Classes section, which contains detailed class definitions, or the Representations section, which contains object representations for source categories taken from specific inventories. “Paper” versions of the data dictionary are also available below in .doc, .pdf, and .pages formats. Officials are welcome to give feedback using the paper version and track changes.
The Platform: Timely, Relevant, Vetted, and Comparable Information
The CEC’s Online Interactive, Informational Platform on Climate Change seeks to provide government officials access to information relating to climate change that is “up-to-date” as well as “scientifically-based, regionally-relevant, and vetted internationally” (CEC Project Summaries 2011-12, 27-8). By building a repository of timely, relevant, and vetted information, the CEC addresses a challenge inherent in managing the rapidly changing and technically complex information space relating to climate change.
As currently planned, the Platform will meet the goal of providing government officials and other professionals timely, relevant, and vetted information by publishing data sourced from the authoritative GHG, BC, and CAC inventories of each CEC member government. To make these disparate inventories accessible together within a single system, the Platform must facilitate dataset comparability by including structured metadata relating to source categories, estimation methodologies, and emission factors—among many other data points.
In its December 2012 RFP, the CEC mandated the Platform adopt “a new, dynamic approach to enable data and information exchange, in part through the development and incorporation of cutting-edge semantic web and visualization tools,” and facilitate “comparability among data and analyses,” specifically GHG, BC, and CAC inventories (RFP, 1).
Semantic frameworks and web services are the ideal tools for serving inventory data with structured metadata to promote dataset comparability. They also align with the approach to dataset integration mandated by the CEC Council in Resolutions 97-04 and 01-05. These resolutions, which related to pollutant release and transfer and air pollutant datasets, directed that the CEC take action to promote dataset comparability and dissemination and at the same time acknowledged that each national government must maintain its own “unique process for the collection and modification of environmental data sets.” Semantic frameworks and web services provide the CEC with the means to make unique inventories and other datasets comparable without simplifying, truncating, or reducing the complex metadata officials need to appropriately compare information formed within different contexts.
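As a rough illustration of what a non-reductive comparison could look like, the sketch below keeps the full metadata of two source-category records and reports only where they differ, rather than collapsing them into a common simplified form. The class name `SourceCategoryMetadata`, its fields, and all sample values are hypothetical placeholders chosen for this example; they are not taken from the data dictionary's actual class definitions or from any real inventory.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SourceCategoryMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    inventory: str
    source_category: str
    estimation_methodology: str
    emission_factor_unit: str
    reporting_year: int


def metadata_differences(a, b):
    """Return {field_name: (a_value, b_value)} for every field that differs."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }


# Placeholder values, not data from any real inventory.
record_a = SourceCategoryMetadata(
    "Inventory A", "Road transport", "fuel-based", "kg/GJ", 2010
)
record_b = SourceCategoryMetadata(
    "Inventory B", "Road transport", "distance-based", "g/km", 2010
)

# Both records stay intact; the comparison only highlights the differences.
for name, (left, right) in metadata_differences(record_a, record_b).items():
    print(f"{name}: {left!r} vs {right!r}")
```

A reader comparing the two records would see that the methodology and emission factor unit differ while the source category and year match, which is the kind of context needed before comparing the underlying emission values.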
The Data Dictionary: The Foundation for Platform Development
This data dictionary establishes the necessary foundation for the Platform’s efforts to compare inventory data in a non-reductive manner. In defining a structure common to GHG, BC, and CAC inventories, the data dictionary will aid the future development of the Platform in the following four practical ways:
- The data dictionary will make the Platform more accessible to its users, offering clear, detailed, human-readable data definitions that will allow users with little knowledge of web technologies to make informed comparisons of complex inventory data.
- The classes and properties defined by the data dictionary will inform the structure of the JSON- and XML-based web service queries and responses the Platform will serve. Web services will make inventory datasets more useful by enabling third parties to build a diverse array of applications and reports that access inventory data.
- The classes and properties defined by the data dictionary will inform the structure of an authoritative RDF vocabulary associated with the Platform, and the inclusion of RDFa semantic tags with Platform data. RDF is the key enabling technology of the semantic web, and will allow air pollutant inventory publishers to apply a common, machine-readable vocabulary to their own inventories, dramatically improving the comparability of inventory data.
- The classes and properties defined by the data dictionary will inform the structure of the Platform’s SQL database, as well as its business-logic and presentation layers.
Because the data dictionary has been drafted with multiple purposes in mind, it does not adhere precisely to established conventions relating to RDF or SQL data models, or web service documentation. For instance, class definitions do not include the ID fields one would expect to find in a model of an SQL table, nor does the data dictionary contain visualizations of RDF graphs or examples of XML responses to web service queries. Instead, the data dictionary reflects a generalized approach to conceptual modeling, pairing UML class diagrams with detailed textual definitions of classes and properties. These general, human-readable class definitions form the basis for the technical, machine-readable RDF and web service definitions that will be published at the launch of the Platform.
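To make the relationship between the conceptual model and its technical derivatives more concrete, here is a minimal sketch of how a single class definition might feed a JSON web service response, a set of RDF statements, and an SQL table. Everything in it is an assumption made for illustration: the `EmissionFactor` class and its fields, the `http://example.org/` URIs, and the helper function names are hypothetical, and the Platform's actual vocabulary and schema may look quite different. Note that the conceptual class carries no ID field; the identifier is added only in the SQL layer.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass, fields

# Hypothetical namespace; the Platform's real vocabulary URI is not given here.
VOCAB = "http://example.org/inventory-vocab#"

# Mapping from Python field types to SQLite column types.
SQL_TYPES = {str: "TEXT", float: "REAL", int: "INTEGER"}


@dataclass
class EmissionFactor:
    """Hypothetical conceptual class; it deliberately has no ID field."""
    pollutant: str
    factor_value: float
    unit: str


def to_json(obj):
    """Serialize an instance as a JSON object, as a web service might."""
    return json.dumps(asdict(obj), sort_keys=True)


def to_ntriples(subject_uri, obj):
    """Emit one N-Triples-style statement per property.

    Simplification: values are written as plain literals with no
    escaping or datatype annotations.
    """
    return [
        f'<{subject_uri}> <{VOCAB}{f.name}> "{getattr(obj, f.name)}" .'
        for f in fields(obj)
    ]


def to_sql_ddl(cls):
    """Build a CREATE TABLE statement, adding the ID column SQL expects."""
    columns = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls))
    return f"CREATE TABLE {cls.__name__} (id INTEGER PRIMARY KEY, {columns})"


# Placeholder values, not data from any real inventory.
factor = EmissionFactor(pollutant="CO2", factor_value=1.5, unit="kg/GJ")

print(to_json(factor))
for statement in to_ntriples("http://example.org/factor/1", factor):
    print(statement)

# Check that the generated DDL is accepted by an in-memory SQLite database.
connection = sqlite3.connect(":memory:")
connection.execute(to_sql_ddl(EmissionFactor))
connection.close()
```

Deriving all three representations from one definition is one way to keep them consistent; a production system would more likely use an established RDF library and a schema migration tool than hand-built strings.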
Outreach and Feedback
Outreach to government officials and other stakeholders will remain a top priority during the second stage of this project (July - December, 2013). Government officials and others who are invited to give feedback on the data dictionary are encouraged to do so by posting comments, or by editing the “Talk Page” that is attached to each page of the dictionary (see link at top right of page). For detailed instructions on how to give feedback online, please see the Documentation page.
As of July 2013, no date has been set for a public release of the data dictionary, and it is probable that the data dictionary will be released at the time of the launch of the Platform, no sooner than December 2013.
One issue we will address during the second stage of the project is aligning class and property names and definitions with terms used in vocabularies that officials with CEC member governments have participated in defining. These include the NetCDF Climate and Forecast Metadata Convention (CF Metadata), NASA’s Global Change Master Directory, and the CEC’s North American Environmental Atlas Data Dictionary and metadata definitions.
While government officials provided excellent feedback on the draft data dictionary during the first stage of this project (April - June 2013), many officials were understandably hesitant to comment given that doing so required familiarity with conceptual data modeling. The second stage of the project will provide an easier opportunity for officials to give feedback on the Platform as a whole, as the data dictionary is now accompanied by a wireframes document, and a functional beta version of the Platform should launch no later than November 2013.
To see what recent changes have been made to the data dictionary, see the Summary of Changes page, which contains written notes on recent changes made to the system, or the Recent Changes page, which logs every edit and comment posted to the system. The Open Organize team will frequently check the Recent Changes page to discover and respond to comments and Talk Page commentary posted throughout the system.
Please don’t hesitate to contact the consultant team from Open Organize directly if you want to create new user accounts for the wiki, would like to schedule a training, are experiencing technical issues, or would like to set up a conference call or in-person meeting with the Open Organize team. Below are the three primary points of contact for the project.