Abstract
Environmental informatics uses large, multidimensional, complex datasets to study environmental problems, which can be both discrete and continuous in space or time. These datasets and their requisite metadata can be managed by queryable databases. Geospatial Web application programming interfaces (APIs) provide remote access to dynamic subsets of environmental information. Persistent identifiers make data citable. The storage-computing trade-off is now heavily skewed in favor of moving calculations to the data. Provenance metadata help determine a data object's reliability and trustworthiness.
Rising atmospheric CO₂, the Antarctic ozone hole, and Gulf Stream warm-core rings were all discovered by analyzing long-term datasets. Similar work continues on mapping evapotranspiration and snow water equivalent. In these “fourth paradigm” problems, data (especially data collected operationally) drive hypothesis formation. Making data available requires new discovery mechanisms and policies favoring data sharing. Cloud computing and array-friendly databases will help bring processing to the data. Ubiquitous location sensing and geotagging will help turn citizen scientists into environmental information collectors.