NO2, PM2.5, and O3 daily 1-km x 1-km grid predictions for the U.S.

 

Daily, Monthly, and Annual PM2.5 Concentrations for the Contiguous United States, 1-km Grid (2000 – 2016)

This dataset includes daily predictions of ambient PM2.5 across the contiguous U.S. from 2000 to 2016. These predictions were produced by a geographically-weighted ensemble model that combined predictions from fitted neural network, random forest, and gradient boosting machine learners. The overall 10-fold cross-validated R2 values were 0.86 for daily predictions and 0.89 for annual predictions.
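As a rough illustration of what a geographically-weighted ensemble does, the Python sketch below blends three base-learner predictions using weights that depend on location. The weight function, the longitude cutoff, and all numbers are hypothetical and chosen only for illustration; the Related Publication describes the actual model.

```python
# Illustrative sketch only: blend three base-learner predictions with
# location-dependent weights. All weights and values are made up.

def location_weights(lat, lon):
    """Hypothetical weights for (neural network, random forest, gradient boosting)."""
    # Assumption: one learner is trusted more in the west, another in the east.
    # lat is unused in this toy rule but kept to show weights depend on location.
    if lon < -100.0:
        raw = (0.5, 0.3, 0.2)
    else:
        raw = (0.2, 0.3, 0.5)
    total = sum(raw)
    return tuple(w / total for w in raw)  # normalize so the weights sum to 1


def ensemble_prediction(lat, lon, base_predictions):
    """Weighted average of the base-learner predictions at one grid cell."""
    weights = location_weights(lat, lon)
    return sum(w * p for w, p in zip(weights, base_predictions))


# Hypothetical PM2.5 predictions from the three learners at one grid cell.
blended = ensemble_prediction(42.36, -71.06, (8.0, 9.0, 10.0))
print(round(blended, 2))
```

In this toy version the weights change abruptly at a single longitude; a real geographically-weighted model would typically let them vary smoothly over space.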

Daily predictions and monthly and yearly aggregates are available in RDS and plaintext formats, and the prediction grid is available as CSV and GeoPackage files. Example R code is provided to facilitate reading and common merge operations.
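The dataset ships its own example R code for these operations. Purely as a language-neutral illustration of the idea, the Python sketch below joins daily predictions to grid coordinates by a cell identifier and averages them by month. The column names (cell_id, lon, lat, date, pm25) and the inline sample values are assumptions, not the dataset's actual schema, and treating a monthly aggregate as a simple mean of daily values is also an assumption; consult the dataset documentation for the real layout.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical stand-ins for a grid file and a daily prediction file.
# Real file and column names may differ.
GRID_CSV = """cell_id,lon,lat
1,-71.06,42.36
2,-87.63,41.88
"""

DAILY_CSV = """cell_id,date,pm25
1,2016-01-01,8.0
1,2016-01-02,10.0
1,2016-02-01,6.0
2,2016-01-01,12.0
2,2016-01-02,14.0
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


# Index grid cells by ID so each prediction can be joined to its coordinates.
grid = {row["cell_id"]: row for row in read_rows(GRID_CSV)}

# Group daily values by (cell, year-month), then average each group.
groups = defaultdict(list)
for row in read_rows(DAILY_CSV):
    month = row["date"][:7]  # "YYYY-MM"
    groups[(row["cell_id"], month)].append(float(row["pm25"]))

monthly = [
    {
        "cell_id": cell_id,
        "lon": float(grid[cell_id]["lon"]),
        "lat": float(grid[cell_id]["lat"]),
        "month": month,
        "pm25_mean": mean(values),
    }
    for (cell_id, month), values in sorted(groups.items())
]

for record in monthly:
    print(record)
```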

When using this dataset, please also cite the Related Publication below, which further details the data sources and processes used to produce these predictions.

This is a reupload and slight format change of data that was previously available on SEDAC / EarthData under the name "Daily and Annual PM2.5 Concentrations for the Contiguous United States, 1-km Grids, v1 (2000 – 2016)".

Link to Dataset: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/58C6HG

 

Daily, Monthly, and Annual NO2 Concentrations for the Contiguous United States, 1-km Grid (2000 – 2016)

This dataset includes daily predictions of ambient NO2 across the contiguous U.S. from 2000 to 2016. These predictions were produced by a geographically-weighted ensemble model that combined predictions from fitted neural network, random forest, and gradient boosting machine learners. The overall 10-fold cross-validated R2 values were 0.79 for daily predictions and 0.84 for annual predictions.
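A k-fold cross-validated R2 is commonly computed by predicting each observation from a model that never saw it during fitting, then scoring the pooled held-out predictions. The Python sketch below shows that idea with a simple straight-line model and synthetic data. The fold assignment, the model, and the data are all illustrative assumptions and not the authors' procedure, which may also define R2 differently.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((a - x_bar) ** 2 for a in xs)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def k_fold_r2(x, y, k=10):
    """R2 computed over out-of-fold predictions pooled across k folds."""
    n = len(x)
    held_out = [0.0] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        # Fit only on the observations outside the current fold.
        slope, intercept = fit_line(
            [x[i] for i in train_idx], [y[i] for i in train_idx]
        )
        # Predict the observations that were held out.
        for i in test_idx:
            held_out[i] = slope * x[i] + intercept
    y_bar = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, held_out))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data: a linear signal plus a small alternating offset.
x = [float(i) for i in range(50)]
y = [2.0 * xi + 1.0 + (1.0 if i % 2 else -1.0) for i, xi in enumerate(x)]
r2 = k_fold_r2(x, y)
print(round(r2, 3))
```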

Daily predictions and monthly and yearly aggregates are available in RDS and plaintext formats, and the prediction grid is available as CSV and GeoPackage files. Example R code is provided to facilitate reading and common merge operations.

When using this dataset, please also cite the Related Publication below, which further details the data sources and processes used to produce these predictions.

This is a reupload and slight format change of data that was previously available on SEDAC / EarthData under the name "Daily and Annual NO2 Concentrations for the Contiguous United States, 1-km Grids, v1.10 (2000 – 2016)".

Link to Dataset: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/LUFKYG

 

Daily, Monthly, and Annual 8-Hour Maximum O3 Concentrations for the Contiguous United States, 1-km Grid (2000 – 2016)

This dataset includes daily predictions of 8-hour maximum ambient O3 across the contiguous U.S. from 2000 to 2016. These predictions were produced by a geographically-weighted ensemble model that combined predictions from fitted neural network, random forest, and gradient boosting machine learners. The overall 10-fold cross-validated R2 values were 0.90 for daily predictions and 0.86 for annual predictions.

Daily predictions and monthly and yearly aggregates are available in RDS and plaintext formats, and the prediction grid is available as CSV and GeoPackage files. Example R code is provided to facilitate reading and common merge operations.

When using this dataset, please also cite the Related Publication below, which further details the data sources and processes used to produce these predictions.

This is a reupload and slight format change of data that was previously available on SEDAC / EarthData under the name "Daily and Annual 8-Hour Maximum O3 Concentrations for the Contiguous United States, 1-km Grids, v1.10 (2000 – 2016)".

Link to Dataset: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/DGXCTH 



Annual PM2.5 Component Concentrations for the Contiguous United States, 50-m Urban Grid and 1-km Non-Urban Grid (2000-2019)

This dataset includes annual predictions of PM2.5 components across the contiguous U.S. from 2000 to 2019. Predictions are made on a 50-m grid within urban areas, as defined by the U.S. Census Bureau 2010 TIGER/Line Shapefiles, and on a 1-km grid outside of those areas.

These predictions were produced by a set of super-learning ensemble models in which predictions from a variety of machine learners were used to train a second stage of machine learners. For each component-urbanicity combination, the second-stage machine learner with the highest R2 was chosen as the final model. Overall cross-validated R2 values ranged from 0.856 to 0.952 for the major components (EC, NH4+, NO3-, OC, and SO42-) in urban areas and from 0.878 to 0.957 outside urban areas. For the trace elements (Br, Ca, Cu, Fe, K, Ni, Pb, Si, V, and Zn), they ranged from 0.797 to 0.878 in urban areas and from 0.787 to 0.881 outside urban areas. Please refer to the README file for the exact values for each element and location.
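The selection rule described above, keeping the second-stage learner with the highest R2 for each component-urbanicity combination, can be sketched in a few lines of Python. The learner names and R2 values below are invented placeholders and do not come from the dataset or its publications.

```python
# Hypothetical cross-validated R2 values for candidate second-stage learners.
# Learner names and numbers are invented for illustration only.
cv_r2 = {
    ("EC", "urban"): {"learner_a": 0.91, "learner_b": 0.93, "learner_c": 0.90},
    ("EC", "non-urban"): {"learner_a": 0.92, "learner_b": 0.89, "learner_c": 0.90},
    ("Zn", "urban"): {"learner_a": 0.80, "learner_b": 0.81, "learner_c": 0.84},
}

# For each component-urbanicity combination, keep the learner with the highest R2.
final_models = {
    combo: max(scores, key=scores.get) for combo, scores in cv_r2.items()
}

for (component, area), learner in final_models.items():
    print(component, area, learner, cv_r2[(component, area)][learner])
```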

Predictions are available in RDS and plaintext formats, and the prediction grid is available as CSV and GeoPackage files. Example R code is provided to facilitate reading and common merge operations.

When using this dataset, please also cite the Related Publications in the Metadata tab below (also available in the README file), which further detail the data sources and processes used to produce these predictions.

This is a reupload and format change of data that was previously available on SEDAC / EarthData under the names "Annual Mean PM2.5 Components (EC, NH4, NO3, OC, SO4) 50m Urban and 1km Non-Urban Area Grids for Contiguous U.S., v1 (2000 – 2019)" and "Annual Mean PM2.5 Components Trace Elements (TEs) 50m Urban and 1km Non-Urban Area Grids for Contiguous U.S., v1 (2000 – 2019)".


Link to Dataset: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/3H7DNP