{
    "created": "2024-11-25 17:24:58",
    "updated": "2026-05-06 09:01:00",
    "id": "a0dfbebf-3b25-460d-8a67-4bf244d6bb71",
    "version": 7,
    "ds_topic": null,
    "title_cn": "中国每日地面MDA8 O3重建数据（2005-2021年）",
    "title_en": "Daily Ground MDA8 O3 Reconstruction Data in China (2005-2021)",
    "ds_abstract": "<p>&emsp;&emsp;伴随着 PM2.5的持续下降，O<sub>3</sub>污染问题日益突出，中国政府已将其作为保护气候、生态系统和人类健康的目标。虽然卫星对大气臭氧气柱的检索已经运行了几十年，中国也从2013 年开始在全国范围内对O<sub>3</sub>进行监测，但O<sub>3</sub>的气候学变异性仍然未知，这阻碍了对中国O<sub>3</sub>的长期驱动因素和影响的了解。在此，我们建立了一个极端梯度增强（XGBoost）模型，该模型整合了高分辨率气象数据、卫星痕量气体检索数据等，提供了2005-2021年中国每日地面O<sub>3</sub>的重建数据。模型验证证实了该数据集的稳健性，基于样本的交叉验证的 R<sup>2</sup>为 0.89。来自城市、农村和背景站点的独立历史观测数据也证实了长期变化的准确性。数据集以 0.1<sup>°</sup>×0.1<sup>°</sup>的无间隙网格覆盖了 2005-2021 年的长时段，有助于气候、生态和健康研究。</p>",
    "ds_source": "<p>&emsp;&emsp;本研究中 2013-2021年中国大陆地面臭氧小时观测数据来自中国国家环境监测中心网络。从 2013 年的约 900 个监测站开始，到 2021 年的约 1600 个监测站。我们剔除了O<sub>3</sub>负值，然后计算了每个监测点的O<sub>3</sub>日最大 8 小时平均浓度（MDA8）。由于对流层中 O<sub>3</sub>的丰度受排放（人为和自然）和气象条件的影响，将气象变量、人为排放清单、海拔高度、土地利用、归一化差异植被指数（NDVI）等作为机器学习模型的输入变量。</p>",
    "ds_process_way": "<p>&emsp;&emsp;我们采用极端梯度提升（XGBoost）算法，利用一组相关的预测变量来预测地面臭氧（O<sub>3</sub>）浓度。XGBoost 是一种基于梯度树提升的高效机器学习算法，已被广泛应用于许多任务中。此前，我们曾采用它来纠正化学传输模型的系统偏差。XGBoost 是集合学习技术之一，它结合多个弱模型（如决策树）生成一个强模型，以获得更好的性能。集合学习中的组合方式。为了评估模型在估算每日 MDA8 O<sub>3</sub>浓度方面的整体性能，我们采用了基于样本（样本外）和基于站点（站点外）的 10 倍交叉验证（CV）方法。</p>",
    "ds_quality": "<p>&emsp;&emsp;数据质量良好。</p>",
    "ds_acq_start_time": "2005-01-01 00:00:00",
    "ds_acq_end_time": "2021-12-31 00:00:00",
    "ds_acq_place": "中国",
    "ds_acq_lon_east": null,
    "ds_acq_lat_south": null,
    "ds_acq_lon_west": null,
    "ds_acq_lat_north": null,
    "ds_acq_alt_low": null,
    "ds_acq_alt_high": null,
    "ds_share_type": "open-access",
    "ds_total_size": 105680999,
    "ds_files_count": 3,
    "ds_format": "nc",
    "ds_space_res": null,
    "ds_time_res": "月、年",
    "ds_coordinate": "无",
    "ds_projection": "",
    "ds_thumbnail": "a0dfbebf-3b25-460d-8a67-4bf244d6bb71.png",
    "ds_thumb_from": 0,
    "ds_ref_way": "",
    "paper_ref_way": "",
    "ds_ref_instruction": "用户在使用数据时请在正文中明确声明数据的来源，并在参考文献部分引用本元数据提供的引用方式。",
    "ds_from_station": null,
    "organization_id": "0a4269e1-65f4-45f1-aeba-88ea3068eebf",
    "ds_serv_man": "敏玉芳",
    "ds_serv_phone": "0931-4967596",
    "ds_serv_mail": "ncdc@lzb.ac.cn",
    "doi_value": "",
    "subject_codes": [
        "170.45"
    ],
    "quality_level": 3,
    "publish_time": "2024-11-28 11:01:03",
    "last_updated": "2025-06-30 16:19:16",
    "protected": false,
    "protected_to": null,
    "lang": "zh",
    "cstr": "11738.11.NCDC.ZENODO.DB6646.2024",
    "i18n": {
        "en": {
            "title": "Daily Ground MDA8 O3 Reconstruction Data in China (2005-2021)",
            "ds_format": "nc",
            "ds_source": "<p>&emsp;&emsp;The hourly observation data of ground ozone in Chinese Mainland from 2013 to 2021 in this study are from the network of China National Environmental Monitoring Center. Starting from about 900 monitoring stations in 2013, to about 1600 monitoring stations in 2021. We excluded negative values of O<sub>3</sub>and then calculated the daily maximum 8-hour average concentration (MDA8) of O<sub>3</sub> at each monitoring point. Due to the influence of emissions (anthropogenic and natural) and meteorological conditions on the abundance of O<sub>3</sub>in the troposphere, meteorological variables, anthropogenic emission inventories, altitude, land use, normalized difference vegetation index (NDVI), etc. are used as input variables for the machine learning model.</p>",
            "ds_quality": "<p>&emsp;&emsp;The data quality is good.</p>",
            "ds_ref_way": "",
            "ds_abstract": "<p>&emsp;&emsp;As PM<sub>2.5</sub> concentrations have continued to decline, O<sub>3</sub> pollution has become increasingly prominent and has been targeted by the Government of China to protect climate, ecosystems, and human health. Although satellite retrievals of column O<sub>3</sub> have been available for decades and nationwide monitoring of ground-level O<sub>3</sub> has been in place in China since 2013, the climatological variability of ground-level O<sub>3</sub> remains unknown, which impedes understanding of the long-term drivers and impacts of O<sub>3</sub> pollution in China. Here we develop an eXtreme Gradient Boosting (XGBoost) model integrating high-resolution meteorological data, satellite retrievals of trace gases, etc. to provide reconstructed daily ground-level O<sub>3</sub> over 2005–2021 in China. Model validation confirms the robustness of this dataset, with an R<sup>2</sup> of 0.89 for sample-based cross-validation. The accuracy of the long-term variations has also been confirmed with independent historical observations covering the same period from urban, rural, and background sites. The dataset covers 2005–2021 on gap-free 0.1<sup>°</sup>×0.1<sup>°</sup> grids, which can facilitate climatological, ecological, and health research.</p>",
            "ds_time_res": "Monthly, yearly",
            "ds_acq_place": "China",
            "ds_space_res": "",
            "ds_projection": "",
            "ds_process_way": "<p>&emsp;&emsp;We employed the extreme gradient boosting (XGBoost) (Chen and Guestrin 2016) algorithm to predict ground-level ozone (O<sub>3</sub>) concentrations using a set of related predictor variables. XGBoost is a highly efficient machine learning algorithm based on gradient tree boosting and has been widely applied in many tasks. Previously, we adopted it to correct systematic bias of chemical transport model (Yin et al., 2021). XGBoost is one of the ensemble learning techniques that combine several weak models (e.g., decision trees) to generate a strong model for better performance. The combination ways in ensemble learning  To evaluate the overall model performance in estimating daily MDA8 O<sub>3</sub>concentrations, we adopted both sample-based (out-of-sample) and station-based (out-of-station) 10-fold cross-validation (CV).</p>",
            "ds_ref_instruction": "When using the data, users should clearly state the data source in the main text and cite the dataset in the reference section using the citation provided in this metadata record."
        }
    },
    "submit_center_id": "ncdc",
    "data_level": 0,
    "license_type": "CC BY 4.0",
    "doi_reg_from": "reg_outside",
    "cstr_reg_from": "reg_outside",
    "doi_not_reg_reason": null,
    "cstr_not_reg_reason": null,
    "is_paper_in_submitting": false,
    "ds_topic_tags": [
        "O3",
        "XGBoost",
        "高分辨率气象数据"
    ],
    "ds_subject_tags": [
        "地理学"
    ],
    "ds_class_tags": [],
    "ds_locus_tags": [
        "中国"
    ],
    "ds_time_tags": [
        2005,
        2006,
        2007,
        2008,
        2009,
        2010,
        2011,
        2012,
        2013,
        2014,
        2015,
        2016,
        2017,
        2018,
        2019,
        2020,
        2021
    ],
    "ds_contributors": [
        {
            "true_name": "高蒙",
            "email": "mmgao2@hkbu.edu.hk",
            "work_for": "香港浸会大学",
            "country": "中国"
        }
    ],
    "ds_meta_authors": [
        {
            "true_name": "高蒙",
            "email": "mmgao2@hkbu.edu.hk",
            "work_for": "香港浸会大学",
            "country": "中国"
        }
    ],
    "ds_managers": [
        {
            "true_name": "高蒙",
            "email": "mmgao2@hkbu.edu.hk",
            "work_for": "香港浸会大学",
            "country": "中国"
        }
    ],
    "category": "气象"
}