{
    "created": "2024-01-23 17:00:38",
    "updated": "2026-05-05 20:36:19",
    "id": "61210aeb-e64c-45bc-bfd6-58920080ea70",
    "version": 3,
    "ds_topic": null,
    "title_cn": "中国二氧化硫1km空间分布数据集（2015-2018年）",
    "title_en": "Spatial distribution of various air pollutants in China at 1 km (SO2 2015-01-01:2018-03-21)",
    "ds_abstract": "<p>&emsp;&emsp;目前，在各种大气污染物的模拟中，独立痕量气体的模拟受到关键遥感产品分辨率不足的制约，导致模拟可靠性不足。本研究将空间采样和参数卷积相结合，利用地面观测、遥感产品、气象数据、援助数据和随机ID优化 LightGBM。通过上述技术和大气污染物序列模拟，我们得到了2015-2018年中国大部分地区 SO<sub>2</sub>每日1公里分辨率的无缝产品。通过随机抽样、随机站点抽样、特定区域验证、不同模型比较以及不同研究的横向比较，我们验证了我们对多种大气污染物空间分布的模拟是可靠和有效的。",
    "ds_source": "<p>&emsp;&emsp;本研究使用的数据包括中国SO<sub>2</sub>的每日地面监测数据。此外，还使用了遥感数据、气象数据和辅助数据。</p>",
    "ds_process_way": "<p>&emsp;&emsp;基于随机ID、空间采用、参数卷积和其他方法的多污染物通用机器学习模型，可在预测大气污染物浓度变化时更好地考虑多种因素，并优化对污染物空间分布的估计。我们使用CV和视觉定性分析来评估模型结果。将LightGBM、LSTM 和RF-Ps与我们的模型进行比较，以评估其性能。最后，我们使用 SHAP 尝试解释模型的输出结果。</p>",
    "ds_quality": "<p>&emsp;&emsp;数据质量良好。</p>",
    "ds_acq_start_time": "2015-01-01 00:00:00",
    "ds_acq_end_time": "2018-03-21 00:00:00",
    "ds_acq_place": "中国",
    "ds_acq_lon_east": null,
    "ds_acq_lat_south": null,
    "ds_acq_lon_west": null,
    "ds_acq_lat_north": null,
    "ds_acq_alt_low": null,
    "ds_acq_alt_high": null,
    "ds_share_type": "open-access",
    "ds_total_size": 49746972587,
    "ds_files_count": 1176,
    "ds_format": "gz、GeoTIFF",
    "ds_space_res": "1000",
    "ds_time_res": "日",
    "ds_coordinate": "无",
    "ds_projection": "WGS84",
    "ds_thumbnail": "61210aeb-e64c-45bc-bfd6-58920080ea70.png",
    "ds_thumb_from": 0,
    "ds_ref_way": "",
    "paper_ref_way": "",
    "ds_ref_instruction": "用户在使用数据时请在正文中明确声明数据的来源，并在参考文献部分引用本元数据提供的引用方式。",
    "ds_from_station": null,
    "organization_id": "0a4269e1-65f4-45f1-aeba-88ea3068eebf",
    "ds_serv_man": "敏玉芳",
    "ds_serv_phone": "0931-4967596",
    "ds_serv_mail": "ncdc@lzb.ac.cn",
    "doi_value": "",
    "subject_codes": [
        "170.15"
    ],
    "quality_level": 3,
    "publish_time": "2024-01-26 17:13:59",
    "last_updated": "2025-06-30 16:25:31",
    "protected": false,
    "protected_to": null,
    "lang": "zh",
    "cstr": "11738.11.NCDC.ZENODO.DB4167.2024",
    "i18n": {
        "en": {
            "title": "Spatial distribution of various air pollutants in China at 1 km (SO2 2015-01-01:2018-03-21)",
            "ds_format": "gz, GeoTIFF",
            "ds_source": "<p>&emsp;&emsp;The data used in this study include daily ground monitoring data for SO<sub>2</sub> in China. Additionally, remote sensing data,meteorological data, and auxiliary data are used.</p>",
            "ds_quality": "<p>&emsp;&emsp;The data quality is good.</p>",
            "ds_ref_way": "",
            "ds_abstract": "<p>&emsp;&emsp;In the modeling of atmospheric pollutants, the simulation of individual trace gases is currently constrained by the insufficient resolution of key remote sensing products, which limits simulation reliability. In this study, spatial sampling and parameter convolution are combined to optimize LightGBM, using ground observations, remote sensing products, meteorological data, auxiliary data, and random ID. Through these techniques and a sequential simulation of air pollutants, we produce seamless daily 1-km-resolution products of SO<sub>2</sub> for most parts of China from 2015 to 2018. Through random sampling, random site sampling, area-specific validation, comparisons of different models, and a cross-sectional comparison of different studies, we verify that our simulations of the spatial distribution of multiple atmospheric pollutants are reliable and effective.</p>",
            "ds_time_res": "Daily",
            "ds_acq_place": "China",
            "ds_space_res": "1000",
            "ds_projection": "WGS84",
            "ds_process_way": "<p>&emsp;&emsp;A general machine learning model for multiple pollutants based on random ID, spatial adoption, parameter convolution, and other methods is used to improve the consideration of multiple factors in the prediction of changes in atmospheric pollutant concentrations and optimize estimates of the spatial distributions of pollutants. We evaluate the model results using CV and visual qualitative analysis. LightGBM,LSTM, and RF-Ps are compared to our model to assess its performance. Finally, SHAP is used to try to interpret the output of the model.</p>",
            "ds_ref_instruction": "\r\nWhen using data, users should clearly declare the source of the data in the main text and cite the citation method provided by this metadata in the reference section."
        }
    },
    "submit_center_id": "ncdc",
    "data_level": 0,
    "license_type": "CC BY 4.0",
    "doi_reg_from": "reg_outside",
    "cstr_reg_from": "reg_outside",
    "doi_not_reg_reason": null,
    "cstr_not_reg_reason": null,
    "is_paper_in_submitting": false,
    "ds_topic_tags": [
        "空气污染物",
        "机器学习模型优化",
        "空气污染物空间分布产品",
        "SHAP"
    ],
    "ds_subject_tags": [
        "大气科学"
    ],
    "ds_class_tags": [],
    "ds_locus_tags": [
        "中国"
    ],
    "ds_time_tags": [
        2015,
        2016,
        2017,
        2018
    ],
    "ds_contributors": [
        {
            "true_name": "叶红",
            "email": "hye@iue.ac.cn",
            "work_for": "中国科学院城市环境研究所",
            "country": "中国"
        }
    ],
    "ds_meta_authors": [
        {
            "true_name": "叶红",
            "email": "hye@iue.ac.cn",
            "work_for": "中国科学院城市环境研究所",
            "country": "中国"
        }
    ],
    "ds_managers": [
        {
            "true_name": "叶红",
            "email": "hye@iue.ac.cn",
            "work_for": "中国科学院城市环境研究所",
            "country": "中国"
        }
    ],
    "category": "生态"
}