{
    "created": "2026-04-03 15:53:17",
    "updated": "2026-05-18 16:37:54",
    "id": "5171e5e8-17b1-4b3b-b8a6-81e248a2d671",
    "version": 8,
    "ds_topic": null,
    "title_cn": "东北1km多年冻土地下冰储量图（2023-2024年）",
    "title_en": "1km permafrost ground ice storage map of Northeast China (2023-2024)",
    "ds_abstract": "<p>&emsp;&emsp;本数据集为东北地区多年冻土含冰量分布数据，结合实测钻孔和坑探数据，以地形因子、植被因子、气象因子和土壤与热状况因子数据为驱动，采用机器学习方法（随机森林）进行模型构建。地下冰主要在大兴安岭中部地区较厚。可靠的精度使得此多年冻土含冰量数据可以作为全球变暖背景下东北地区多年冻土含冰量模拟的标定基准和历史参考。数据格式为GeoTIFF，空间分辨率约1km，投影坐标系为Albers_Conical_Equal_Area。",
    "ds_source": "<p>&emsp;&emsp;实测钻孔和坑探数据。\n<p>&emsp;&emsp;环境变量数据：选取了地形、植被、气候、水文及土壤五大类环境变量作为预测因子。\n<p>&emsp;&emsp;地形因子：基于数字高程模型（DEM）提取海拔、坡度、坡向、地形湿度指数、地形位置指数及地形起伏度。\n<p>&emsp;&emsp;植被/水文因子：利用Landsat8遥感产品提取归一化植被指数（NDVI）、增强型植被指数（EVI）、归一化水体指数（NDWI）。\n<p>&emsp;&emsp;气象因子：地表温度（LST）和降水数据则基于产品数据，并以此推算融化指数、冻结指数，作为关键中间变量输入。",
    "ds_process_way": "<p>&emsp;&emsp;对多源栅格数据进行空间配准与标准化处理。统一地理坐标系为 WGS_1984，将空间范围裁剪至研究区边界，并采用重采样技术将所有变量的空间分辨率统一降尺度至1000 m，格式统一为GeoTIFF，确保多源数据在空间上的严格匹配。\n<p>&emsp;&emsp;利用ArcGIS的多值提取至点（Extract Multi-Values to Points）功能，提取每个样本点对应的环境变量数值，构建“样本-环境特征”高维数据集。对提取结果进行完整性检查，剔除含有缺失值（NoData）或异常值的样本，确保模型输入数据的质量。\n<p>&emsp;&emsp;随机森林模型构建：采用分层随机抽样法（Stratified Random Sampling），将数据集划分为训练集（70%）和测试集（30%）。基于Python环境下的scikit-learn机器学习库构建随机森林分类模型。针对样本不平衡问题，将class_weight参数设为 'balanced'。通过网格搜索对关键超参数进行优化，最终确定决策树数量（n_estimators）为1000，最大深度（max_depth）及节点分裂最小样本数（min_samples_split）等参数，并固定随机种子（random_state）以保证结果的可重复性。将环境变量作为特征输入，冻土类型作为标签进行模型训练。\n<p>&emsp;&emsp;精度评价：计算混淆矩阵、总体准确率（Overall Accuracy）、精确率（Precision）、召回率（Recall）、F1-Score及Kappa系数。结果显示，模型具有较高的一致性。",
    "ds_quality": "<p>&emsp;&emsp;本数据采用机器学习方法进行模型构建，计算混淆矩阵、总体准确率（Overall Accuracy）。结果显示，模型具有较高的一致性。",
    "ds_acq_start_time": "2023-01-01 00:00:00",
    "ds_acq_end_time": "2024-12-31 00:00:00",
    "ds_acq_place": "中国东北地区",
    "ds_acq_lon_east": 135.08916666666667,
    "ds_acq_lat_south": 38.730555555555554,
    "ds_acq_lon_west": 111.16111111111111,
    "ds_acq_lat_north": 53.558055555555555,
    "ds_acq_alt_low": null,
    "ds_acq_alt_high": null,
    "ds_share_type": "login-access",
    "ds_total_size": 11142401,
    "ds_files_count": 3,
    "ds_format": "*.tif",
    "ds_space_res": "1km",
    "ds_time_res": "年",
    "ds_coordinate": "WGS84",
    "ds_projection": "Albers_Conical_Equal_Area",
    "ds_thumbnail": "5171e5e8-17b1-4b3b-b8a6-81e248a2d671.jpg",
    "ds_thumb_from": 2,
    "ds_ref_way": "",
    "paper_ref_way": "",
    "ds_ref_instruction": "",
    "ds_from_station": null,
    "organization_id": "221ebf56-1b0b-4574-972b-1fb6d3cf1be7",
    "ds_serv_man": "敏玉芳",
    "ds_serv_phone": "0931-4967596",
    "ds_serv_mail": "ncdc@lzb.ac.cn",
    "doi_value": "",
    "subject_codes": [
        "170.45"
    ],
    "quality_level": 3,
    "publish_time": "2026-04-03 16:24:00",
    "last_updated": "2026-05-11 10:36:58",
    "protected": false,
    "protected_to": null,
    "lang": "zh",
    "cstr": "11738.11.NCDC.NIEER.DB7274.2026",
    "i18n": {
        "en": {
            "title": "1km permafrost ground ice storage map of Northeast China (2023-2024)",
            "ds_format": "*.tif",
            "ds_source": "<p>&emsp;Actual drilling and pit exploration data.\r\n<p>&emsp;Environmental variable data: Five major categories of environmental variables including terrain, vegetation, climate, hydrology, and soil were selected as predictive factors.\r\n<p>&emsp; &emsp; Terrain factor: Extracting altitude, slope, aspect, terrain humidity index, terrain position index, and terrain undulation based on digital elevation model (DEM).\r\n<p>&emsp;Vegetation/hydrological factors: Use Landsat 8 remote sensing products to extract normalized vegetation index (NDVI), enhanced vegetation index (EVI), and normalized water index (NDWI).\r\n<p>&emsp;Meteorological factors: Surface temperature (LST) and precipitation data are based on product data, and are used to calculate melting and freezing indices as key intermediate variable inputs.",
            "ds_quality": "<p>&emsp;This data is modeled using machine learning methods to calculate confusion matrix and overall accuracy. The results show that the model has high consistency.",
            "ds_ref_way": "",
            "ds_abstract": "<p>&emsp;This dataset is the distribution data of ice content in permafrost in Northeast China, combined with measured drilling and pit exploration data. Driven by terrain factors, vegetation factors, meteorological factors, and soil and thermal conditions factors, a machine learning method (random forest) is used to construct the model. The underground ice is mainly thicker in the central area of the Greater Khingan Range. The reliable accuracy enables this permafrost ice content data to serve as a calibration benchmark and historical reference for simulating permafrost ice content in Northeast China under the background of global warming. The data format is GeoTIFF, with a spatial resolution of approximately 1km, and the projection coordinate system is Albers_ConicalEqual-Area.",
            "ds_time_res": "",
            "ds_acq_place": "Northeast China",
            "ds_space_res": "",
            "ds_projection": "",
            "ds_process_way": "<p>&emsp;Perform spatial registration and standardization on multi-source raster data. The unified geographic coordinate system is WGS1984, and the spatial range is cropped to the boundary of the study area. The spatial resolution of all variables is uniformly downscaled to 1000 m using resampling techniques, and the format is unified as GeoTIFF to ensure strict spatial matching of multi-source data.\r\n<p>&emsp;Using ArcGIS' Extract Multi Values to Points feature, extract the environmental variable values corresponding to each sample point and construct a high-dimensional dataset of \"sample environment features\". Perform integrity checks on the extracted results, eliminate samples containing missing values (NoData) or outliers, and ensure the quality of the input data for the model.\r\n<p>&emsp;Random Forest Model Construction: Stratified Random Sampling is used to divide the dataset into a training set (70%) and a testing set (30%). Build a random forest classification model based on the scikit learn machine learning library in Python environment. To address the issue of sample imbalance, set the class_ceight parameter to 'balanced'. Optimize key hyperparameters through grid search, and ultimately determine the number of decision trees (n_estimators) to be 1000, the maximum depth (x_depth), and the minimum number of samples for node splitting (min_stamples_split), and fix the random seed (random_state) to ensure the reproducibility of the results. Use environmental variables as feature inputs and frozen soil types as labels for model training.\r\n<p>&emsp;Accuracy evaluation: Calculate confusion matrix, Overall Accuracy, Precision, Recall, F1 Score, and Kappa coefficient. The results show that the model has high consistency.",
            "ds_ref_instruction": ""
        }
    },
    "submit_center_id": "ncdc",
    "data_level": 0,
    "recommendation_value": 0,
    "license_type": "https://creativecommons.org/licenses/by/4.0/",
    "doi_reg_from": "reg_local",
    "cstr_reg_from": "reg_local",
    "doi_not_reg_reason": null,
    "cstr_not_reg_reason": null,
    "is_paper_in_submitting": false,
    "ds_topic_tags": [
        "多年冻土",
        "地下冰储量分布",
        "1km"
    ],
    "ds_subject_tags": [
        "地理学"
    ],
    "ds_class_tags": [],
    "ds_locus_tags": [
        "中国东北地区"
    ],
    "ds_time_tags": [
        2023,
        2024
    ],
    "ds_contributors": [
        {
            "true_name": "邹德富",
            "email": "defuzou@lzb.ac.cn",
            "work_for": "中国科学院西北生态环境资源研究院",
            "country": "中国"
        },
        {
            "true_name": "赵林",
            "email": "lzhao@nuist.edu.cn",
            "work_for": "南京信息工程大学",
            "country": "中国"
        }
    ],
    "ds_meta_authors": [
        {
            "true_name": "焦雪铃",
            "email": "jiaoxueling@nuist.edu.cn",
            "work_for": "南京信息工程大学",
            "country": "中国"
        }
    ],
    "ds_managers": [
        {
            "true_name": "王翀",
            "email": "wangchong2022@nuist.edu.cn",
            "work_for": "南京信息工程大学",
            "country": "中国"
        }
    ],
    "category": "冻土"
}