This dataset maps the distribution of ground ice content in permafrost across Northeast China. It was built from measured drilling and pit exploration data, with terrain, vegetation, meteorological, and soil and thermal condition factors as predictors in a machine learning (random forest) model. Ground ice is thickest mainly in the central part of the Greater Khingan Range. Given its reliable accuracy, the dataset can serve as a calibration benchmark and historical reference for simulating permafrost ice content in Northeast China under global warming. The data are in GeoTIFF format with a spatial resolution of approximately 1 km, and the projected coordinate system is Albers_ConicalEqual-Area.
| Collection time | 2023/01/01 - 2024/12/31 |
|---|---|
| Collection place | Northeast China |
| Data size | 10.6 MiB |
| Data format | *.tif |
| Spatial resolution | 1 km |
| Temporal resolution | Yearly |
| Coordinate system | WGS84 |
Measured drilling and pit exploration data.
Environmental variable data: Five major categories of environmental variables (terrain, vegetation, climate, hydrology, and soil) were selected as predictors.
Terrain factors: Elevation, slope, aspect, topographic wetness index (TWI), topographic position index (TPI), and terrain relief were extracted from a digital elevation model (DEM).
Vegetation/hydrological factors: The normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), and normalized difference water index (NDWI) were extracted from Landsat 8 remote sensing products.
Meteorological factors: Land surface temperature (LST) and precipitation were taken from existing data products and used to calculate thawing and freezing indices, which serve as key intermediate input variables.
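The index-based predictors listed above can be illustrated with a minimal sketch. It assumes surface reflectance values scaled to 0-1, the standard NDVI and EVI formulas, the McFeeters (green/NIR) form of NDWI, and degree-day sums of daily mean temperature for the freezing and thawing indices. The dataset description does not state which NDWI variant or index definition was actually used, so the function names and sample values here are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else float("nan")


def evi(nir, red, blue):
    """Enhanced vegetation index with the commonly used coefficients."""
    denom = nir + 6.0 * red - 7.5 * blue + 1.0
    return 2.5 * (nir - red) / denom if denom != 0 else float("nan")


def ndwi(green, nir):
    """Normalized difference water index (McFeeters form, an assumption)."""
    denom = green + nir
    return (green - nir) / denom if denom != 0 else float("nan")


def freeze_thaw_indices(daily_mean_temps_c):
    """Return (freezing index, thawing index) in degree-days.

    The thawing index sums daily means above 0 degrees C; the freezing
    index sums the magnitudes of daily means below 0 degrees C.
    """
    thawing = sum(t for t in daily_mean_temps_c if t > 0)
    freezing = sum(-t for t in daily_mean_temps_c if t < 0)
    return freezing, thawing


# Illustrative reflectance values for one pixel (not taken from the dataset).
pixel = {"blue": 0.03, "green": 0.06, "red": 0.05, "nir": 0.35}
pixel_ndvi = ndvi(pixel["nir"], pixel["red"])
pixel_evi = evi(pixel["nir"], pixel["red"], pixel["blue"])
pixel_ndwi = ndwi(pixel["green"], pixel["nir"])

# Illustrative daily mean temperatures in degrees C (not taken from the dataset).
freezing_index, thawing_index = freeze_thaw_indices([-10.0, -5.0, 0.0, 3.0, 7.5])
```

In practice these functions would be applied per pixel to whole raster bands rather than to single values.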
Spatial registration and standardization were performed on the multi-source raster data. All layers use the WGS 1984 geographic coordinate system and are clipped to the study area boundary. All variables were resampled to a uniform spatial resolution of 1000 m and stored as GeoTIFF to ensure strict spatial alignment of the multi-source data.
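The description does not say which resampling method was applied. As one possible illustration, the sketch below aggregates a finer grid to a coarser one by block averaging with an integer factor. The function name `block_mean`, the 500 m input cell size, and the sample grid are all hypothetical.

```python
def block_mean(grid, factor, nodata=None):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks.

    Cells equal to `nodata` are ignored; a block with no valid cells
    yields `nodata`. Trailing rows/columns that do not fill a block are dropped.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            cells = [
                grid[r + i][c + j]
                for i in range(factor)
                for j in range(factor)
                if grid[r + i][c + j] != nodata
            ]
            out_row.append(sum(cells) / len(cells) if cells else nodata)
        out.append(out_row)
    return out


# Hypothetical 4 x 4 grid at 500 m, aggregated to 2 x 2 at 1000 m.
fine = [
    [1, 3, 5, 7],
    [1, 3, 5, 7],
    [2, 2, 6, 6],
    [2, 2, 6, 6],
]
coarse = block_mean(fine, factor=2)
```

Block averaging suits continuous variables such as elevation or temperature; categorical layers would normally call for nearest-neighbor or majority resampling instead.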
Using the ArcGIS Extract Multi Values to Points tool, the environmental variable values at each sample point were extracted to build a high-dimensional "sample-environment feature" dataset. The extracted results were checked for completeness, and samples containing missing values (NoData) or outliers were removed to ensure the quality of the model input data.
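The completeness check described above can be sketched in plain Python once the extracted table has been exported. The NoData sentinel `-9999.0`, the field names, and the sample records are assumptions for illustration. The outlier criteria are not given in the description, so this sketch only filters missing values.

```python
import math

NODATA = -9999.0  # hypothetical NoData sentinel; the real value depends on the rasters


def is_valid(value):
    """True if the value is present, not NaN, and not the NoData sentinel."""
    return value is not None and not math.isnan(value) and value != NODATA


def drop_incomplete(samples, feature_names):
    """Keep only samples whose listed features are all valid."""
    return [
        sample
        for sample in samples
        if all(is_valid(sample.get(name)) for name in feature_names)
    ]


# Illustrative records (not taken from the dataset).
samples = [
    {"id": 1, "elevation": 820.0, "ndvi": 0.61},
    {"id": 2, "elevation": NODATA, "ndvi": 0.55},
    {"id": 3, "elevation": 640.0, "ndvi": float("nan")},
    {"id": 4, "elevation": 910.0, "ndvi": 0.48},
    {"id": 5, "elevation": 700.0},  # ndvi missing entirely
]
clean = drop_incomplete(samples, ["elevation", "ndvi"])
clean_ids = [sample["id"] for sample in clean]
```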
Random forest model construction: Stratified random sampling was used to split the dataset into a training set (70%) and a testing set (30%). A random forest classification model was built with the scikit-learn machine learning library in Python. To address class imbalance, the class_weight parameter was set to 'balanced'. Key hyperparameters were optimized by grid search: the number of decision trees (n_estimators) was set to 1000, and the maximum depth (max_depth) and the minimum number of samples required to split a node (min_samples_split) were also tuned. The random seed (random_state) was fixed to ensure reproducible results. Environmental variables were used as input features and frozen soil types as labels for model training.
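The workflow above might look roughly like the following scikit-learn sketch. It runs on synthetic data from `make_classification` rather than the real samples, uses 100 trees instead of the 1000 reported so that it runs quickly, and searches a small hypothetical grid. The grid values, the 3-fold cross-validation, the macro-F1 scoring, and the seed 42 are assumptions, not settings stated in the description.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced 3-class data standing in for the sample table.
X, y = make_classification(
    n_samples=300,
    n_features=8,
    n_informative=5,
    n_classes=3,
    weights=[0.6, 0.3, 0.1],
    random_state=42,
)

# Stratified 70/30 split so each class keeps its share in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# class_weight="balanced" reweights classes inversely to their frequency.
base_model = RandomForestClassifier(
    n_estimators=100,  # the description reports 1000; reduced here for speed
    class_weight="balanced",
    random_state=42,
)

# Hypothetical search grid over the two tuned hyperparameters.
param_grid = {"max_depth": [5, None], "min_samples_split": [2, 5]}
search = GridSearchCV(base_model, param_grid, cv=3, scoring="f1_macro")
search.fit(X_train, y_train)

model = search.best_estimator_
test_accuracy = model.score(X_test, y_test)
```

Fixing `random_state` in both the split and the model is what makes repeated runs return the same fitted forest.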
Accuracy evaluation: The confusion matrix, overall accuracy, precision, recall, F1 score, and Kappa coefficient were calculated. The results indicate high consistency between model predictions and the test samples.
The data were produced with a machine learning model and evaluated using a confusion matrix and overall accuracy. The results show that the model has high consistency.
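For reference, the listed metrics can all be derived from a confusion matrix, as in the sketch below. The matrix is made up for illustration and does not represent this dataset's results. Rows are taken as true classes and columns as predicted classes, which is an assumed convention.

```python
def classification_metrics(cm):
    """Compute accuracy, per-class precision/recall/F1, and Cohen's kappa.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]  # true-class totals
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted totals
    diagonal = [cm[k][k] for k in range(n)]

    overall_accuracy = sum(diagonal) / total
    precision = [diagonal[k] / col_sums[k] if col_sums[k] else 0.0 for k in range(n)]
    recall = [diagonal[k] / row_sums[k] if row_sums[k] else 0.0 for k in range(n)]
    f1 = [
        2 * p * r / (p + r) if (p + r) else 0.0
        for p, r in zip(precision, recall)
    ]

    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (
        (overall_accuracy - expected) / (1 - expected)
        if expected != 1
        else float("nan")
    )
    return {
        "overall_accuracy": overall_accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Made-up 3-class confusion matrix, for illustration only.
example_cm = [
    [50, 5, 0],
    [4, 30, 6],
    [1, 4, 20],
]
metrics = classification_metrics(example_cm)
```

Kappa discounts the agreement that would occur by chance, so it is usually lower than overall accuracy when classes are imbalanced.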
| # | Number | Name | Type |
|---|---|---|---|
| 1 | 2022FY100700 | Survey of Permafrost Conditions and Freeze-Thaw Damage in the High-Latitude Regions of Northeast China | Basic Resource Survey Project |
This work is licensed under CC BY 4.0 (Creative Commons Attribution 4.0 International License).
| # | title | file size |
|---|---|---|
| 1 | 东北1km多年冻土地下冰储量图(2023-2024年).jpg | 2.3 MiB |
| 2 | 东北1km多年冻土地下冰储量图(2023-2024年).tif | 8.3 MiB |
| 3 | 东北1km多年冻土地下冰储量图(2023-2024年)_元数据.docx | 96.9 KiB |
| 4 | 东北1km多年冻土地下冰储量图(2023-2024年)_说明文档.docx | 22.8 KiB |
©Copyright 2005-. Northwest Institute of Eco-Environment and Resources, CAS.
Donggang West Road 320, Lanzhou, Gansu, China (730000)

