One consequence of permafrost degradation is the development of thermokarst lakes, which are potential hotspots for CO₂ and CH₄ emissions. However, because of limited data resolution, small thermokarst lakes (<500 m²) have traditionally been excluded from regional carbon emission estimates. This study combines a deep learning method with high-resolution (3 m) PlanetScope imagery and manual corrections to produce a new high-accuracy dataset of thermokarst lakes in the permafrost regions of the Qinghai-Tibet Plateau (QTP). A total of 329,848 thermokarst lakes were detected, covering an area of approximately 2,893 km². This exceeds existing estimates of identified lakes by 51.4%, uncovering a critical shortcoming in prior thermokarst lake estimates for the permafrost region of the QTP.
| Field | Value |
|---|---|
| Collection period | 2020/07/01 - 2020/08/31 |
| Collection location | Permafrost region of the Qinghai–Tibet Plateau |
| Altitude | 4219.0 m - 5047.0 m |
| Data size | 113.0 MiB |
| Data format | *.shp |
| Coordinate system | WGS84 |
The dataset was constructed in two stages: deep learning–based mapping and manual refinement. In the mapping stage, thermokarst lakes were extracted using a U-Net model, a classic encoder-decoder architecture whose skip connections combine low-level and high-level semantic information and thereby improve segmentation performance. The inputs were four-channel images (red, green, blue, and near infrared), and the outputs were binary images indicating the presence of lakes. The model was trained on existing thermokarst lake data, using a comprehensive set of samples from the permafrost regions of the QTP that covers a variety of lake shapes and sizes. Factors that could reduce extraction accuracy, such as mountain shadows, ice and snow, and cloud cover, were also taken into account. During training, samples with diverse features were added iteratively based on feedback from the prediction results, and this cycle continued until training performance no longer improved. A total of 4452 training samples were generated, with a training-to-testing ratio of 7:3.

After model inference, the binary output images were stitched and vectorized. Glacial lakes were excluded based on China’s second glacier survey, and only lakes with an area of less than 3 km² were retained as thermokarst lakes. To further ensure the accuracy and reliability of the dataset, the extracted vector polygons were systematically checked and manually refined, resulting in a high-precision thermokarst lake dataset.
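The 7:3 sample split and the post-inference filtering described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the record fields (`id`, `area_km2`), the random seed, and the example lakes are all hypothetical, and the real workflow operates on vector polygons and a glacier inventory rather than on plain dictionaries.

```python
import random


def split_samples(sample_ids, train_fraction=0.7, seed=0):
    """Shuffle sample IDs and split them into training and testing sets."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # the seed is an arbitrary assumption
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def filter_thermokarst(lakes, glacial_ids, max_area_km2=3.0):
    """Keep lakes that are not glacial and are smaller than the area threshold."""
    return [
        lake
        for lake in lakes
        if lake["id"] not in glacial_ids and lake["area_km2"] < max_area_km2
    ]


# 4452 samples at a 7:3 ratio
train_ids, test_ids = split_samples(range(4452))
print(len(train_ids), len(test_ids))  # 3116 1336

# Hypothetical candidate polygons after vectorization
candidates = [
    {"id": "a", "area_km2": 0.0004},  # 400 m2 pond, kept
    {"id": "b", "area_km2": 3.5},     # exceeds the 3 km2 threshold, dropped
    {"id": "c", "area_km2": 0.2},     # flagged as glacial, dropped
]
kept = filter_thermokarst(candidates, glacial_ids={"c"})
print([lake["id"] for lake in kept])  # ['a']
```

In practice the glacial-lake flag would come from a spatial overlay with the glacier survey, and areas would be computed in a projected coordinate system rather than directly from WGS84 coordinates.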
The F1-score and intersection over union (IoU) were adopted as the primary metrics for evaluating model performance. The F1-score and IoU reached 0.9541 and 0.9495, respectively, indicating strong automated extraction performance in terms of both lake identification completeness and spatial consistency. Nevertheless, quantitative evaluation and visual interpretation both show that classification uncertainties remain in localized areas. The automated classification outputs were therefore manually corrected and refined to ensure the overall accuracy and quality of the final dataset.
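For readers unfamiliar with the two metrics, the following sketch shows how they are commonly computed for a binary lake mask. It assumes pixel-wise counting on flattened masks. The page does not state how the reported scores were aggregated (for example, over all test pixels or averaged per tile), so the function name `f1_and_iou` and the toy masks are illustrative only.

```python
def f1_and_iou(pred, truth):
    """Return (F1, IoU) for two equal-length binary masks (1 = lake, 0 = background)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    if tp + fp + fn == 0:
        # Both masks contain no lake pixels; treating this as perfect
        # agreement is a convention, not something stated in the source.
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)  # harmonic mean of precision and recall
    iou = tp / (tp + fp + fn)         # intersection over union
    return f1, iou


# Toy flattened masks: 2 true positives, 1 false positive, 1 false negative
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
f1, iou = f1_and_iou(pred, truth)
print(round(f1, 4), iou)  # 0.6667 0.5
```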
| # | Number | Name | Type |
|---|---|---|---|
| 1 | 42161160328 | Thermokarst Landforms on the Qinghai-Tibet Plateau: spatio-temporal evolution and future changes | National Natural Science Foundation of China |
| 2 | 26RCKA014 | Research on remote sensing big data and its applications for permafrost in the Qilian Mountains | other |
| # | Dataset title |
|---|---|
| 1 | TL_QTP_planet |
Keywords: Thermokarst lakes; Deep learning; High-resolution satellite imagery
©Copyright 2005-. Northwest Institute of Eco-Environment and Resources, CAS.
Donggang West Road 320, Lanzhou, Gansu, China (730000)

