{
    "created": "2025-07-28 09:53:12",
    "updated": "2026-06-20 10:11:28",
    "id": "f87695e1-7a53-45a7-9203-e82d36e5099a",
    "version": 6,
    "ds_topic": null,
    "title_cn": "用于农田分割的大规模图像文本数据集基准",
    "title_en": "A large-scale image-text dataset benchmark for farmland segmentation",
    "ds_abstract": "<p>&emsp;&emsp;理解和掌握农田的时空特征对于准确的农田分割至关重要。传统的深度学习范式仅依赖于标注数据，在表示农田元素与周围环境之间的空间关系方面存在局限性。它难以有效建模农田的动态时空演变和空间异质性。语言作为一种结构化的知识载体，能够明确表达农田的时空特征，如其形状、分布以及周围环境信息。因此，基于语言驱动的学习范式可有效缓解农田时空异质性带来的挑战。然而，在农田遥感影像领域，目前尚缺乏支持该研究方向的综合性基准数据集。为填补这一空白，我们引入了基于语言的农田描述，并开发了FarmSeg-VL数据集——首个专为时空农田分割设计的细粒度图像-文本数据集。首先，本文提出了一种半自动标注方法，能够准确为每张图像分配标题，确保数据质量和语义丰富性，同时提高数据集构建的效率。其次，FarmSeg-VL 具有显著的时空特征。在时间维度上，它涵盖了四季。在空间维度上，它覆盖了中国八个典型的农业地区，总面积约为 4,300 平方公里。此外，在注释方面，FarmSeg-VL涵盖了农田丰富的时空特征，包括其固有属性、物候特征、空间分布、地形地貌特征以及周边环境的分布。最后，我们对基于FarmSeg-VL训练的视觉语言模型和仅依赖标签的深度学习模型进行了性能分析，证明其作为农田分割标准基准的潜力。",
    "ds_source": "<p>&emsp;&emsp;数据来源于：https://zenodo.org/records/15099885",
    "ds_process_way": "<p>&emsp;&emsp;模型计算得到。",
    "ds_quality": "<p>&emsp;&emsp;数据质量良好。",
    "ds_acq_start_time": null,
    "ds_acq_end_time": null,
    "ds_acq_place": "中国",
    "ds_acq_lon_east": null,
    "ds_acq_lat_south": null,
    "ds_acq_lon_west": null,
    "ds_acq_lat_north": null,
    "ds_acq_alt_low": null,
    "ds_acq_alt_high": null,
    "ds_share_type": "login-access",
    "ds_total_size": 9346623624,
    "ds_files_count": 67813,
    "ds_format": "*.png",
    "ds_space_res": "",
    "ds_time_res": "",
    "ds_coordinate": "无",
    "ds_projection": "",
    "ds_thumbnail": "f87695e1-7a53-45a7-9203-e82d36e5099a.png",
    "ds_thumb_from": 2,
    "ds_ref_way": "",
    "paper_ref_way": "",
    "ds_ref_instruction": "",
    "ds_from_station": null,
    "organization_id": "0a4269e1-65f4-45f1-aeba-88ea3068eebf",
    "ds_serv_man": "敏玉芳",
    "ds_serv_phone": "0931-4967596",
    "ds_serv_mail": "ncdc@lzb.ac.cn",
    "doi_value": "",
    "subject_codes": [
        "170.4510"
    ],
    "quality_level": 3,
    "publish_time": "2025-07-30 15:55:24",
    "last_updated": "2026-01-14 10:07:57",
    "protected": false,
    "protected_to": null,
    "lang": "zh",
    "cstr": "11738.11.NCDC.ZENODO.DB6935.2025",
    "i18n": {
        "en": {
            "title": "A large-scale image-text dataset benchmark for farmland segmentation",
            "ds_format": "",
            "ds_source": "<p>&emsp; &emsp; Data source: https://zenodo.org/records/15099885",
            "ds_quality": "<p>&emsp; &emsp; The data quality is good.",
            "ds_ref_way": "",
            "ds_abstract": "<p>&emsp; &emsp; Understanding the spatiotemporal characteristics of farmland is essential for accurate farmland segmentation. The traditional deep learning paradigm, which relies solely on labeled data, has limited ability to represent the spatial relationships between farmland elements and their surrounding environment, and it struggles to model the dynamic temporal evolution and spatial heterogeneity of farmland. Language, as a structured knowledge carrier, can explicitly express the spatiotemporal characteristics of farmland, such as its shape, distribution, and surrounding environmental information. A language-driven learning paradigm can therefore effectively alleviate the challenges posed by the spatiotemporal heterogeneity of farmland. However, no comprehensive benchmark dataset currently supports this research direction for farmland remote sensing imagery. To fill this gap, we introduce language-based descriptions of farmland and develop FarmSeg-VL, the first fine-grained image-text dataset designed for spatiotemporal farmland segmentation. First, we propose a semi-automatic annotation method that accurately assigns a caption to each image, ensuring data quality and semantic richness while improving the efficiency of dataset construction. Second, FarmSeg-VL has pronounced spatiotemporal coverage: temporally, it covers all four seasons; spatially, it covers eight typical agricultural regions across China, with a total area of approximately 4,300 km². In addition, the captions in FarmSeg-VL cover rich spatiotemporal characteristics of farmland, including its inherent properties, phenological characteristics, spatial distribution, topographic and geomorphic features, and the distribution of the surrounding environment. Finally, we present a performance analysis of vision-language models and label-only deep learning models trained on FarmSeg-VL, demonstrating its potential as a standard benchmark for farmland segmentation.</p>",
            "ds_time_res": "",
            "ds_acq_place": "China",
            "ds_space_res": "",
            "ds_projection": "",
            "ds_process_way": "<p>&emsp; &emsp; Calculated by the model.",
            "ds_ref_instruction": ""
        }
    },
    "submit_center_id": "ncdc",
    "data_level": 0,
    "recommendation_value": 0,
    "license_type": "https://creativecommons.org/licenses/by/4.0/",
    "doi_reg_from": "reg_outside",
    "cstr_reg_from": "reg_outside",
    "doi_not_reg_reason": null,
    "cstr_not_reg_reason": null,
    "is_paper_in_submitting": false,
    "belong_to_nieer": false,
    "ds_topic_tags": [
        "FarmSeg-VL",
        "农田",
        "语义分割"
    ],
    "ds_subject_tags": [
        "自然地理学"
    ],
    "ds_class_tags": [],
    "ds_locus_tags": [
        "中国"
    ],
    "ds_time_tags": [],
    "ds_contributors": [
        {
            "true_name": "吴海洋",
            "email": "245001024@csu.edu.cn",
            "work_for": "中南大学",
            "country": "中国"
        }
    ],
    "ds_meta_authors": [
        {
            "true_name": "吴海洋",
            "email": "245001024@csu.edu.cn",
            "work_for": "中南大学",
            "country": "中国"
        }
    ],
    "ds_managers": [
        {
            "true_name": "吴海洋",
            "email": "245001024@csu.edu.cn",
            "work_for": "中南大学",
            "country": "中国"
        }
    ],
    "category": "生态"
}