{ "created": "2024-12-16 16:19:02", "updated": "2026-06-22 23:37:58", "id": "2714e3f1-dd9c-46fc-af78-0d42aec78af1", "version": 10, "ds_topic": null, "title_cn": "面向AI-Ready的路面状态标准化图像数据集（2023-2024年）", "title_en": "road surface conditions dataset（2023-2024）", "ds_abstract": "

鉴于国际上公开的标准化道路表面气象条件数据集极为稀缺，尤其在极端天气条件下的路面状况图像资源更为不足，本研究构建了一套冰雪灾害路面状况数据集，填补了这一领域的空白，为提升路面状况识别模型的性能和准确性提供了宝贵的基础资源。\n

\n

该数据集聚焦冰雪灾害条件下的路面状况，基于极端天气对交通管控影响的统计分析，将路面冰雪灾害主要划分为路面结冰、风吹雪和强降雪三种类型。数据来源包括公路摄像头、移动设备及网络资源，最终构建了涵盖六种典型路面类型的数据集：干燥路面、积雪路面、结冰路面、吹雪路面、融雪路面及湿滑路面。\n

\n

在数据处理阶段，为避免因数据增强操作引入的潜在相关性，进而影响模型性能评估的准确性与可靠性，本研究采取了相对谨慎的策略。首先，将原始数据集划分为训练集、验证集及测试集，确保各子集之间的独立性，随后针对每一个子集分别执行数据增强操作，尽可能减小子集间因增强步骤先后而产生的数据交叉影响。经过多种增强策略（如翻转、旋转、平移和高斯噪声添加）处理，数据集规模最终达到9000张。\n

\n

为进一步提升深度学习模型训练的效率和收敛速度，在相关模型进行训练时，应对数据集进行归一化处理，通常采用零均值和单位标准差的标准化方法。这里提供该数据集的均值及标准差值。该数据集在红、绿、蓝通道上的均值分别为[0.550, 0.565, 0.568]，标准差分别为[0.082, 0.082, 0.085]。

", "ds_source": "

数据来源包括公路摄像头、移动设备及网络资源等

", "ds_process_way": "

• 图像缩放：将图像调整为224×224像素，这是深度学习中常用的标准尺寸，能够在计算效率和模型性能之间实现良好的平衡。该尺寸广泛应用于基于ImageNet预训练的模型（如VGG和ResNet），并在实际应用中证明了其有效性。\n

\n

• 数据集划分：将数据集随机划分为训练集、验证集和测试集，比例分别为60%、20%和20%。\n

\n

• 亮度调整：由于路面状况复杂多变，容易出现物体遮挡和光照不均等问题，导致图像中可能存在过亮或过暗的区域，从而掩盖或模糊关键细节。此外，这些因素可能导致不同类型的路面在外观上变得相似，增加识别难度。为了解决这些问题，采用了一种基于二维伽马函数的自适应校正算法对图像的光照强度进行调整。\n

\n

• 数据增强：数据增强是解决数据集不平衡问题的重要步骤，尤其在某些类别的样本数量显著少于其他类别时。通过对现有样本进行翻转、旋转、裁剪、缩放和颜色调整等变换，生成额外的样本。本研究采用OpenCV和NumPy库进行数据增强，通过随机翻转、随机平移、随机旋转以及添加高斯噪声等方式，将图像数量增加到9000张。\n

\n

• 数据归一化：在相关模型进行训练时，像素值应被归一化为零均值和单位标准差，以加速模型的收敛过程。该数据集在红、绿、蓝通道上的均值分别为[0.550, 0.565, 0.568]，标准差分别为[0.082, 0.082, 0.085]。

", "ds_quality": "

在对数据集进行划分（训练集、验证集和测试集）之前进行数据增强，可能会在这些子集中引入潜在相关性，从而削弱验证集和测试集的独立性，影响模型性能评估的准确性和可靠性。为了解决这一问题，本研究先将数据集划分为三个独立的子集，然后对每个子集分别进行数据增强，尽可能减小子集间因增强步骤先后而产生的数据交叉影响。

", "ds_acq_start_time": "2023-10-01 00:00:00", "ds_acq_end_time": "2024-10-01 00:00:00", "ds_acq_place": "公路摄像头、移动设备及网络资源", "ds_acq_lon_east": null, "ds_acq_lat_south": null, "ds_acq_lon_west": null, "ds_acq_lat_north": null, "ds_acq_alt_low": null, "ds_acq_alt_high": null, "ds_share_type": "open-access", "ds_total_size": 301762591, "ds_files_count": 18005, "ds_format": ".jpg", "ds_space_res": "", "ds_time_res": "", "ds_coordinate": "无", "ds_projection": "", "ds_thumbnail": "885484dc-6a50-40fa-ad29-6e5e7594df1e.jpg", "ds_thumb_from": 0, "ds_ref_way": "", "paper_ref_way": "", "ds_ref_instruction": "", "ds_from_station": null, "organization_id": "952adb3f-3ede-4a94-942a-7de772f1bfc5", "ds_serv_man": "李红星", "ds_serv_phone": "0931-4967592", "ds_serv_mail": "lihongxing@lzb.ac.cn", "doi_value": "", "subject_codes": [], "quality_level": 3, "publish_time": "2024-12-17 10:38:13", "last_updated": "2026-05-28 09:19:34", "protected": false, "protected_to": null, "lang": "zh", "cstr": "11738.11.NCDC.DPRSC.DB6686.2024", "i18n": { "en": { "title": "road surface conditions dataset（2023-2024）", "ds_format": "", "ds_source": "

Data sources include highway cameras, mobile devices, and online resources.

", "ds_quality": "

Performing data augmentation before splitting the dataset into training, validation, and test sets may introduce correlations between these subsets, compromising the independence of the validation and test sets and reducing the accuracy and reliability of model performance evaluation. To address this issue, this study first divides the dataset into three independent subsets and then applies data augmentation to each subset separately, minimizing the cross-contamination between subsets that could arise from the order in which splitting and augmentation are performed.

", "ds_ref_way": "", "ds_abstract": "

Given the scarcity of publicly available, standardized road surface condition (RSC) datasets worldwide, and in particular of road surface images captured under extreme weather, this study presents a dataset of road surface conditions during snow and ice disasters. The dataset fills this gap and provides a foundational resource for improving the performance and accuracy of RSC recognition models.\r\n

\r\n

The dataset focuses on road surface conditions under snow and ice disasters. Based on statistical analyses of the impact of extreme weather on traffic control, road snow and ice hazards are grouped into three main types: road icing, blowing snow, and heavy snowfall. Images were collected from highway cameras, mobile devices, and online resources, yielding a dataset that covers six typical road surface types: dry, snowy, icy, snow-blown, melting snow, and slippery roads.\r\n

\r\n

In the data processing phase, a cautious approach was adopted to prevent data augmentation from introducing correlations that could affect the accuracy and reliability of model performance evaluation. First, the raw dataset was divided into training, validation, and test sets, ensuring independence between these subsets. Data augmentation operations, such as flipping, rotating, translating, and adding Gaussian noise, were then applied separately to each subset to minimize cross-contamination between subsets that could arise from the order in which splitting and augmentation are performed. After augmentation, the dataset was expanded to a total of 9,000 images.\r\n

\r\n

To further improve the training efficiency and convergence speed of deep learning models, normalization of the dataset is recommended. A standard approach is to apply zero-mean and unit standard deviation normalization. The mean and standard deviation values for the dataset in the red, green, and blue channels are as follows: mean = [0.550, 0.565, 0.568], standard deviation = [0.082, 0.082, 0.085].

", "ds_time_res": "", "ds_acq_place": "Highway cameras, mobile devices, and network resources", "ds_space_res": "", "ds_projection": "", "ds_process_way": "

• Image Resizing: The images are resized to 224×224 pixels, a standard dimension commonly used in deep learning that strikes a balance between computational efficiency and model performance. This size is widely employed in models pretrained on ImageNet, such as VGG and ResNet, and has been proven effective in practical applications.\r\n

\r\n

• Dataset Splitting: The dataset is randomly partitioned into training, validation, and test sets, with proportions of 60%, 20%, and 20%, respectively.\r\n

\r\n

• Brightness Adjustment: Due to the complex and dynamic nature of road conditions, issues such as object occlusion and uneven lighting often lead to overexposed or underexposed areas in the images, which can obscure or blur critical details. Furthermore, these factors may cause different types of road conditions to appear similar, increasing the difficulty of recognition. To address these challenges, an adaptive correction algorithm based on a 2D gamma function is applied to adjust the lighting intensity of the images.\r\n

\r\n

• Data Augmentation: Data augmentation is a crucial step in addressing class imbalance within the dataset, especially when certain categories have significantly fewer samples than others. By applying transformations such as flipping, rotating, cropping, scaling, and color adjustments, additional samples are generated. In this study, data augmentation is performed using the OpenCV and NumPy libraries. Techniques such as random flipping, random translation, random rotation, and the addition of Gaussian noise are employed, increasing the total number of images to 9,000.\r\n

\r\n

• Data Normalization: When training a model on this dataset, pixel values should be normalized to zero mean and unit standard deviation to accelerate convergence. The mean and standard deviation values for the dataset in the red, green, and blue channels are [0.550, 0.565, 0.568] and [0.082, 0.082, 0.085], respectively.

", "ds_ref_instruction": "" } }, "submit_center_id": "ncdc", "data_level": 0, "recommendation_value": 0, "license_type": "https://creativecommons.org/licenses/by/4.0/", "doi_reg_from": "reg_local", "cstr_reg_from": "reg_local", "doi_not_reg_reason": null, "cstr_not_reg_reason": null, "is_paper_in_submitting": false, "belong_to_nieer": false, "ds_topic_tags": [ "路面状态", "图像识别", "深度学习" ], "ds_subject_tags": [], "ds_class_tags": [], "ds_locus_tags": [ "公路摄像头、移动设备及网络资源" ], "ds_time_tags": [ 2024 ], "ds_contributors": [ { "true_name": "刘景琦", "email": "liujingqi@nieer.ac.cn", "work_for": "中国科学院西北生态环境资源研究院", "country": "中国" } ], "ds_meta_authors": [ { "true_name": "刘景琦", "email": "liujingqi@nieer.ac.cn", "work_for": "中国科学院西北生态环境资源研究院", "country": "中国" }, { "true_name": "张耀南", "email": "yaonan@lzb.ac.cn", "work_for": "中国科学院西北生态环境资源研究院", "country": "中国" }, { "true_name": "康建芳", "email": "kangjf@lzb.ac.cn", "work_for": "中国科学院西北生态环境资源研究院", "country": "中国" }, { "true_name": "刘杰", "email": "hfutliujie@163.com", "work_for": "新疆交通规划勘察设计研究院有限公司", "country": "中国" }, { "true_name": "王斌", "email": "wangbin13245@126.com", "work_for": "新疆交通科学研究院有限责任公司科技创新研究院", "country": "中国" } ], "ds_managers": [ { "true_name": "李红星", "email": "lihongxing@lzb.ac.cn", "work_for": "中国科学院西北生态环境资源研究院", "country": "中国" } ], "category": "其他" }