With the rapid development of big data technology, large-sample hydrological analysis of streamflow regimes is becoming feasible, allowing robust conclusions about hydrological processes to be drawn from a global perspective. However, a comprehensive global large-sample dataset for analyzing the components of the streamflow regime is still lacking. This article introduces a new time series dataset of global streamflow indices calculated from daily streamflow records after data quality control. The dataset contains 79 indices for 41,263 river sections worldwide, covering the seven main components of the flow regime (i.e., magnitude, frequency, duration, rate of change, timing, variability, and recession) at annual and multi-year scales. The dataset covers index values up to 2022; the time series span 1806 to 2022, with an average length of 36 years. Compared with existing global datasets, this dataset covers more sites and more indices, especially those that characterize the frequency, duration, rate of change, and recession of the flow regime. With this dataset, streamflow research can proceed without spending time processing raw streamflow records. This comprehensive dataset is intended as a resource for the hydrological community, supporting research such as studies of catchment hydrological behavior, prediction of flow regimes in data-scarce regions, and analysis of flow regime changes from a global perspective.
Field | Value |
---|---|
Collection period | 1806/01/01 - 2022/12/31 |
Collection location | Global |
Data size | 1.1 GiB |
Data format | Excel, PDF |
Coordinate system | |
The daily streamflow records used to build the global streamflow index time series dataset were collected from nine data sources: the Global Runoff Data Centre (GRDC), the United States Geological Survey (USGS) National Water Information System, the Canadian National Water Data Archive (HYDAT), the Brazilian National Water Agency (ANA), the Chilean Center for Climate and Resilience Research (CCCRR), the Arctic Great Rivers Observatory (ArcticGRO), the China Hydrological Yearbook (CHY), the India Water Resources Information System (WRIS), and the water data service of the Australian Bureau of Meteorology (BOM). Except for the China Hydrological Yearbook, these data sources are publicly available. Because access to the original flow records in the China Hydrological Yearbook is limited and the records are difficult to collect, only flow data from some typical river basins were collected, comprising 30 stations in seven major river basins in China. Among these data sources, USGS, HYDAT, ArcticGRO, and BOM provide quality flags with their records.
In the collected streamflow records, more than 6000 stations have records spanning the 32 years from 1990 to 2022 with a missing rate below 5%, and approximately 800 stations have records spanning the 102 years from 1920 to 2022 with a missing rate below 5%. As for the most recent recorded year, approximately 12000 sites have records ending in 2022, while approximately 17000 sites have no records after 2000. Figure 3c shows the number of sites with different rates of missing records from 1900 to 2022. All curves show similar trends: the number of observation stations increased gradually from 1900, peaked around 1978, fluctuated but remained relatively stable from 1979 to 2013, and decreased slightly from 2014 to 2022. In each year from 1900 to 2022, over 80% of the sites with records have no missing records. In addition, in each year from 1900 to 2022, more than 50% of the sites with record lengths exceeding 30 years have no missing records. From 1975 to 2018, approximately 15000 sites had no missing records in each year. The dataset consists of global streamflow indices calculated from these daily streamflow records after data quality control.
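The station counts above rest on a per-station missing rate within a fixed period. The sketch below shows one way such a rate could be computed; it is not the authors' procedure. It assumes that the missing rate is the fraction of calendar days in the period without a usable value, and the function names, the `records` layout (date mapped to flow value, `None` for missing), and the toy stations are all hypothetical.

```python
from datetime import date, timedelta


def missing_rate(records, start, end):
    """Fraction of days in [start, end] that have no usable flow value.

    `records` maps datetime.date -> flow value; None or an absent key
    counts as missing (an assumed data layout, not the dataset's format).
    """
    n_days = (end - start).days + 1
    if n_days <= 0:
        raise ValueError("end must not precede start")
    n_present = sum(
        1
        for offset in range(n_days)
        if records.get(start + timedelta(days=offset)) is not None
    )
    return 1.0 - n_present / n_days


def stations_meeting_threshold(stations, start, end, max_missing=0.05):
    """IDs of stations whose missing rate in the period is below max_missing."""
    return sorted(
        station_id
        for station_id, records in stations.items()
        if missing_rate(records, start, end) < max_missing
    )


# Toy example: station "A" is complete, station "B" lacks the last 10 of 20 days.
start, end = date(2001, 1, 1), date(2001, 1, 20)
full = {start + timedelta(days=i): 1.0 for i in range(20)}
gappy = {start + timedelta(days=i): 1.0 for i in range(10)}
stations = {"A": full, "B": gappy}
selected = stations_meeting_threshold(stations, start, end)
```

With the 5% threshold quoted above, only the complete station is selected in this toy case.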
For standardization, the original quality markers were converted into four markers when the streamflow records were collected: reliable, suspicious, unmarked, and missing. For databases without quality markers, available records are marked as unmarked and missing records are marked as missing. For records with poor quality or without markers, some studies use automatic detection methods to identify and remove unreasonable flow values, including runs of consecutive identical values and outliers.
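The marker conversion and the two kinds of automatic checks mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to build the dataset: the source flag codes (`"A"`, `"e"`), the minimum run length, and the outlier rule (deviation from the median greater than `k` times the median absolute deviation) are all hypothetical choices made for the example.

```python
from statistics import median

STANDARD_MARKERS = ("reliable", "suspicious", "unmarked", "missing")


def standardize_marker(value, source_flag, flag_map):
    """Map one record to one of the four standard markers.

    `flag_map` translates a source's own flag codes to standard markers;
    its contents are hypothetical and must be defined per data source.
    """
    if value is None:
        return "missing"
    if source_flag is None:
        return "unmarked"
    return flag_map.get(source_flag, "unmarked")


def constant_runs(values, min_run=10):
    """Inclusive (start, end) index pairs of runs of at least min_run
    consecutive identical, non-missing values."""
    runs = []
    i, n = 0, len(values)
    while i < n:
        j = i
        while j + 1 < n and values[i] is not None and values[j + 1] == values[i]:
            j += 1
        if values[i] is not None and j - i + 1 >= min_run:
            runs.append((i, j))
        i = j + 1
    return runs


def mad_outliers(values, k=6.0):
    """Indices of values farther than k * MAD from the median (assumed rule)."""
    present = [v for v in values if v is not None]
    if not present:
        return []
    med = median(present)
    mad = median([abs(v - med) for v in present])
    if mad == 0:
        return []
    return [
        i for i, v in enumerate(values)
        if v is not None and abs(v - med) > k * mad
    ]


# Toy daily flows: one spike, a short constant run, and one missing day.
flows = [5.0, 5.2, 4.9, 5.1, 500.0, 5.0, 3.0, 3.0, 3.0, 3.0, None]
hypothetical_map = {"A": "reliable", "e": "suspicious"}
markers = [standardize_marker(v, None, hypothetical_map) for v in flows]
runs = constant_runs(flows, min_run=4)
outliers = mad_outliers(flows)
```

In practice the thresholds would need tuning per station, since low-flow periods can legitimately produce repeated values and floods can produce large but real peaks.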
# | title | file size |
---|---|---|
1 | _ncdc_meta_.json | 9.6 KiB |
2 | data.rar | 1.1 GiB |
©Copyright 2005-. Northwest Institute of Eco-Environment and Resources, CAS.
Donggang West Road 320, Lanzhou, Gansu, China (730000)