usgs-data-download

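Retrieves real-time and historical hydrologic data from the U.S. Geological Survey (USGS), including gage height and streamflow measurements at specified stations. Data can be downloaded at different time granularities and combined across multiple stations, for use in hydrologic analysis, flood warning, and water resources management.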

Quick Install

Run this command in a terminal to install this Skill into your Claude in one step:

npx skills add benchflow-ai/skillsbench --skill "usgs-data-download"

USGS Data Download Guide

Overview

This guide covers downloading water level (gage height) and streamflow data from USGS using the dataretrieval Python package. USGS maintains thousands of stream gages across the United States, which typically record water levels at 15-minute intervals.

Installation

pip install dataretrieval

The NWIS module is reliable and straightforward for accessing gage height data.

from dataretrieval import nwis

# Get instantaneous values (~15-min intervals); 00065 = gage height
df, meta = nwis.get_iv(
    sites='<station_id>',
    start='<start_date>',
    end='<end_date>',
    parameterCd='00065'
)

# Get daily values; 00060 = discharge
df, meta = nwis.get_dv(
    sites='<station_id>',
    start='<start_date>',
    end='<end_date>',
    parameterCd='00060'
)

# Get site information
info, meta = nwis.get_info(sites='<station_id>')

Parameter Codes

| Code  | Parameter   | Unit | Description             |
|-------|-------------|------|-------------------------|
| 00065 | Gage height | feet | Water level above datum |
| 00060 | Discharge   | cfs  | Streamflow volume       |

nwis Module Functions

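| Function                    | Description           | Data Frequency |
|-----------------------------|-----------------------|----------------|
| nwis.get_iv()               | Instantaneous values  | ~15 minutes    |
| nwis.get_dv()               | Daily values          | Daily          |
| nwis.get_info()             | Site information      | N/A            |
| nwis.get_stats()            | Statistical summaries | N/A            |
| nwis.get_discharge_peaks()  | Annual peak discharge | Annual         |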

Returned DataFrame Structure

The DataFrame has a datetime index and these columns:

| Column   | Description               |
|----------|---------------------------|
| site_no  | Station ID                |
| 00065    | Water level value         |
| 00065_cd | Quality code (can ignore) |

Downloading Multiple Stations

from dataretrieval import nwis

station_ids = ['<id_1>', '<id_2>', '<id_3>']
all_data = {}

for site_id in station_ids:
    try:
        df, meta = nwis.get_iv(
            sites=site_id,
            start='<start_date>',
            end='<end_date>',
            parameterCd='00065'
        )
        if len(df) > 0:
            all_data[site_id] = df
    except Exception as e:
        print(f"Failed to download {site_id}: {e}")

print(f"Successfully downloaded: {len(all_data)} stations")
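Once the loop finishes, the per-station frames in all_data can be merged into a single table for analysis. The sketch below shows one way to do that with pandas. It assumes each frame has a datetime index and a value column named after the parameter code, as described under Returned DataFrame Structure. combine_stations is a hypothetical helper, not part of dataretrieval, and the demo frames are synthetic stand-ins rather than real USGS data.

```python
import pandas as pd


def combine_stations(all_data, param="00065"):
    """Merge per-station DataFrames into one wide table, one column per station."""
    series = {}
    for site_id, df in all_data.items():
        # Keep the value column only; quality-code columns contain "_cd".
        value_cols = [c for c in df.columns if param in str(c) and "_cd" not in str(c)]
        if value_cols:
            series[site_id] = df[value_cols[0]]
    if not series:
        return pd.DataFrame()
    # Align on the union of timestamps; gaps at a station become NaN.
    return pd.concat(series, axis=1)


# Synthetic stand-in for the all_data dict built by the download loop.
idx = pd.date_range("2024-01-01", periods=3, freq="15min", tz="UTC")
demo = {
    "<id_1>": pd.DataFrame(
        {"site_no": "<id_1>", "00065": [1.0, 1.1, 1.2], "00065_cd": "P"}, index=idx
    ),
    "<id_2>": pd.DataFrame(
        {"site_no": "<id_2>", "00065": [2.0, 2.1], "00065_cd": "P"}, index=idx[:2]
    ),
}
combined = combine_stations(demo)
print(combined)
```

A wide layout keeps every station on a shared time axis, which makes cross-station comparison and plotting straightforward. Stations that report at different times will show NaN where they have no reading.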

Extracting the Value Column

# Find the gage height column (excludes quality code column)
gage_col = [c for c in df.columns if '00065' in str(c) and '_cd' not in str(c)]

if gage_col:
    water_levels = df[gage_col[0]]
    print(water_levels.head())
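When daily values are needed but only instantaneous readings are available, the extracted series can be aggregated with pandas. The following is a minimal sketch, assuming water_levels is a numeric Series with a datetime index. daily_mean is a hypothetical helper, and the readings are synthetic.

```python
import pandas as pd


def daily_mean(values):
    """Resample a datetime-indexed Series to calendar-day means."""
    # Days with no observations come back as NaN, so drop them.
    return values.resample("D").mean().dropna()


# Synthetic 15-minute readings covering exactly two days (not real USGS data).
idx = pd.date_range("2024-01-01 00:00", periods=192, freq="15min")
water_levels = pd.Series([2.0] * 96 + [4.0] * 96, index=idx, name="00065")

daily = daily_mean(water_levels)
print(daily)
```

Day boundaries follow the time zone of the index. If the index is time-zone aware and local calendar days matter, converting it with tz_convert before resampling is one option.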

Common Issues

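| Issue                  | Cause                              | Solution                             |
|------------------------|------------------------------------|--------------------------------------|
| Empty DataFrame        | Station has no data for date range | Try different dates or use get_iv()  |
| get_dv() returns empty | No daily gage height data          | Use get_iv() and aggregate           |
| Connection error       | Network issue                      | Wrap in try/except, retry            |
| Rate limited           | Too many requests                  | Add delays between requests          |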
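The retry and delay advice in the table can be combined into a small wrapper. This is a sketch only: fetch_with_retry is a hypothetical helper, the retry count and delays are arbitrary assumptions, and the demo uses a simulated failure in place of a real network call.

```python
import time


def fetch_with_retry(fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); after a failure, wait base_delay * 2**attempt seconds and retry."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return fetch()
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            last_error = exc
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Demo with a simulated flaky call: fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


waits = []  # record the delays instead of actually sleeping
result = fetch_with_retry(flaky, retries=3, base_delay=0.5, sleep=waits.append)
print(result, waits)

# With dataretrieval, the callable would wrap the real request, for example:
# df, meta = fetch_with_retry(lambda: nwis.get_iv(
#     sites='<station_id>', start='<start_date>', end='<end_date>', parameterCd='00065'))
```

Doubling the wait after each failure spaces requests further apart when the service is struggling, which also helps with rate limiting.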

Best Practices

  • Always wrap API calls in try/except for failed downloads
  • Check len(df) > 0 before processing
  • Station IDs for stream gages are typically 8-digit strings, often with leading zeros (e.g., '04119000'); pass them as strings so the zeros are preserved
  • Use get_iv() for gage height, as daily data is often unavailable
  • Filter columns to exclude quality code columns (_cd)
  • Break up large requests into smaller time periods to avoid timeouts
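The last practice, splitting a long request into shorter periods, can be sketched with the standard library. split_date_range is a hypothetical helper, and the 30-day chunk size is an arbitrary assumption rather than a documented USGS limit.

```python
from datetime import date, timedelta


def split_date_range(start, end, max_days=30):
    """Yield (chunk_start, chunk_end) ISO date strings covering start..end inclusive."""
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    current = date.fromisoformat(start)
    last = date.fromisoformat(end)
    while current <= last:
        # Each chunk spans at most max_days calendar days, both ends included.
        chunk_end = min(current + timedelta(days=max_days - 1), last)
        yield current.isoformat(), chunk_end.isoformat()
        current = chunk_end + timedelta(days=1)


chunks = list(split_date_range("2024-01-01", "2024-03-15", max_days=30))
for chunk_start, chunk_end in chunks:
    print(chunk_start, chunk_end)
```

Each pair can be passed as start and end to nwis.get_iv(), and the resulting frames joined afterwards, for example with pandas.concat. The chunks do not overlap by date, but checking the combined index for duplicates is still a reasonable precaution.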