API reference¶
Auto-generated from the AquaScope source. Every public function, class, and method appears below, with its docstring rendered in NumPy style.
If you're looking for a guided introduction, start with Getting started or Features. This page is the exhaustive reference.
High-level API¶
The most common entry points live in aquascope.api:
aquascope.api ¶
High-level convenience API for common hydrological analyses.
Provides one-liner functions wrapping AquaScope's lower-level modules. Designed for quick analyses in Jupyter notebooks and scripts.
Examples:
>>> from aquascope.api import flood_analysis, baseflow_analysis
>>> result = flood_analysis(daily_discharge, method="gev", return_periods=[10, 50, 100])
>>> bf = baseflow_analysis(daily_discharge, method="eckhardt")
flood_analysis ¶
flood_analysis(discharge: Series, method: str = 'gev', return_periods: list[int] | None = None, ci_level: float = 0.9, regional_skew: float | None = None, **kwargs) -> FloodFreqResult
Fit a flood-frequency distribution and estimate return-period quantiles.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `discharge` | `Series` | Daily (or sub-daily) discharge time-series with a :class: | required |
| `method` | `str` | Distribution to fit. One of | `'gev'` |
| `return_periods` | `list[int] \| None` | List of return periods (years) for which to estimate quantiles. Defaults to | `None` |
| `ci_level` | `float` | Confidence level for bootstrap confidence intervals (GEV only). | `0.9` |
| `regional_skew` | `float \| None` | Optional regional skew coefficient (LP3 only). | `None` |
| `**kwargs` | | Forwarded to the underlying fitting function. | `{}` |

Returns:
| Type | Description |
|---|---|
| `FloodFreqResult` | Fitted distribution parameters, quantile estimates, and confidence intervals. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If method is not one of the supported methods. |
Source code in aquascope/api.py
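To make the idea of a return-period quantile concrete, here is a minimal standard-library sketch. It does not reproduce `flood_analysis`: it assumes a Gumbel distribution (the zero-shape special case of the GEV) fitted by the method of moments, and the function name `gumbel_return_levels` and the sample values are hypothetical.

```python
import math
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329


def gumbel_return_levels(annual_max, return_periods=(10, 50, 100)):
    """Method-of-moments Gumbel fit; returns {return period: discharge}.

    Illustrative only: AquaScope's own fitting routine may differ.
    """
    scale = stdev(annual_max) * math.sqrt(6) / math.pi
    loc = mean(annual_max) - EULER_GAMMA * scale
    # Gumbel quantile at non-exceedance probability 1 - 1/T.
    return {
        T: loc - scale * math.log(-math.log(1 - 1 / T))
        for T in return_periods
    }


# Hypothetical annual-maximum discharges (m³/s).
annual_max = [310.0, 455.0, 280.0, 520.0, 390.0, 610.0, 345.0, 430.0, 295.0, 480.0]
levels = gumbel_return_levels(annual_max)
```

Longer return periods map to larger discharges, so `levels[100]` exceeds `levels[10]`.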
baseflow_analysis ¶
Separate baseflow from quickflow using a digital filter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `discharge` | `Series` | Daily discharge time-series. | required |
| `method` | `str` | | `'lyne_hollick'` |
| `**kwargs` | | Forwarded to the filter function (e.g. | `{}` |

Returns:
| Type | Description |
|---|---|
| `BaseflowResult` | DataFrame of total / baseflow / quickflow plus BFI. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If method is not supported. |
Source code in aquascope/api.py
flow_duration ¶
Compute a flow-duration curve.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `discharge` | `Series` | Daily discharge time-series. | required |
| `**kwargs` | | Forwarded to :func: | `{}` |

Returns:
| Type | Description |
|---|---|
| `FDCResult` | Exceedance probabilities, sorted discharges, and percentile values. |
Source code in aquascope/api.py
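The relationship between sorted discharges and exceedance probabilities can be sketched without any dependencies. This is an illustration under stated assumptions, not the `flow_duration` implementation: it uses the Weibull plotting position `rank / (n + 1)`, plain lists instead of a pandas Series, and the hypothetical helper names `flow_duration_points` and `flow_exceeded`.

```python
def flow_duration_points(discharge):
    """Return (exceedance_probability, discharge) pairs, highest flow first."""
    ranked = sorted(discharge, reverse=True)
    n = len(ranked)
    # Weibull plotting position: rank 1 is the largest flow.
    return [(rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


def flow_exceeded(points, probability):
    """First flow on the curve whose exceedance probability reaches `probability`."""
    for p, q in points:
        if p >= probability:
            return q
    return points[-1][1]


# Hypothetical daily discharges (m³/s).
flows = [12.0, 3.5, 8.1, 1.2, 5.6, 20.4, 2.8, 4.4, 6.9]
fdc = flow_duration_points(flows)
q50 = flow_exceeded(fdc, 0.5)
```

A low-flow statistic such as Q95 would be read off the same curve with `flow_exceeded(fdc, 0.95)`, given a record long enough to resolve that probability.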
compute_all_signatures ¶
Compute a comprehensive set of hydrological signatures.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `discharge` | `Series` | Daily discharge time-series. | required |
| `**kwargs` | | Forwarded to :func: | `{}` |

Returns:
| Type | Description |
|---|---|
| `SignatureReport` | Dataclass containing ~20 hydrological signature values. |
Source code in aquascope/api.py
detect_changepoints ¶
Detect abrupt shifts in a time-series.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `series` | `ndarray \| Series` | One-dimensional numeric data. | required |
| `method` | `str` | Detection algorithm. One of | `'pelt'` |
| `**kwargs` | | Forwarded to the detection function. | `{}` |

Returns:
| Type | Description |
|---|---|
| `ChangePointResult` | Detected change-points, segment summaries, and test statistics. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If method is not supported. |
Source code in aquascope/api.py
fit_copula ¶
fit_copula(x: ndarray | Series, y: ndarray | Series, family: str = 'auto', **kwargs) -> CopulaResult
Fit a bivariate copula to paired observations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ndarray \| Series` | Paired data arrays of equal length. | required |
| `y` | `ndarray \| Series` | Paired data arrays of equal length. | required |
| `family` | `str` | Copula family: | `'auto'` |
| `**kwargs` | | Forwarded to :func: | `{}` |

Returns:
| Type | Description |
|---|---|
| `CopulaResult` | Fitted copula parameters, dependence measures, and AIC. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If family is not supported. |
Source code in aquascope/api.py
bayesian_regression ¶
bayesian_regression(X: ndarray | DataFrame, y: ndarray | Series, degree: int = 1, **kwargs) -> PosteriorResult
Fit a Bayesian linear or polynomial regression.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray \| DataFrame` | Feature matrix (degree=1) or 1-D predictor (degree>1). | required |
| `y` | `ndarray \| Series` | Response variable. | required |
| `degree` | `int` | Polynomial degree. | `1` |
| `**kwargs` | | Forwarded to the model constructor (e.g. | `{}` |

Returns:
| Type | Description |
|---|---|
| `PosteriorResult` | Posterior summaries, credible intervals, and diagnostics. |
Source code in aquascope/api.py
ensemble_forecast ¶
ensemble_forecast(models: list[tuple[str, object]], X_train: DataFrame, y_train: Series, X_test: DataFrame, method: str = 'stacking', **kwargs) -> np.ndarray
Train an ensemble of models and return predictions on X_test.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `models` | `list[tuple[str, object]]` | List of | required |
| `X_train` | `DataFrame` | Training features. | required |
| `y_train` | `Series` | Training target. | required |
| `X_test` | `DataFrame` | Test features. | required |
| `method` | `str` | Ensemble strategy: | `'stacking'` |
| `**kwargs` | | Forwarded to the ensemble constructor. | `{}` |

Returns:
| Type | Description |
|---|---|
| `ndarray` | Predicted values for X_test. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If method is not supported. |
Source code in aquascope/api.py
generate_report ¶
Create a pre-configured `aquascope.reporting.builder.ReportBuilder`.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `title` | `str` | Report title. | required |
| `**kwargs` | | Forwarded to :class: | `{}` |

Returns:
| Type | Description |
|---|---|
| `ReportBuilder` | A builder instance ready for method-chaining. |
Source code in aquascope/api.py
groundwater_analysis ¶
Run a groundwater analysis on a water-level time series.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `levels` | `Series` | Water-level measurements with :class: | required |
| `method` | `str` | Analysis type: | `'trend'` |
| `**kwargs` | | Forwarded to the underlying function. | `{}` |

Returns:
| Type | Description |
|---|---|
| `dict` | Result dataclass from the chosen analysis, accessed as a dict or the original dataclass depending on method. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If method is not supported. |
Source code in aquascope/api.py
climate_downscale ¶
climate_downscale(obs: Series, gcm_hist: Series, gcm_future: Series, method: str = 'quantile_mapping', **kwargs) -> pd.Series
Downscale a GCM projection using statistical bias correction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `obs` | `Series` | Observed station data. | required |
| `gcm_hist` | `Series` | GCM historical simulation (overlapping period with obs). | required |
| `gcm_future` | `Series` | GCM future projection to downscale. | required |
| `method` | `str` | Downscaling method: | `'quantile_mapping'` |
| `**kwargs` | | Forwarded to the underlying function. | `{}` |

Returns:
| Type | Description |
|---|---|
| `Series` | Bias-corrected future projection. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If method is not supported. |
Source code in aquascope/api.py
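The default `'quantile_mapping'` method can be pictured with a small empirical sketch. It is a simplified illustration, not AquaScope's implementation: it assumes a step-wise empirical CDF for the historical model run, linear interpolation between observed quantiles, and plain lists in place of pandas Series. The helper names `_quantile` and `quantile_map` are hypothetical.

```python
from bisect import bisect_right


def _quantile(sorted_values, p):
    """Linearly interpolated quantile of pre-sorted data for p in [0, 1]."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def quantile_map(obs, gcm_hist, gcm_future):
    """Map each future value to the observed value at the same historical quantile."""
    obs_sorted, hist_sorted = sorted(obs), sorted(gcm_hist)
    n_hist = len(hist_sorted)  # assumed to be at least 2
    corrected = []
    for value in gcm_future:
        # Position of the value within the historical model run, clamped to [0, 1].
        p = (bisect_right(hist_sorted, value) - 1) / (n_hist - 1)
        p = min(max(p, 0.0), 1.0)
        corrected.append(_quantile(obs_sorted, p))
    return corrected


# Hypothetical data: the model run is biased high by 2 units.
obs = [float(v) for v in range(1, 11)]
gcm_hist = [v + 2.0 for v in obs]
corrected = quantile_map(obs, gcm_hist, [7.0, 12.0, 1.0])
```

With a constant bias of 2, the value 7 maps back to roughly 5, while values outside the historical range are clamped to the observed minimum or maximum.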
climate_indices ¶
climate_indices(precip: Series | None = None, temperature: Series | None = None, pet: Series | None = None, index: str = 'cdd', **kwargs) -> object
Compute a climate index from meteorological data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `precip` | `Series \| None` | Precipitation series (required for | `None` |
| `temperature` | `Series \| None` | Maximum temperature series (required for | `None` |
| `pet` | `Series \| None` | Potential evapotranspiration (required for | `None` |
| `index` | `str` | Index to compute: | `'cdd'` |
| `**kwargs` | | Forwarded to the underlying function. | `{}` |

Returns:
| Type | Description |
|---|---|
| `object` | Result dataclass or value from the chosen index. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If index is not supported or required input is missing. |
Source code in aquascope/api.py
Data collectors¶
12 unified collectors. Every collector returns records in the same Pydantic schema.
aquascope.collectors ¶
Data collectors for Taiwan and global water data sources.
AquastatCollector ¶
Bases: BaseCollector
Collect country-level water data from FAO AQUASTAT.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
client
|
CachedHTTPClient | None
|
HTTP client instance. A default is created if None. |
None
|
References
FAO. (2023). AQUASTAT. https://www.fao.org/aquastat/
fetch_raw ¶
fetch_raw(country_code: str = 'all', variable_ids: list[int] | None = None, start_year: int = 2000, end_year: int = 2023, **kwargs: Any) -> list[dict]
Fetch raw AQUASTAT data from the FAOSTAT API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `country_code` | `str` | ISO3 country code, or | `'all'` |
| `variable_ids` | `list[int] \| None` | AQUASTAT variable IDs. Defaults to all key variables. | `None` |
| `start_year` | `int` | Start year (default 2000). | `2000` |
| `end_year` | `int` | End year (default 2023). | `2023` |

Returns:
| Type | Description |
|---|---|
| `list[dict]` | Raw API response records. |
normalise ¶
Convert raw FAOSTAT response into AquastatRecord objects.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
raw
|
list[dict]
|
Records from |
required |
Returns:
| Type | Description |
|---|---|
Sequence[AquastatRecord]
|
Normalised AQUASTAT records. |
BaseCollector ¶
Bases: ABC
Every collector must implement fetch_raw and normalise.
The public entry-point is collect() which chains those two steps.
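The contract described above (two abstract steps chained by `collect()`) can be sketched as follows. The classes `SketchCollector` and `DemoCollector` are hypothetical stand-ins written for this illustration; the real `BaseCollector` may take constructor arguments and return Pydantic models rather than plain dicts.

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchCollector(ABC):
    """Hypothetical stand-in for BaseCollector; not the AquaScope class."""

    @abstractmethod
    def fetch_raw(self, **kwargs: Any) -> list:
        """Return records exactly as the upstream source provides them."""

    @abstractmethod
    def normalise(self, raw: list) -> list:
        """Convert raw records into the shared schema."""

    def collect(self, **kwargs: Any) -> list:
        # Public entry point: fetch first, then normalise the result.
        return self.normalise(self.fetch_raw(**kwargs))


class DemoCollector(SketchCollector):
    """Toy subclass with an in-memory 'source' instead of an HTTP call."""

    def fetch_raw(self, **kwargs: Any) -> list:
        return [{"stn": "A1", "val": "3.2"}]

    def normalise(self, raw: list) -> list:
        return [{"station_id": r["stn"], "value": float(r["val"])} for r in raw]


records = DemoCollector().collect()
```

Keeping `fetch_raw` and `normalise` separate lets each step be tested on its own, for example by feeding recorded raw responses straight into `normalise`.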
CopernicusCollector ¶
Bases: BaseCollector
Fetch GloFAS river-discharge data from Copernicus CDS.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset` | `str` | CDS dataset ID. Default is the GloFAS historical dataset. | `None` |

Example
>>> collector = CopernicusCollector()
>>> records = collector.collect(
...     latitude=48.85, longitude=2.35,
...     year="2023", month=["01", "02", "03"],
... )
fetch_raw ¶
fetch_raw(*, latitude: float, longitude: float, year: str | list[str] = '2023', month: str | list[str] = '01', day: str | list[str] | None = None, variable: str = 'river_discharge_in_the_last_24_hours', product_type: str = 'consolidated', system_version: str = 'version_4_0') -> list[dict]
Fetch data via CDS API and return parsed records.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `latitude` | `float` | Site coordinates. | required |
| `longitude` | `float` | Site coordinates. | required |
| `year` | `str \| list[str]` | Temporal selection. | `'2023'` |
| `month` | `str \| list[str]` | Temporal selection. | `'01'` |
| `day` | `str \| list[str] \| None` | Temporal selection. | `None` |
| `variable` | `str` | CDS variable name. | `'river_discharge_in_the_last_24_hours'` |
| `product_type` | `str` | CDS-specific dataset options. | `'consolidated'` |
| `system_version` | `str` | CDS-specific dataset options. | `'version_4_0'` |
normalise ¶
Convert parsed GRIB records into WaterQualitySample objects.
EUWFDCollector ¶
Bases: BaseCollector
Collect water quality data from the EEA WISE SoE Waterbase.
Uses the DiscoData SQL API published by the European Environment Agency.
fetch_raw ¶
fetch_raw(country: str | None = None, water_body_type: str = 'river', year: int | None = None, **kwargs) -> list[dict]
Fetch raw water quality records from the EEA DiscoData API.
Parameters:
country: ISO-2 country code (e.g. "DE", "FR").
water_body_type: "river", "lake", or "groundwater".
year: Calendar year to filter on.
Returns: List of raw record dicts from the API response.
normalise ¶
Convert raw EEA records into unified WaterQualitySample objects.
Parameters:
raw: List of dicts from fetch_raw.
Returns:
Sequence of WaterQualitySample instances.
GEMStatCollector ¶
Bases: BaseCollector
Collect water quality data from the GEMStat Zenodo archive.
Downloads the CSV file, parses rows, and normalises into WaterQualitySample records. Supports filtering by country.
fetch_raw ¶
fetch_raw(country: str | None = None, max_records: int = 5000, parameters: list[str] | None = None, start_date: str | None = None, end_date: str | None = None, **kwargs) -> list[dict]
Download the GEMStat Zenodo archive (once), then join station metadata with observation rows and return filtered results.
The ZIP (~200 MB) is cached to data/cache/ on first call;
subsequent calls load from the local file and are fast.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
country
|
str
|
Full or partial country name (e.g. |
None
|
max_records
|
int
|
Hard cap on returned rows across all parameters (default 5 000). |
5000
|
parameters
|
list[str]
|
Parameter CSV names without |
None
|
start_date
|
str
|
ISO date |
None
|
end_date
|
str
|
ISO date |
None
|
parse_gemstat_csv staticmethod ¶
Parse a GEMStat CSV string into WaterQualitySample records.
Expected columns: GEMS Station Number, Sample Date, Parameter, Analysis Result, Unit, Latitude, Longitude, Country Code, etc.
JapanMLITCollector ¶
Bases: BaseCollector
Collect water data from Japan MLIT Water Information System.
Supports water-level, discharge, water-quality, and rainfall
observations. Results are normalised to WaterQualitySample
records with source = DataSource.JAPAN_MLIT.
fetch_raw ¶
fetch_raw(station_id: str | None = None, prefecture: str | None = None, parameter: str = 'water_level', start_date: str | None = None, end_date: str | None = None, **kwargs: Any) -> list[dict]
Fetch raw observation data from the MLIT Water Information System.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
station_id
|
str | None
|
MLIT station code (e.g. |
None
|
prefecture
|
str | None
|
Prefecture name in English (e.g. |
None
|
parameter
|
str
|
One of |
'water_level'
|
start_date
|
str | None
|
Start date in ISO format ( |
None
|
end_date
|
str | None
|
End date in ISO format ( |
None
|
**kwargs
|
Any
|
Additional keyword arguments forwarded to the HTTP request. |
{}
|
Returns:
| Type | Description |
|---|---|
list[dict]
|
Raw observation records. Returns an empty list when the upstream API is unreachable or no data matches. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If parameter is not one of the supported types. |
normalise ¶
Normalise raw MLIT data into WaterQualitySample records.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
raw
|
list[dict]
|
Raw records as returned by |
required |
Returns:
| Type | Description |
|---|---|
list[WaterQualitySample]
|
Unified water-quality sample records. |
KoreaWAMISCollector ¶
Bases: BaseCollector
Collect water data from Korea WAMIS Open API.
Supports water-level, discharge, water-quality, and dam-storage
observations. Results are normalised to WaterQualitySample
records with source = DataSource.KOREA_WAMIS.
fetch_raw ¶
fetch_raw(station_id: str | None = None, basin: str | None = None, parameter: str = 'water_level', start_date: str | None = None, end_date: str | None = None, **kwargs: Any) -> list[dict]
Fetch raw observation data from the WAMIS Open API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
station_id
|
str | None
|
WAMIS station code. |
None
|
basin
|
str | None
|
Basin name in English (e.g. |
None
|
parameter
|
str
|
One of |
'water_level'
|
start_date
|
str | None
|
Start date in ISO format ( |
None
|
end_date
|
str | None
|
End date in ISO format ( |
None
|
**kwargs
|
Any
|
Additional keyword arguments forwarded to the HTTP request. |
{}
|
Returns:
| Type | Description |
|---|---|
list[dict]
|
Raw observation records. Returns an empty list when the upstream API is unreachable or no data matches. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If parameter is not one of the supported types. |
normalise ¶
Normalise raw WAMIS data into WaterQualitySample records.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
raw
|
list[dict]
|
Raw records as returned by |
required |
Returns:
| Type | Description |
|---|---|
list[WaterQualitySample]
|
Unified water-quality sample records. |
OpenMeteoCollector ¶
Bases: BaseCollector
Fetch weather, climate reanalysis, and river-discharge data from Open-Meteo.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `mode` | `str` | | `'weather'` |

Example
>>> collector = OpenMeteoCollector(mode="weather")
>>> records = collector.collect(
...     latitude=25.03, longitude=121.57,
...     start_date="2023-01-01", end_date="2023-12-31",
...     daily=["temperature_2m_mean", "precipitation_sum"],
... )
fetch_raw ¶
fetch_raw(*, latitude: float, longitude: float, start_date: str | None = None, end_date: str | None = None, daily: list[str] | None = None, hourly: list[str] | None = None, forecast_days: int = 7) -> dict
Call the Open-Meteo API and return the raw JSON response.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
latitude
|
float
|
Site coordinates. |
required |
longitude
|
float
|
Site coordinates. |
required |
start_date
|
str | None
|
ISO-8601 date strings (required for archive mode). |
None
|
end_date
|
str | None
|
ISO-8601 date strings (required for archive mode). |
None
|
daily
|
list[str] | None
|
Daily variables to request (e.g. |
None
|
hourly
|
list[str] | None
|
Hourly variables (e.g. |
None
|
forecast_days
|
int
|
Number of forecast days (only for |
7
|
normalise ¶
Convert Open-Meteo JSON into unified WaterQualitySample records.
Weather / climate variables are stored as WaterQualitySample with
parameter set to the variable name (e.g. precipitation_sum).
SDG6Collector ¶
Bases: BaseCollector
Collect SDG 6 indicator data per country/year from the UN-Stats API.
fetch_raw ¶
fetch_raw(indicator_codes: list[str] | None = None, country_codes: str | None = None, page_size: int = 200, **kwargs) -> list[dict]
Fetch SDG 6 indicator records.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `indicator_codes` | `list[str] \| None` | e.g. ["6.4.2", "6.4.1"]. Defaults to all 9 SDG 6 indicators. | `None` |
| `country_codes` | `str \| None` | Comma-separated ISO3 or M49 numeric codes, e.g. "USA,DEU,IND". Omit to return data for all countries. | `None` |
| `page_size` | `int` | Records per API page (max 5000 per UN docs). | `200` |
TaiwanCivilIoTCollector ¶
Bases: BaseCollector
Collect real-time water resource data from Taiwan's Civil IoT SensorThings API.
The entity parameter is consumed by fetch_raw, not by __init__; see that method's docstring for valid values.
fetch_raw ¶
fetch_raw(entity: str = 'Datastreams', top: int = 100, expand: str | None = None, start_date: str | None = None, end_date: str | None = None, **kwargs) -> list[dict]
Fetch SensorThings entities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
entity
|
str
|
|
'Datastreams'
|
top
|
int
|
Max items per page. |
100
|
expand
|
str
|
OData $expand clause. If not given, one is built from
|
None
|
start_date
|
str
|
ISO date string |
None
|
end_date
|
str
|
ISO date string |
None
|
normalise ¶
Normalise SensorThings Datastreams with their latest Observation.
TaiwanDataGovCollector ¶
TaiwanDataGovCollector(dataset_id: str = DATASET_WATER_LEVEL, client: CachedHTTPClient | None = None)
Bases: BaseCollector
Collect real-time water level data from Taiwan's open government data platform (data.gov.tw).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dataset_id
|
str
|
Dataset identifier. Use |
DATASET_WATER_LEVEL
|
fetch_raw ¶
Page through the data.gov.tw API for the configured dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
limit
|
int
|
Records per page (max 1000). |
1000
|
offset
|
int
|
Starting record offset. |
0
|
TaiwanMOENVCollector ¶
TaiwanMOENVCollector(api_key: str = '', dataset_id: str = RIVER_WQ_DATASET, client: CachedHTTPClient | None = None)
Bases: BaseCollector
Collect river water-quality monitoring data from Taiwan MOENV.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
api_key
|
str
|
Free key obtained at https://data.moenv.gov.tw/en/apikey |
''
|
dataset_id
|
str
|
Dataset identifier (default: river water quality |
RIVER_WQ_DATASET
|
fetch_raw ¶
Page through MOENV open-data endpoint and return raw records.
TaiwanWRAReservoirCollector ¶
TaiwanWRAWaterLevelCollector ¶
Bases: BaseCollector
Collect real-time water-level readings from WRA river stations. Updated every 10 minutes at source.
TaiwanWRAFhyCollector ¶
Bases: BaseCollector
Collect real-time hydrological data from the WRA 防災資訊網 (Fhy) API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_type
|
str
|
One of |
'water'
|
TaiwanWRAIoTCollector ¶
Bases: BaseCollector
Collect real-time hydrological data from the WRA IoT open-data API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
data_type
|
str
|
One of |
'groundwater'
|
fetch_raw ¶
Fetch real-time data for the configured data type.
Tries each candidate path in order with an Accept: application/json
header. CachedHTTPClient.get_json strips any BOM / leading
whitespace and checks Content-Type before parsing, so non-JSON bodies
surface as ValueError rather than an opaque JSONDecodeError.
On failure the first 200 chars of the response body are logged to help diagnose whether the server returned HTML, XML, or malformed JSON.
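The parsing behaviour described above can be sketched with the standard library. `parse_json_body` is a hypothetical helper written for this illustration; it is not `CachedHTTPClient.get_json`, and it puts the first 200 characters of the body in the exception message, whereas the text above says they are logged.

```python
import json


def parse_json_body(body: bytes, content_type: str) -> object:
    """Hypothetical helper: validate and parse a response body as JSON."""
    if "json" not in content_type.lower():
        # Non-JSON responses (HTML error pages, XML) are rejected up front.
        raise ValueError(
            f"expected JSON, got Content-Type {content_type!r}: {body[:200]!r}"
        )
    # 'utf-8-sig' drops a leading byte-order mark; lstrip removes whitespace.
    text = body.decode("utf-8-sig").lstrip()
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {text[:200]!r}") from exc


parsed = parse_json_body(b'\xef\xbb\xbf  {"a": 1}', "application/json; charset=utf-8")
```

Checking the header before parsing gives a clearer error message when a server answers with an HTML error page and a 200 status.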
USGSCollector ¶
Bases: BaseCollector
Collect daily-value water data from USGS via OGC API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
api_key
|
str
|
Optional USGS API key for higher rate limits (get one at https://api.waterdata.usgs.gov/docs/ogcapi/#api-keys). |
'DEMO_KEY'
|
fetch_raw ¶
fetch_raw(collection: str = 'daily', datetime_range: str | None = None, days: int | None = None, limit: int = 10000, bbox: str | None = None, max_items: int | None = 2000, **kwargs) -> list[dict]
Fetch features from a USGS OGC collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `collection` | `str` | | `'daily'` |
| `datetime_range` | `str` | Explicit ISO 8601 interval | `None` |
| `days` | `int` | Last N days from now (UTC). Defaults to 30 when | `None` |
| `limit` | `int` | Max features per page. Larger values mean fewer round-trips. | `10000` |
| `bbox` | `str` | Bounding box filter | `None` |
| `max_items` | `int` | Hard cap on total records fetched (across all pages). Keeps response times predictable. | `2000` |
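How `limit` (page size) and `max_items` (hard cap across pages) interact can be sketched with a generic pagination loop. This is an assumption-laden illustration, not the collector's code: `fetch_capped` and `fake_page` are hypothetical, and the real USGS OGC API may page through "next" links rather than numeric offsets.

```python
def fetch_capped(fetch_page, limit=10000, max_items=2000):
    """Collect pages until the source is exhausted or max_items is reached."""
    records = []
    offset = 0
    while max_items is None or len(records) < max_items:
        page = fetch_page(offset, limit)
        if not page:
            break
        records.extend(page)
        offset += len(page)
        if len(page) < limit:
            break  # short page: nothing left upstream
    # Trim any overshoot from the final page.
    return records if max_items is None else records[:max_items]


def fake_page(offset, limit):
    """Stand-in for an HTTP request against a 5000-record collection."""
    total = 5000
    return [{"id": i} for i in range(offset, min(offset + limit, total))]


capped = fetch_capped(fake_page, limit=1500, max_items=2000)
everything = fetch_capped(fake_page, limit=1500, max_items=None)
```

In this sketch the cap stops the loop after two requests, whereas the uncapped call needs four.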
WaPORCollector ¶
Bases: BaseCollector
Collect satellite-based ET data from FAO WaPOR v3.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
client
|
CachedHTTPClient | None
|
HTTP client instance. A default is created if None. |
None
|
References
FAO. (2024). WaPOR v3. https://www.fao.org/in-action/remote-sensing-for-water-productivity/
fetch_raw ¶
fetch_raw(bbox: tuple[float, float, float, float] | None = None, start_date: str = '2020-01-01', end_date: str = '2020-12-31', variable: str = 'RET', **kwargs: Any) -> list[dict]
Fetch raw WaPOR raster catalogue/summary data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
bbox
|
tuple[float, float, float, float] | None
|
Bounding box as |
None
|
start_date
|
str
|
ISO date string for start of period. |
'2020-01-01'
|
end_date
|
str
|
ISO date string for end of period. |
'2020-12-31'
|
variable
|
str
|
WaPOR variable code ( |
'RET'
|
Returns:
| Type | Description |
|---|---|
list[dict]
|
Raw API response records. |
normalise ¶
Convert raw WaPOR response into WaPORObservation records.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
raw
|
list[dict]
|
Records from |
required |
Returns:
| Type | Description |
|---|---|
Sequence[WaPORObservation]
|
Normalised WaPOR observations. |
WQPCollector ¶
Bases: BaseCollector
Collect discrete water quality data from the US Water Quality Portal.
Supports filtering by state, county, characteristic (parameter), date range, and bounding box.
fetch_raw ¶
fetch_raw(state_code: str | None = None, characteristic_name: str | None = None, start_date: str | None = None, end_date: str | None = None, bbox: str | None = None, max_results: int = 1000, **kwargs) -> list[dict]
Fetch water quality results from WQP.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
state_code
|
str | None
|
e.g. |
None
|
characteristic_name
|
str | None
|
e.g. |
None
|
start_date
|
str | None
|
|
None
|
end_date
|
str | None
|
|
None
|
bbox
|
str | None
|
Bounding box: |
None
|
max_results
|
int
|
Limit number of results (WQP default returns CSV). |
1000
|
AI engine: methodology recommender¶
aquascope.ai_engine ¶
AI-powered research methodology recommendation engine.
AgentResult dataclass ¶
AgentResult(challenge_spec: ChallengeSpec, model_used: str = '', forecast: DataFrame | None = None, risk_assessment: dict | None = None, status: dict | None = None, anomalies: DataFrame | None = None, explanation: str = '', steps: list[str] = list())
Result of an end-to-end agent run.
HydroAgent ¶
Orchestrator that parses a natural-language query and executes the corresponding challenge.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
default_model
|
str | None
|
Override the auto-recommended model. If |
None
|
Example
>>> agent = HydroAgent()
>>> result = agent.solve("Drought monitoring near Sahel at lat 15, lon 0")
>>> print(result.explanation)
solve ¶
solve(query: str, data: DataFrame | None = None, extra_data: dict[str, DataFrame] | None = None) -> AgentResult
Parse query, load data, pick model, and run the challenge.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
query
|
str
|
Natural-language challenge description. |
required |
data
|
DataFrame | None
|
Primary data (discharge / precipitation). If |
None
|
extra_data
|
dict[str, DataFrame] | None
|
Additional named DataFrames (e.g. |
None
|
Returns:
| Type | Description |
|---|---|
AgentResult
|
|
ResearchMethodology dataclass ¶
ResearchMethodology(id: str, name: str, category: str, description: str, applicable_parameters: list[str] = list(), data_requirements: list[str] = list(), typical_scale: str = '', complexity: str = '', references: list[str] = list(), tags: list[str] = list())
Describes a single research methodology applicable to water studies.
ModelRecommendation dataclass ¶
A single model recommendation.
ModelRecommender ¶
Recommend predictive models for a given challenge type and task.
Example
>>> rec = ModelRecommender()
>>> picks = rec.recommend("flood", "forecast")
>>> picks[0].model_id
'prophet'
recommend ¶
recommend(challenge_type: str, task_type: str = 'forecast', top_k: int = 3) -> list[ModelRecommendation]
Return ranked model recommendations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
challenge_type
|
str
|
One of |
required |
task_type
|
str
|
Task within the challenge: |
'forecast'
|
top_k
|
int
|
Maximum number of recommendations to return. |
3
|
Returns:
| Type | Description |
|---|---|
list[ModelRecommendation]
|
|
ChallengePlanner ¶
Parse natural-language descriptions into structured challenge specs.
Example
>>> planner = ChallengePlanner()
>>> spec = planner.parse("Forecast flooding on the Niger River at lat 13.5, lon 2.1")
>>> spec.challenge_type
'flood'
parse ¶
Parse a natural-language query into a ChallengeSpec.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
query
|
str
|
Free-text description of the challenge. |
required |
Returns:
| Type | Description |
|---|---|
ChallengeSpec
|
Structured challenge with type, variables, and location. |
ChallengeSpec dataclass ¶
ChallengeSpec(challenge_type: str = 'unknown', variables: list[str] = list(), latitude: float | None = None, longitude: float | None = None, location_name: str | None = None, forecast_days: int = 7, raw_query: str = '', confidence: float = 0.0)
Structured representation of a parsed challenge request.
DatasetProfile dataclass ¶
DatasetProfile(parameters: list[str] = list(), n_records: int = 0, n_stations: int = 0, time_span_years: float = 0.0, geographic_scope: str = '', data_sources: list[str] = list(), research_goal: str = '', keywords: list[str] = list())
Summary of a collected dataset, used as input to the recommender.
Recommendation dataclass ¶
A single methodology recommendation with a relevance score.
get_methodology ¶
Look up a methodology by ID.
search_methodologies ¶
search_methodologies(parameters: list[str] | None = None, category: str | None = None, tags: list[str] | None = None, scale: str | None = None) -> list[ResearchMethodology]
Filter the catalogue by parameters, category, tags, or scale.
recommend ¶
Return the top-k methodology recommendations for the given dataset profile.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
profile
|
DatasetProfile
|
|
required |
top_k
|
int
|
|
5
|
min_score
|
float
|
Minimum relevance score to include. |
20.0
|
Returns:
| Type | Description |
|---|---|
| `list[Recommendation]` | Recommendations sorted by descending score. |
recommend_with_llm ¶
recommend_with_llm(profile: DatasetProfile, top_k: int = 5, model: str = 'gpt-4o-mini', api_key: str | None = None, base_url: str | None = None, timeout: float = 120.0) -> list[Recommendation]
Use an LLM to provide more nuanced methodology recommendations.
Falls back to the rule-based recommender if the LLM call fails.
Supported providers (detected from base_url):
- OpenAI (base_url=None or api.openai.com)
- HuggingFace Inference API (api-inference.huggingface.co), free tier
- Groq (api.groq.com), free tier
- Ollama local (localhost), which uses the native /api/chat endpoint to avoid openai-client quirks
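Provider detection and the rule-based fallback might look roughly like the sketch below. Both helpers are hypothetical: the host names come from the list above, but the matching rules, the `"unknown"` label for unrecognised hosts, and the broad `except Exception` are assumptions rather than AquaScope's actual behaviour.

```python
from urllib.parse import urlparse


def detect_provider(base_url=None):
    """Guess the LLM provider from base_url (hypothetical matching rules)."""
    if base_url is None:
        return "openai"
    host = urlparse(base_url).hostname or ""
    if host == "api.openai.com":
        return "openai"
    if host == "api-inference.huggingface.co":
        return "huggingface"
    if host == "api.groq.com":
        return "groq"
    if host == "localhost":
        return "ollama"
    return "unknown"  # assumption: the source does not say what happens here


def recommend_with_fallback(llm_call, rule_based, profile):
    """Try the LLM path first; use the rule-based path if it raises."""
    try:
        return llm_call(profile)
    except Exception:
        return rule_based(profile)


def failing_llm(profile):
    raise TimeoutError("LLM endpoint did not respond")


result = recommend_with_fallback(failing_llm, lambda p: ["rule-based pick"], {})
```

Here `result` holds the rule-based output because the simulated LLM call times out.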
Hydrological analysis¶
Flood-frequency analysis, baseflow separation, rating curves, signatures.
aquascope.hydrology ¶
AquaScope hydrology module.
Standard hydrological analysis tools:
- Flow duration curves and low-flow statistics (Q95, 7Q10, 30Q5)
- Baseflow separation (Lyne–Hollick, Eckhardt digital filters)
- Recession analysis (segment identification + MRC fitting)
- Flood frequency analysis (GEV, Log-Pearson Type III)
- Stage-discharge rating curves (power-law fit, segmented curves, shift detection)
Quick start::
from aquascope.hydrology import flow_duration_curve, lyne_hollick, recession_analysis, fit_gev
fdc = flow_duration_curve(discharge_series)
print(f"Q95 = {fdc.percentiles[95]:.2f} m³/s")
bf = lyne_hollick(discharge_series)
print(f"BFI = {bf.bfi:.2f}")
rec = recession_analysis(discharge_series)
print(f"Recession constant = {rec.recession_constant:.1f} days")
ffa = fit_gev(discharge_series)
print(f"100-year flood = {ffa.return_periods[100]:.1f} m³/s")
BaseflowResult dataclass ¶
Result of baseflow separation.
Attributes:
| Name | Type | Description |
|---|---|---|
| `df` | `DataFrame` | DataFrame with |
| `bfi` | `float` | Baseflow Index: ratio of total baseflow to total discharge. |
| `method` | `str` | Name of the filter used. |
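To show how a digital filter produces the baseflow series and the BFI, here is a single-pass sketch of the Lyne–Hollick filter on a plain list. It is not the library's `lyne_hollick`: the filter parameter `alpha=0.925` is a commonly used value rather than one documented on this page, multi-pass filtering is omitted, and the function name `lyne_hollick_sketch` and the sample data are hypothetical.

```python
def lyne_hollick_sketch(discharge, alpha=0.925):
    """Single forward pass of the Lyne-Hollick filter; returns (baseflow, bfi)."""
    quickflow = 0.0  # assumption: the first value is treated as pure baseflow
    baseflow = [discharge[0]]
    for prev, curr in zip(discharge, discharge[1:]):
        quickflow = alpha * quickflow + 0.5 * (1 + alpha) * (curr - prev)
        # Keep quickflow within [0, Q] so baseflow also stays within [0, Q].
        quickflow = min(max(quickflow, 0.0), curr)
        baseflow.append(curr - quickflow)
    # BFI: total baseflow divided by total discharge.
    bfi = sum(baseflow) / sum(discharge)
    return baseflow, bfi


# Hypothetical daily discharges (m³/s) around a single storm event.
q = [5.0, 5.0, 20.0, 40.0, 25.0, 15.0, 10.0, 8.0, 6.0, 5.0]
baseflow, bfi = lyne_hollick_sketch(q)
```

The baseflow series rises and falls more slowly than total discharge, and the resulting BFI lies between 0 and 1.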
EMAResult dataclass ¶
EMAResult(return_periods: dict[int, float] = dict(), distribution: str = '', params: tuple = (), annual_max: Series | None = None, confidence_intervals: dict[int, tuple[float, float]] = dict(), n_censored: int = 0, n_observed: int = 0, weighted_skew: float | None = None, low_outlier_threshold: float | None = None)
Bases: FloodFreqResult
Result of Expected Moments Algorithm flood frequency analysis.
Extends FloodFreqResult with censoring and EMA-specific fields.
Attributes:
n_censored: Number of censored (zero-flow / low-outlier) observations.
n_observed: Number of non-censored observations.
weighted_skew: Weighted skew coefficient (station + regional). None when no regional skew was supplied.
low_outlier_threshold: MGB low-outlier threshold (real-space). None when no outliers were detected.
FloodFreqResult dataclass ¶
FloodFreqResult(return_periods: dict[int, float] = dict(), distribution: str = '', params: tuple = (), annual_max: Series | None = None, confidence_intervals: dict[int, tuple[float, float]] = dict())
Result of flood frequency analysis.
Attributes:
| Name | Type | Description |
|---|---|---|
return_periods |
dict[int, float]
|
Mapping of return period (years) → estimated discharge. |
distribution |
str
|
Name of the fitted distribution. |
params |
tuple
|
Distribution parameters (shape, loc, scale). |
annual_max |
Series | None
|
The annual maximum series used for fitting. |
confidence_intervals |
dict[int, tuple[float, float]]
|
Optional mapping of return period → (lower, upper) 90 % CI. |
GoodnessOfFitResult
dataclass
¶
GoodnessOfFitResult(statistic: float = 0.0, p_value: float = 1.0, test_name: str = '', distribution: str = '', reject_h0: bool = False)
Result of a goodness-of-fit test.
Attributes:
statistic: Test statistic value.
p_value: Associated p-value.
test_name: Name of the test.
distribution: Distribution that was tested.
reject_h0: True if H₀ (data follows distribution) is rejected at α = 0.05.
NonStationaryGEVResult
dataclass
¶
NonStationaryGEVResult(loc_intercept: float = 0.0, loc_trend: float = 0.0, scale: float = 1.0, shape: float = 0.0, return_levels: dict[float, ndarray] = dict(), years: ndarray = (lambda: np.array([]))(), aic: float = 0.0, bic: float = 0.0, trend_significant: bool = False)
Result of non-stationary GEV fit.
Attributes:
loc_intercept: Intercept of the linear location model.
loc_trend: Trend in the location parameter (per year).
scale: Scale parameter.
shape: Shape parameter (scipy sign convention).
return_levels: Mapping of return period → array of return levels over time.
years: The year values used for fitting.
aic: Akaike information criterion.
bic: Bayesian information criterion.
trend_significant: True if the trend p-value < 0.05 (likelihood-ratio test).
RegionalResult
dataclass
¶
RegionalResult(growth_curve: dict[float, float] = dict(), index_flood: dict[str, float] = dict(), regional_return_levels: dict[str, dict[float, float]] = dict(), discordancy: dict[str, float] = dict(), heterogeneity: float = 0.0)
Result of regional frequency analysis.
Attributes:
growth_curve: Mapping of return period → growth factor.
index_flood: Mapping of site id → index flood (mean annual max).
regional_return_levels: Mapping of site id → {return period → level}.
discordancy: Mapping of site id → discordancy statistic Dᵢ.
heterogeneity: Heterogeneity H statistic.
FDCResult
dataclass
¶
Result of a flow duration curve analysis.
Attributes:
| Name | Type | Description |
|---|---|---|
exceedance |
ndarray
|
Exceedance probability array (0–100 %). |
discharge |
ndarray
|
Sorted discharge values (descending). |
percentiles |
dict[float, float]
|
Mapping of exceedance % → discharge value (e.g. |
RatingCurveResult
dataclass
¶
RatingCurveResult(a: float, b: float, h0: float, r_squared: float, rmse: float, n_points: int, residuals: ndarray, stage_range: tuple[float, float], segments: list[RatingSegment] | None = None)
Result of rating curve fitting.
Attributes:
a: Power-law coefficient.
b: Power-law exponent.
h0: Stage offset (datum correction).
r_squared: Coefficient of determination.
rmse: Root mean squared error.
n_points: Number of stage-discharge pairs used.
residuals: Array of residuals (observed - predicted).
stage_range: Tuple of (min_stage, max_stage).
segments: Segment parameters for segmented curves, or None.
RatingSegment
dataclass
¶
A segment of a segmented rating curve.
Attributes:
stage_min: Lower bound of the segment stage range.
stage_max: Upper bound of the segment stage range.
a: Power-law coefficient.
b: Power-law exponent.
h0: Stage offset (datum correction).
r_squared: Coefficient of determination for the segment.
RecessionResult
dataclass
¶
RecessionResult(segments: list[RecessionSegment] = list(), recession_constant: float = 0.0, r_squared: float = 0.0, half_life_days: float = 0.0)
Result of recession analysis.
Attributes:
| Name | Type | Description |
|---|---|---|
segments |
list[RecessionSegment]
|
List of identified recession segments. |
recession_constant |
float
|
Fitted exponential decay constant k in Q(t) = Q₀·e^(−t/k). |
r_squared |
float
|
R² of the master recession curve fit. |
half_life_days |
float
|
Time in days for discharge to halve (k·ln(2)). |
RecessionSegment
dataclass
¶
A single recession segment.
Attributes:
| Name | Type | Description |
|---|---|---|
start |
Timestamp
|
Start timestamp. |
end |
Timestamp
|
End timestamp. |
discharge |
ndarray
|
Discharge values during the recession. |
SignatureReport
dataclass
¶
SignatureReport(mean_flow: float, median_flow: float, q5: float, q95: float, q5_q95_ratio: float, cv: float, iqr: float, high_flow_frequency: float, high_flow_duration: float, q_peak_mean: float, low_flow_frequency: float, low_flow_duration: float, baseflow_index: float, zero_flow_fraction: float, peak_month: int, seasonality_index: float, rising_limb_density: float, flashiness_index: float, mean_recession_constant: float, runoff_ratio: float | None, elasticity: float | None)
Complete set of hydrological signatures for a streamflow record.
Attributes are grouped by the aspect of the flow regime they describe.
Fields set to None require optional inputs (e.g. precipitation).
eckhardt ¶
Separate baseflow using the Eckhardt two-parameter digital filter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Daily discharge series with a DatetimeIndex. |
required |
alpha
|
float
|
Recession constant (typically 0.95–0.99). |
0.98
|
bfi_max
|
float
|
Maximum baseflow index, which depends on aquifer type: 0.80 for perennial streams with porous aquifers; 0.50 for ephemeral streams with porous aquifers; 0.25 for perennial streams with hard-rock aquifers. |
0.8
|
Returns:
| Type | Description |
|---|---|
BaseflowResult
|
Separated components and BFI. |
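As a rough illustration of the two-parameter recursion, the sketch below applies the standard Eckhardt filter equation to a plain list of discharge values. It is not the library's code: the function name `eckhardt_sketch` and the starting value (baseflow set to `bfi_max` times the first discharge) are assumptions made for the example.

```python
def eckhardt_sketch(q, alpha=0.98, bfi_max=0.8):
    """Return baseflow values for the discharge list ``q`` (hypothetical helper)."""
    if not q:
        return []
    # Assumption: initialise the recursion with bfi_max * first discharge.
    base = [bfi_max * q[0]]
    for t in range(1, len(q)):
        b = ((1.0 - bfi_max) * alpha * base[-1]
             + (1.0 - alpha) * bfi_max * q[t]) / (1.0 - alpha * bfi_max)
        # Baseflow can never exceed total discharge.
        base.append(min(b, q[t]))
    return base


q = [10.0, 30.0, 20.0, 15.0, 12.0, 11.0]
baseflow = eckhardt_sketch(q)
bfi = sum(baseflow) / sum(q)  # baseflow index of the sketch output
```

The resulting `bfi` is the ratio of total baseflow to total discharge, which is how the `bfi` attribute of `BaseflowResult` is described above.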
lyne_hollick ¶
Separate baseflow using the Lyne–Hollick recursive digital filter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Daily discharge series with a DatetimeIndex. |
required |
alpha
|
float
|
Filter parameter (0 < α < 1). Higher values yield less baseflow. Default 0.925 is the value recommended by Nathan & McMahon (1990). |
0.925
|
n_passes
|
int
|
Number of forward/backward filter passes (typically 3). |
3
|
Returns:
| Type | Description |
|---|---|
BaseflowResult
|
Separated components and BFI. |
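To show what `alpha` and `n_passes` control, here is a minimal sketch of the Lyne–Hollick recursion on a plain list. The helper names, the zero-quickflow starting condition, and the forward/backward/forward ordering of passes are assumptions for illustration; the library may handle these details differently.

```python
def lyne_hollick_pass(q, alpha=0.925):
    """One pass of the filter; returns baseflow for the sequence ``q``."""
    if not q:
        return []
    quick_prev = 0.0  # assumption: no quickflow at the first time step
    base = [q[0]]
    for t in range(1, len(q)):
        quick = alpha * quick_prev + 0.5 * (1.0 + alpha) * (q[t] - q[t - 1])
        quick = min(max(quick, 0.0), q[t])  # keep 0 <= quickflow <= discharge
        base.append(q[t] - quick)
        quick_prev = quick
    return base


def lyne_hollick_sketch(q, alpha=0.925, n_passes=3):
    """Alternate forward and backward passes, re-filtering the baseflow."""
    series = list(q)
    for i in range(n_passes):
        if i % 2 == 1:  # odd-numbered passes run backward in time
            series = lyne_hollick_pass(series[::-1], alpha)[::-1]
        else:
            series = lyne_hollick_pass(series, alpha)
    return series


q = [10.0, 30.0, 20.0, 15.0, 12.0, 11.0, 10.5]
one_pass = lyne_hollick_sketch(q, n_passes=1)
three_pass = lyne_hollick_sketch(q, n_passes=3)
```

Each pass can only lower the series it receives, so additional passes give a smoother and smaller baseflow estimate.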
anderson_darling_test ¶
Anderson-Darling goodness-of-fit test for a fitted distribution.
Parameters:
data: Observed sample.
distribution: Distribution name (e.g. "gev", "gumbel").
params: Distribution parameters as accepted by the scipy distribution.
Returns:
A :class:GoodnessOfFitResult.
coverage_probability ¶
coverage_probability(discharge: Series, distribution: str = 'gev', ci_level: float = 0.9, n_splits: int = 10, n_boot: int = 200) -> float
Estimate the coverage probability of confidence intervals.
Split data into n_splits folds, compute bootstrap CIs on each training set, and check what fraction of test observations fall within those CIs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Annual maximum discharge series. |
required |
distribution
|
str
|
Distribution key. |
'gev'
|
ci_level
|
float
|
Nominal confidence level (default 0.90). |
0.9
|
n_splits
|
int
|
Number of cross-validation folds (default 10). |
10
|
n_boot
|
int
|
Number of bootstrap samples per fold (default 200). |
200
|
Returns:
| Type | Description |
|---|---|
float
|
Observed coverage probability (0–1). |
cramer_von_mises_test ¶
Cramér–von Mises goodness-of-fit test.
Parameters:
data: Observed sample.
distribution: Distribution name.
params: Distribution parameters.
Returns:
A :class:GoodnessOfFitResult.
expected_moments_algorithm ¶
expected_moments_algorithm(annual_max: Series | ndarray, *, perception_thresholds: list[tuple[float, float]] | None = None, zero_threshold: float = 0.0, regional_skew: float | None = None, regional_skew_mse: float = 0.302, return_periods: list[int] | None = None) -> EMAResult
Expected Moments Algorithm (EMA) for LP3 flood frequency analysis.
Implements the EMA procedure of Cohn et al. (1997) for incorporating censored observations (zero-flow years, low outliers, historical floods) into the LP3 parameter estimation. This is the preferred method in USGS Bulletin 17C (England et al., 2018).
Parameters:
annual_max: Annual maximum series. May contain zero / negative values
which will be treated as censored.
perception_thresholds: Optional list of (lower, upper) pairs
defining the perception interval for each observation. When
None the algorithm automatically treats observations
≤ zero_threshold as left-censored and the remainder as
exactly observed.
zero_threshold: Values ≤ this are treated as censored (default 0.0).
regional_skew: Regional / generalised skew for weighted skew.
regional_skew_mse: MSE of the regional skew estimate.
return_periods: Return periods to estimate. Defaults to standard set.
Returns:
An :class:EMAResult with LP3 quantile estimates, confidence
intervals, and censoring metadata.
Raises: ValueError: If fewer than 5 total observations.
References: England, J. F. Jr., Cohn, T. A., Faber, B. A., Stedinger, J. R., Thomas, W. O. Jr., Veilleux, A. G., Kiang, J. E., & Mason, R. R. Jr. (2018). Guidelines for determining flood flow frequency — Bulletin 17C. USGS TM 4-B5. https://doi.org/10.3133/tm4B5
Cohn, T. A., Lane, W. L., & Baier, W. G. (1997). An algorithm for
computing moments-based flood quantile estimates when historical
flood information is available. Water Resources Research, 33(9),
2089-2096. https://doi.org/10.1029/96WR03706
fit_gev ¶
fit_gev(discharge: Series, *, return_periods: list[int] | None = None, ci_level: float = 0.9) -> FloodFreqResult
Fit a GEV distribution to the annual maximum series.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Daily discharge series with a DatetimeIndex. |
required |
return_periods
|
list[int] | None
|
Return periods in years to estimate. Defaults to
|
None
|
ci_level
|
float
|
Confidence level for bootstrap CIs (default 0.90). |
0.9
|
Returns:
| Type | Description |
|---|---|
FloodFreqResult
|
Quantile estimates for the requested return periods. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If fewer than 5 annual maxima are available. |
fit_gev_lmoments ¶
fit_gev_lmoments(annual_maxima: ndarray | Series, return_periods: list[float] | None = None) -> FloodFreqResult
Fit GEV distribution using L-moments method.
More robust than MLE for small samples (n < 50). The shape parameter k is estimated from L-skewness using the Hosking & Wallis (1997) approximation:
c = 2 / (3 + t3) − ln2 / ln3
k ≈ 7.8590 c + 2.9554 c²
Parameters:
annual_maxima: Array of annual maximum values.
return_periods: Return periods in years. Defaults to standard set.
Returns:
A :class:FloodFreqResult with fitted parameters and return levels.
Raises: ValueError: If fewer than 5 values are provided.
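The approximation for k can be turned into a full parameter set with the usual L-moment relations for the GEV. The sketch below assumes Hosking's parameterisation (shape k, location, scale) and takes the L-moments as inputs; the function names are hypothetical and this is not the library's implementation.

```python
import math

EULER_GAMMA = 0.5772156649015329


def gev_from_lmoments(l1, l2, t3):
    """Return (k, loc, scale) from L1, L2 and L-skewness t3."""
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c ** 2
    if abs(k) < 1e-9:  # Gumbel limit of the GEV
        scale = l2 / math.log(2.0)
        loc = l1 - EULER_GAMMA * scale
    else:
        g = math.gamma(1.0 + k)
        scale = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
        loc = l1 - scale * (1.0 - g) / k
    return k, loc, scale


def gev_quantile(k, loc, scale, return_period):
    """Return level for a return period in years (annual maxima)."""
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(k) < 1e-9:
        return loc - scale * math.log(y)
    return loc + scale * (1.0 - y ** k) / k


k, loc, scale = gev_from_lmoments(l1=100.0, l2=20.0, t3=0.2)
q10 = gev_quantile(k, loc, scale, 10)
q100 = gev_quantile(k, loc, scale, 100)
```

The input values here are made up for the example; in practice they would come from a routine such as `lmoments_from_sample`.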
fit_gpd ¶
fit_gpd(exceedances: ndarray | Series, threshold: float, return_periods: list[float] | None = None, total_observations: int | None = None) -> FloodFreqResult
Fit Generalised Pareto Distribution using Peaks-Over-Threshold method.
Parameters:
exceedances: Values above the threshold (already filtered).
threshold: The threshold used for POT selection.
return_periods: Return periods in years.
total_observations: Total number of observations used to compute
exceedance rate. If None, assumed equal to len(exceedances).
Returns:
A :class:FloodFreqResult with fitted parameters and return levels.
Raises: ValueError: If fewer than 10 exceedances are provided.
fit_gumbel ¶
fit_gumbel(annual_maxima: ndarray | Series, return_periods: list[float] | None = None) -> FloodFreqResult
Fit Gumbel (Type I) extreme value distribution.
Special case of GEV with shape=0. Uses scipy.stats.gumbel_r (MLE).
Gumbel CDF: F(x) = exp(-exp(-(x - loc) / scale))
Parameters:
annual_maxima: Array of annual maximum values.
return_periods: Return periods in years. Defaults to standard set.
Returns:
A :class:FloodFreqResult with fitted parameters and return levels.
Raises: ValueError: If fewer than 5 values are provided.
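Inverting the Gumbel CDF given above yields the return level for a T-year event: set F(x) = 1 − 1/T and solve for x. The sketch below does this with the standard library only; the parameter values are invented for the example and the function names are hypothetical.

```python
import math


def gumbel_cdf(x, loc, scale):
    """F(x) = exp(-exp(-(x - loc) / scale))."""
    return math.exp(-math.exp(-(x - loc) / scale))


def gumbel_return_level(loc, scale, return_period):
    """Solve F(x) = 1 - 1/T for x."""
    p = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(p))


level_100 = gumbel_return_level(loc=100.0, scale=25.0, return_period=100)
```

Feeding `level_100` back into `gumbel_cdf` recovers a non-exceedance probability of 0.99, which is a quick consistency check on any fitted parameters.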
fit_lp3 ¶
fit_lp3(discharge: Series, *, return_periods: list[int] | None = None, regional_skew: float | None = None, regional_skew_mse: float = 0.302, ci_level: float = 0.9, zero_threshold: float = 0.0) -> FloodFreqResult
Fit a Log-Pearson Type III distribution (Bulletin 17C approach).
When regional_skew is provided the station skew is adjusted using the inverse-variance weighted average described in Bulletin 17C §5.2.4 (England et al., 2018). Confidence intervals are computed via the variance-of-estimate approach (Bulletin 17C §6).
Parameters:
discharge: Daily discharge series with a DatetimeIndex.
return_periods: Return periods to estimate. Defaults to standard set.
regional_skew: Generalised / regional skew coefficient. When
None (default) the station skew is used unmodified for
backward compatibility.
regional_skew_mse: Mean-square error of the regional skew estimate.
Default 0.302 is the USGS nationwide value.
ci_level: Confidence level for return-period intervals (0 < ci < 1).
zero_threshold: Values ≤ this are excluded before fitting.
Returns:
A :class:FloodFreqResult with quantile estimates and optional CIs.
Raises: ValueError: If fewer than 5 annual maxima or all values ≤ zero_threshold.
References: England, J. F. Jr., Cohn, T. A., Faber, B. A., Stedinger, J. R., Thomas, W. O. Jr., Veilleux, A. G., Kiang, J. E., & Mason, R. R. Jr. (2018). Guidelines for determining flood flow frequency — Bulletin 17C. U.S. Geological Survey Techniques and Methods 4-B5. https://doi.org/10.3133/tm4B5
fit_nonstationary_gev ¶
fit_nonstationary_gev(annual_maxima: ndarray | Series, years: ndarray | Series, return_periods: list[float] | None = None) -> NonStationaryGEVResult
Fit GEV with time-varying location: loc(t) = mu0 + mu1 * (t − t̄).
The trend significance is assessed via a likelihood-ratio test comparing the non-stationary model to a stationary GEV.
Parameters:
annual_maxima: Array of annual maximum values.
years: Corresponding year values (same length as annual_maxima).
return_periods: Return periods in years. Defaults to standard set.
Returns:
A :class:NonStationaryGEVResult.
Raises: ValueError: If input arrays differ in length or have fewer than 10 values.
fit_weibull_min ¶
fit_weibull_min(annual_minima: ndarray | Series, return_periods: list[float] | None = None) -> FloodFreqResult
Fit Weibull distribution to annual minima for low-flow frequency analysis.
Uses scipy.stats.weibull_min (MLE). Return periods relate to the
probability of flows below a given level.
Parameters:
annual_minima: Array of annual minimum values.
return_periods: Return periods in years. Defaults to standard set.
Returns:
A :class:FloodFreqResult with fitted parameters and return levels.
Raises: ValueError: If fewer than 5 values are provided.
grubbs_beck_test ¶
Multiple Grubbs-Beck (MGB) test for low-outlier detection.
Implements the iterative procedure described in Bulletin 17C Appendix 6 (Cohn et al., 2013). The test repeatedly identifies the smallest observation that is significantly low relative to the remaining sample.
Parameters:
annual_max: 1-D array of annual maximum values (all positive).
alpha: Significance level for the test.
Returns:
A 2-tuple (threshold, mask) where threshold is the low-outlier
cutoff (back-transformed from log10 space to real units) and mask
is a boolean array with True for observations identified as low outliers.
References: Cohn, T. A., England, J. F. Jr., Berenbrock, C. E., Mason, R. R., Stedinger, J. R., & Lamontagne, J. R. (2013). A generalized Grubbs-Beck test statistic for detecting multiple potentially influential low outliers in flood series. Water Resources Research, 49(8), 5047-5058.
Grubbs, F. E. & Beck, G. (1972). Extension of sample sizes and
percentage points for significance tests of outlying observations.
Technometrics, 14(4), 847-854.
leave_one_out_cv ¶
leave_one_out_cv(discharge: Series, distribution: str = 'gev', return_periods: list[int] | None = None) -> dict
Leave-one-out cross-validation for flood frequency fits.
For each held-out year, the distribution is fitted on the remaining years and the held-out observation is compared against the fitted median (T = 2-year return level).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Annual maximum discharge series (with a |
required |
distribution
|
str
|
Distribution key: |
'gev'
|
return_periods
|
list[int]
|
Return periods used internally. Defaults to |
None
|
Returns:
| Type | Description |
|---|---|
dict
|
Keys: |
lmoments_from_sample ¶
Compute L-moments (L1–L4) and L-moment ratios (t3, t4) from a sample.
L-moments are linear combinations of probability weighted moments (PWMs).
Parameters: data: 1-D array of observations.
Returns:
Dictionary with keys L1, L2, L3, L4, t3, t4.
Raises: ValueError: If fewer than 4 observations are provided.
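For reference, the sketch below computes the first four sample L-moments from unbiased probability weighted moments b0 to b3, using the standard linear combinations. It works on a plain list and is only an illustration of the technique; the function name is hypothetical and the library's numerical details may differ.

```python
def lmoments_sketch(data):
    """Return L1-L4 and the ratios t3, t4 for a list of observations."""
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")
    b0 = b1 = b2 = b3 = 0.0
    for j, value in enumerate(x, start=1):  # j is the ascending rank
        b0 += value
        b1 += value * (j - 1) / (n - 1)
        b2 += value * (j - 1) * (j - 2) / ((n - 1) * (n - 2))
        b3 += value * (j - 1) * (j - 2) * (j - 3) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    # Note: l2 is zero when all observations are identical.
    return {"L1": l1, "L2": l2, "L3": l3, "L4": l4, "t3": l3 / l2, "t4": l4 / l2}


lm = lmoments_sketch([1.0, 2.0, 3.0, 4.0, 5.0])
```

For this evenly spaced sample, L1 is the mean (3), L2 is 1, and the L-skewness t3 is 0 because the sample is symmetric.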
probability_plot_correlation ¶
Probability Plot Correlation Coefficient (PPCC).
Computes the Pearson correlation between the sorted observations and the corresponding theoretical quantiles. Values close to 1 indicate a good fit.
Parameters:
data: Observed sample.
distribution: Distribution name.
params: Distribution parameters.
Returns: PPCC value (between 0 and 1 for reasonable fits).
regional_frequency_analysis ¶
regional_frequency_analysis(sites: dict[str, ndarray], return_periods: list[float] | None = None) -> RegionalResult
L-moment based regional frequency analysis (Hosking & Wallis method).
Steps:
1. Compute L-moments for each site.
2. Discordancy test (flag sites with unusual L-moments).
3. Heterogeneity measure (H < 1 → acceptably homogeneous).
4. Fit regional growth curve using weighted regional L-moments.
5. Combine with site-specific index flood for return levels.
Parameters:
sites: Mapping of site identifier → annual maxima array.
return_periods: Return periods in years. Defaults to standard set.
Returns:
A :class:RegionalResult.
Raises: ValueError: If fewer than 2 sites are provided.
select_pot_threshold ¶
Select optimal threshold for Peaks-Over-Threshold analysis.
Parameters:
data: Array of observations.
method: Selection method — "mean_residual", "percentile"
(95th percentile), or "sqrt_rule" (mean + 1.5 × std).
Returns: Optimal threshold value.
Raises: ValueError: If method is not recognised.
weighted_skew ¶
weighted_skew(station_skew: float, regional_skew: float, n: int, regional_mse: float = 0.302) -> float
Compute weighted skew per Bulletin 17C §5.2.4.
Combines the station skew Gs with a generalised / regional skew
Gr using inverse-variance weights:
.. math::
G_w = w_1 G_s + w_2 G_r
where w_1 = MSE_r / (MSE_s + MSE_r) and w_2 = 1 − w_1.
Parameters:
station_skew: Station skew coefficient Gs.
regional_skew: Regional / generalised skew Gr.
n: Number of annual maximum observations.
regional_mse: Mean-square error of the regional skew estimate.
Default 0.302 is the USGS nationwide value from Bulletin 17C.
Returns:
Weighted skew coefficient Gw.
References: England, J. F. Jr. et al. (2018). Guidelines for determining flood flow frequency — Bulletin 17C. USGS TM 4-B5. https://doi.org/10.3133/tm4B5
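The weighting itself is a two-line calculation once both mean-square errors are known. In the sketch below the station MSE is passed in directly as a hypothetical `station_mse` argument; the documented function takes the record length `n` instead, and how it derives the station MSE from `n` is not shown here.

```python
def weighted_skew_sketch(station_skew, regional_skew, station_mse, regional_mse=0.302):
    """G_w = w_1 * G_s + w_2 * G_r with inverse-variance weights."""
    w_station = regional_mse / (station_mse + regional_mse)  # w_1
    w_regional = 1.0 - w_station                             # w_2
    return w_station * station_skew + w_regional * regional_skew


# Equal MSEs give a plain average of the two skews.
g_equal = weighted_skew_sketch(0.4, 0.0, station_mse=0.302)
# A very precise station estimate dominates the result.
g_precise = weighted_skew_sketch(0.4, 0.0, station_mse=0.001)
```

The less certain estimate receives the smaller weight, so a long record (small station MSE) pulls the weighted skew toward the station value.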
flow_duration_curve ¶
Compute a flow duration curve.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Time-series of discharge values (any DatetimeIndex). |
required |
percentiles
|
list[float] | None
|
Exceedance percentiles to extract. Defaults to
|
None
|
Returns:
| Type | Description |
|---|---|
FDCResult
|
Sorted discharges and extracted percentile values. |
low_flow_stat ¶
Compute nQm low-flow statistic (e.g. 7Q10).
The nQm is the minimum n-day rolling average that occurs with a return period of m years, estimated using the Weibull plotting position.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Daily discharge series with a DatetimeIndex. |
required |
n_day
|
int
|
Rolling window size in days. |
7
|
return_period
|
int
|
Return period in years. |
10
|
Returns:
| Type | Description |
|---|---|
The estimated nQm value in the same units as the input discharge.
|
|
Raises:
| Type | Description |
|---|---|
ValueError
|
If there are fewer than 3 complete water years. |
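The nQm calculation can be sketched in two steps: take the smallest n-day moving average in each year, then read off the value whose Weibull non-exceedance probability is 1/m. The example below uses plain lists grouped by year (a hypothetical input layout) and linear interpolation between plotting positions, which is an assumption rather than a description of the library's method.

```python
def rolling_min_mean(values, n_day):
    """Smallest n-day moving average within one year of daily values."""
    window_sum = sum(values[:n_day])
    best = window_sum
    for i in range(n_day, len(values)):
        window_sum += values[i] - values[i - n_day]
        best = min(best, window_sum)
    return best / n_day


def nqm_sketch(years_of_daily_flow, n_day=7, return_period=10):
    """Estimate nQm from a list of per-year daily discharge lists."""
    minima = sorted(rolling_min_mean(v, n_day) for v in years_of_daily_flow)
    n = len(minima)
    # Weibull plotting position: rank i (ascending) has probability i / (n + 1).
    probs = [(i + 1) / (n + 1) for i in range(n)]
    target = 1.0 / return_period
    if target <= probs[0]:
        return minima[0]
    if target >= probs[-1]:
        return minima[-1]
    for i in range(1, n):
        if target <= probs[i]:
            frac = (target - probs[i - 1]) / (probs[i] - probs[i - 1])
            return minima[i - 1] + frac * (minima[i] - minima[i - 1])


# Nine synthetic years with constant flows of 1, 2, ..., 9 (arbitrary units).
years = [[float(c)] * 30 for c in range(1, 10)]
q7_10 = nqm_sketch(years, n_day=7, return_period=10)
q7_2 = nqm_sketch(years, n_day=7, return_period=2)
```

With only nine years the 10-year value coincides with the smallest annual minimum, which illustrates why short records give coarse estimates of rare low flows.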
cross_validate_rating ¶
K-fold cross-validation of a rating curve fit.
Parameters:
stage: Stage measurements.
discharge: Discharge measurements.
k_folds: Number of folds (default 5).
Returns:
Dict with 'mean_rmse', 'std_rmse', 'mean_r2', and
'fold_results' (list of per-fold dicts).
Raises: ValueError: If inputs are invalid.
detect_rating_shift ¶
detect_rating_shift(stage: ndarray, discharge: ndarray, timestamps: ndarray | DatetimeIndex, window_size: int = 20) -> list[dict]
Detect temporal shifts in the stage-discharge relationship.
Fits rolling-window rating curves and compares the residual variance of successive windows using a chi-squared test. Significant changes are flagged as rating shifts.
Parameters:
stage: Stage measurements.
discharge: Corresponding discharge measurements.
timestamps: Observation timestamps.
window_size: Number of observations per rolling window.
Returns:
List of dicts, each containing 'timestamp', 'shift_magnitude',
and 'p_value'.
Raises: ValueError: If inputs are invalid.
export_hec_ras ¶
Export a rating curve to HEC-RAS compatible format.
Writes a simple table of stage-discharge pairs at regular intervals spanning the fitted stage range.
Parameters:
result: Fitted :class:RatingCurveResult.
filepath: Output file path.
fit_rating_curve ¶
Fit a power-law rating curve Q = a * (H - H₀)^b.
If h0 is None, it is optimised together with a and b using
:func:scipy.optimize.curve_fit.
Parameters:
stage: Water level (stage) measurements.
discharge: Corresponding discharge measurements.
h0: Optional fixed stage offset. If None, estimated from data.
Returns:
:class:RatingCurveResult with fitted parameters and diagnostics.
Raises: ValueError: If fewer than 5 stage-discharge pairs are provided, discharge contains negative values, or arrays contain NaN.
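When h0 is held fixed, the power law becomes linear in log space: ln Q = ln a + b·ln(H − H₀). The sketch below uses that log-linear least-squares fit as a simple stand-in; it is not the `curve_fit` approach documented above, and the function names are hypothetical. It requires every stage to exceed h0 and every discharge to be positive.

```python
import math


def fit_power_law_sketch(stage, discharge, h0):
    """Estimate a and b in Q = a * (H - h0)**b by regression in log space."""
    xs = [math.log(h - h0) for h in stage]
    ys = [math.log(q) for q in discharge]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b


def predict_discharge_sketch(a, b, h0, stage):
    """Evaluate the fitted power law at the given stages."""
    return [a * (h - h0) ** b for h in stage]


# Synthetic gaugings generated from a = 2.0, b = 1.5, h0 = 0.5.
stage = [1.0, 1.5, 2.0, 3.0, 4.0]
discharge = [2.0 * (h - 0.5) ** 1.5 for h in stage]
a_hat, b_hat = fit_power_law_sketch(stage, discharge, h0=0.5)
```

A log-space fit minimises relative rather than absolute errors, so it weights low-flow gaugings more heavily than a direct nonlinear fit would.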
fit_segmented_rating_curve ¶
fit_segmented_rating_curve(stage: ndarray, discharge: ndarray, n_segments: int = 2, breakpoints: list[float] | None = None) -> RatingCurveResult
Fit a multi-segment rating curve with breakpoints.
Each segment is fitted independently as a power-law. If breakpoints are not provided, optimal breakpoints are found by minimising total RMSE over a grid of candidate values.
Parameters:
stage: Water level (stage) measurements.
discharge: Corresponding discharge measurements.
n_segments: Number of segments (default 2).
breakpoints: Explicit breakpoint stage values. Length must equal
n_segments - 1. If None, breakpoints are optimised.
Returns:
:class:RatingCurveResult with per-segment parameters in segments.
Raises: ValueError: If inputs are invalid or segments cannot be fitted.
predict_discharge ¶
Predict discharge from stage using a fitted rating curve.
For segmented curves the appropriate segment is selected for each stage value.
Parameters:
result: Fitted :class:RatingCurveResult.
stage: Stage values to predict at.
Returns: Predicted discharge array.
predict_stage ¶
Inverse prediction: compute stage from discharge.
Solves H = (Q / a)^(1/b) + H₀ for the primary (or first) segment.
For segmented curves the first segment whose discharge range contains the
query value is used.
Parameters:
result: Fitted :class:RatingCurveResult.
discharge: Discharge values.
Returns: Predicted stage array.
rating_curve_uncertainty ¶
rating_curve_uncertainty(result: RatingCurveResult, stage: ndarray, confidence: float = 0.95) -> tuple[np.ndarray, np.ndarray]
Compute prediction intervals for discharge estimates.
Uses a residual-based approach: the standard error of the residuals is scaled by the appropriate t-distribution quantile.
Parameters:
result: Fitted :class:RatingCurveResult.
stage: Stage values at which to compute intervals.
confidence: Confidence level (default 0.95).
Returns: Tuple of (lower_bound, upper_bound) discharge arrays.
fit_master_recession ¶
Fit a master recession curve to the identified segments.
Uses least-squares fitting of ln(Q/Q₀) vs time to estimate the recession constant k in the exponential model Q(t) = Q₀·e^(−t/k).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
segments
|
list[RecessionSegment]
|
Recession segments from :func: |
required |
Returns:
| Type | Description |
|---|---|
RecessionResult
|
The fitted recession constant and goodness-of-fit metrics. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If no segments are provided. |
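Because ln(Q/Q₀) = −t/k under the exponential model, the recession constant follows from the slope of a straight line. The sketch below pools all segments and fits a line through the origin, which is one plausible reading of the description above and an assumption of this example; segments are plain lists with one value per day.

```python
import math


def fit_recession_constant(segments):
    """Return (k, half_life_days) from a list of daily discharge lists."""
    if not segments:
        raise ValueError("no segments provided")
    sum_tt = 0.0
    sum_ty = 0.0
    for seg in segments:
        q0 = seg[0]
        for t, q in enumerate(seg):
            y = math.log(q / q0)  # equals -t / k under the exponential model
            sum_tt += t * t
            sum_ty += t * y
    slope = sum_ty / sum_tt  # least-squares slope through the origin
    k = -1.0 / slope
    return k, k * math.log(2.0)


# Two synthetic recessions that both decay with k = 12 days.
seg_a = [50.0 * math.exp(-t / 12.0) for t in range(10)]
seg_b = [20.0 * math.exp(-t / 12.0) for t in range(6)]
k, half_life = fit_recession_constant([seg_a, seg_b])
```

The second return value applies the k·ln(2) relation listed for `half_life_days` in `RecessionResult`.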
identify_recessions ¶
identify_recessions(discharge: Series, *, min_length: int = 5, min_decline_pct: float = 0.05) -> list[RecessionSegment]
Identify recession segments in a daily discharge series.
A recession is a continuous period where each day's discharge is less than the previous day's. Very short segments or those with negligible total decline are excluded.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Daily discharge series with a DatetimeIndex. |
required |
min_length
|
int
|
Minimum segment length in days. |
5
|
min_decline_pct
|
float
|
Minimum total decline as a fraction of the starting value. |
0.05
|
Returns:
| Type | Description |
|---|---|
list[RecessionSegment]
|
The identified recession segments. |
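The selection rule can be illustrated on a plain list: scan for runs of strictly falling discharge, then keep runs that are long enough and decline by enough. In this sketch the run length counts the starting (peak) day, and segments are returned as index pairs rather than `RecessionSegment` objects; both choices are assumptions for the example.

```python
def identify_recessions_sketch(q, min_length=5, min_decline_pct=0.05):
    """Return inclusive (start, end) index pairs of recession runs in ``q``."""
    segments = []
    start = 0
    for i in range(1, len(q) + 1):
        # A run ends at the end of the series or when discharge stops falling.
        if i == len(q) or q[i] >= q[i - 1]:
            end = i - 1
            length = end - start + 1
            if length >= min_length and q[start] > 0:
                decline = (q[start] - q[end]) / q[start]
                if decline >= min_decline_pct:
                    segments.append((start, end))
            start = i
    return segments


q = [5.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 8.0, 7.9, 7.8, 9.0]
found = identify_recessions_sketch(q)
```

Here the six-day fall from 10 to 5 qualifies, while the three-day fall from 8 to 7.8 is rejected for being too short.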
recession_analysis ¶
recession_analysis(discharge: Series, *, min_length: int = 5, min_decline_pct: float = 0.05) -> RecessionResult
Run full recession analysis: identify segments + fit MRC.
Convenience function combining :func:identify_recessions and
:func:fit_master_recession.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Daily discharge series with a DatetimeIndex. |
required |
min_length
|
int
|
Minimum recession segment length in days. |
5
|
min_decline_pct
|
float
|
Minimum total decline fraction. |
0.05
|
Returns:
| Type | Description |
|---|---|
RecessionResult
|
Identified segments and the fitted master recession curve. |
baseflow_index_simple ¶
Quick BFI using the Lyne–Hollick 1-pass digital filter.
Uses alpha=0.925 and a single forward pass for speed. For a more
robust estimate use :func:aquascope.hydrology.baseflow.lyne_hollick
directly with multiple passes.
Parameters: discharge: Daily discharge time series with DatetimeIndex.
Returns: Baseflow index (0–1).
compare_signatures ¶
Compare two signature reports field-by-field.
Parameters:
sig1: First :class:SignatureReport.
sig2: Second :class:SignatureReport.
Returns:
Dictionary of {field_name: absolute_percent_difference} for every
numeric field. Fields that are None in either report are skipped.
compute_signatures ¶
compute_signatures(discharge: Series, precipitation: Series | None = None, area_km2: float | None = None) -> SignatureReport
Compute comprehensive hydrological signatures from daily streamflow.
Parameters:
discharge: Daily discharge time series (pd.Series with DatetimeIndex). Must contain at least 365 non-NaN values.
precipitation: Optional daily precipitation series aligned with discharge. When provided, runoff ratio and elasticity are calculated.
area_km2: Optional catchment area in km². Reserved for future unit conversion; not currently required for any signature.
Returns:
:class:SignatureReport with all computed signatures.
Raises: ValueError: If discharge has fewer than 365 non-NaN values.
flashiness_index ¶
Richards-Baker Flashiness Index.
.. math:: FI = \frac{\sum |Q_i - Q_{i-1}|}{\sum Q_i}
Higher values indicate a more flashy/responsive catchment.
Parameters: discharge: Daily discharge time series with DatetimeIndex.
Returns: Flashiness index (dimensionless, ≥ 0).
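The formula above reduces to a few lines on a plain list. In this sketch the denominator sums every value in the record, including the first; the function name is hypothetical.

```python
def flashiness_sketch(q):
    """Sum of absolute day-to-day changes divided by total discharge."""
    changes = sum(abs(q[i] - q[i - 1]) for i in range(1, len(q)))
    return changes / sum(q)


fi_flashy = flashiness_sketch([1.0, 2.0, 1.0, 2.0])   # 3 / 6
fi_steady = flashiness_sketch([2.0, 2.0, 2.0, 2.0])   # no day-to-day change
```

A perfectly steady record scores 0, and the index grows as daily swings become large relative to the total flow.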
flow_elasticity ¶
Sankarasubramanian precipitation-streamflow elasticity.
Computed year-by-year as:
.. math:: E = \text{median}\left(\frac{dQ / \bar{Q}}{dP / \bar{P}}\right)
where dQ and dP are annual departures from the long-term mean. An elasticity > 1 means streamflow is proportionally more variable than precipitation.
Parameters:
discharge: Daily discharge time series with DatetimeIndex.
precipitation: Daily precipitation with the same DatetimeIndex.
Returns: Elasticity coefficient (dimensionless).
Raises: ValueError: If fewer than 2 complete years are available.
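As a worked illustration of the median-of-ratios estimator, the sketch below starts from annual totals rather than daily series (a simplification for the example). Skipping years whose precipitation equals the long-term mean is an assumption made to avoid division by zero.

```python
from statistics import mean, median


def elasticity_sketch(annual_q, annual_p):
    """Median of (dQ / Q_mean) / (dP / P_mean) over years."""
    if len(annual_q) < 2 or len(annual_q) != len(annual_p):
        raise ValueError("need at least 2 matching years")
    q_bar = mean(annual_q)
    p_bar = mean(annual_p)
    ratios = []
    for q, p in zip(annual_q, annual_p):
        dp = p - p_bar
        if dp == 0:
            continue  # assumption: skip years with no precipitation departure
        ratios.append(((q - q_bar) / q_bar) / (dp / p_bar))
    return median(ratios)


annual_p = [800.0, 1000.0, 1200.0]   # mm/year, made-up values
annual_q = [200.0, 300.0, 400.0]     # mm/year, made-up values
e = elasticity_sketch(annual_q, annual_p)
```

In this toy record a 20 % change in precipitation goes with a 33 % change in streamflow, so the elasticity is about 1.67.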
recession_constant ¶
Median recession constant across all recession segments.
A recession segment is a run of consecutive days where discharge
decreases. For each segment of at least min_length days the
exponential decay rate k is estimated by linear regression of
ln(Q) against time. The median k across all segments is
returned.
Parameters:
discharge: Daily discharge time series with DatetimeIndex.
min_length: Minimum number of consecutive falling days to qualify as a recession segment.
Returns: Median recession constant k (day⁻¹, positive).
seasonality_index ¶
Markham's seasonality index and concentration month.
Monthly mean flows are treated as vectors with direction equal to the month's angular position on a unit circle. The resultant's magnitude (normalised) gives the seasonality index and its direction gives the peak month.
Parameters: discharge: Daily discharge time series with DatetimeIndex.
Returns:
Tuple of (index, peak_month) where index ranges from 0
(uniform flow throughout the year) to 1 (all flow concentrated
in a single month), and peak_month is 1–12.
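The vector construction can be sketched from twelve monthly mean flows. Placing each month at its mid-point angle and mapping the resultant direction back to a month by simple binning are assumptions of this example, and the function name is hypothetical.

```python
import math


def markham_sketch(monthly_mean_flow):
    """Return (index, peak_month) from 12 monthly means, January first."""
    x = y = 0.0
    for m, q in enumerate(monthly_mean_flow, start=1):
        angle = 2.0 * math.pi * (m - 0.5) / 12.0  # assumption: mid-month angle
        x += q * math.cos(angle)
        y += q * math.sin(angle)
    index = math.hypot(x, y) / sum(monthly_mean_flow)
    direction = math.atan2(y, x) % (2.0 * math.pi)
    peak_month = int(direction / (2.0 * math.pi) * 12.0) + 1
    return index, min(peak_month, 12)


march_only = [0.0, 0.0, 5.0] + [0.0] * 9
index_peak, month_peak = markham_sketch(march_only)
index_flat, _ = markham_sketch([3.0] * 12)
```

All flow in one month gives an index of 1 pointing at that month, while equal monthly flows cancel to an index of 0, matching the range described above.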
similarity_score ¶
similarity_score(sig1: SignatureReport, sig2: SignatureReport, weights: dict[str, float] | None = None) -> float
Compute overall similarity between two catchments.
The score is a weighted Euclidean distance in normalised signature space. A score of 0 means the reports are identical; higher values indicate greater dissimilarity.
Parameters:
sig1: First :class:SignatureReport.
sig2: Second :class:SignatureReport.
weights: Optional mapping of field name → weight. Fields not in
the dict receive a weight of 1.0. Defaults emphasise BFI,
flashiness, seasonality, and runoff ratio.
Returns: Weighted Euclidean distance (≥ 0).
Agricultural water management¶
FAO-56 Penman–Monteith, crop water requirements, soil water balance.
aquascope.agri ¶
Agricultural water management module.
Implements FAO-56 Penman-Monteith reference evapotranspiration, crop water requirements, and soil water balance modeling.
AgricultureBenchmarkResult
dataclass
¶
AgricultureBenchmarkResult(metric_id: str, metric_name: str, output_unit: str, summary: str, table: DataFrame)
Structured result from an AQUASTAT benchmarking workflow.
IrrigationPlan
dataclass
¶
IrrigationPlan(crop: str, planting_date: date, season_end_date: date, efficiency: float, total_eto_mm: float, total_precipitation_mm: float, total_effective_rain_mm: float, total_etc_mm: float, total_net_irrigation_mm: float, total_gross_irrigation_mm: float, total_applied_irrigation_mm: float, irrigation_trigger_days: int, schedule: DataFrame, balance: DataFrame)
Structured result from a crop irrigation planning workflow.
WaPORProductivityResult
dataclass
¶
WaPORProductivityResult(metric_id: str, metric_name: str, output_unit: str, aggregate_value: float, summary: str, table: DataFrame, aquastat_context: list[AgricultureBenchmarkResult] | None = None)
Structured result from a WaPOR productivity workflow.
SoilWaterBalance ¶
SoilWaterBalance(soil: SoilProperties, depletion_fraction: float = 0.5, initial_depletion: float = 0.0)
Daily soil water balance tracker.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
soil
|
SoilProperties
|
Soil hydraulic properties. |
required |
depletion_fraction
|
float
|
Fraction of TAW that can be depleted before stress (p, default 0.5). |
0.5
|
initial_depletion
|
float
|
Starting depletion in mm (default 0.0 = field capacity). |
0.0
|
References
Allen et al. (1998), FAO-56 Ch. 8. ISBN 92-5-104219-5.
step ¶
step(etc: float, precipitation: float = 0.0, irrigation: float = 0.0, runoff: float = 0.0) -> SoilWaterStatus
Advance the water balance by one day.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
etc
|
float
|
Crop evapotranspiration (mm/day). |
required |
precipitation
|
float
|
Daily precipitation (mm). |
0.0
|
irrigation
|
float
|
Applied irrigation (mm). |
0.0
|
runoff
|
float
|
Surface runoff (mm). |
0.0
|
Returns:
| Type | Description |
|---|---|
SoilWaterStatus
|
Updated soil water status. |
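The daily update that step performs can be illustrated with a minimal sketch of the FAO-56 root-zone depletion balance (Allen et al., 1998, Ch. 8). This is not AquaScope's implementation: the helper name depletion_step and the taw_mm argument are assumptions made for illustration, and capillary rise is ignored.

```python
def depletion_step(prev_depletion_mm, etc_mm, precipitation_mm=0.0,
                   irrigation_mm=0.0, runoff_mm=0.0, taw_mm=120.0):
    """Return (new_depletion_mm, deep_percolation_mm) for one day.

    Hypothetical helper for illustration; not part of AquaScope.
    """
    # Water entering the root zone today.
    inflow = (precipitation_mm - runoff_mm) + irrigation_mm
    # Yesterday's deficit plus crop water use, minus today's inflow.
    depletion = prev_depletion_mm + etc_mm - inflow
    # A negative deficit means the soil is above field capacity;
    # the excess is assumed to drain as deep percolation.
    deep_percolation = max(0.0, -depletion)
    depletion = max(0.0, depletion)
    # Depletion cannot exceed total available water (TAW).
    depletion = min(depletion, taw_mm)
    return depletion, deep_percolation


# Example: 10 mm deficit, 5 mm crop use, 30 mm of rain.
new_depletion, drained = depletion_step(10.0, 5.0, precipitation_mm=30.0)
```

A depletion of 0.0 corresponds to field capacity, which matches the meaning of initial_depletion described above.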
run ¶
run(etc_series: Series, precip_series: Series, irrigation_series: Series | None = None) -> pd.DataFrame
Run the water balance over a time series.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
etc_series
|
Series
|
Daily ETc (mm/day) with |
required |
precip_series
|
Series
|
Daily precipitation (mm) with |
required |
irrigation_series
|
Series | None
|
Daily irrigation (mm). If None, no irrigation is applied. |
None
|
Returns:
| Type | Description |
|---|---|
DataFrame
|
Daily soil water status records. |
auto_irrigate ¶
Run balance with automatic irrigation when depletion exceeds RAW.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
etc_series
|
Series
|
Daily ETc (mm/day). |
required |
precip_series
|
Series
|
Daily precipitation (mm). |
required |
efficiency
|
float
|
Irrigation system efficiency (0–1). |
0.7
|
Returns:
| Type | Description |
|---|---|
DataFrame
|
Daily water balance with |
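As a rough sketch of the trigger logic described above, the loop below irrigates whenever depletion exceeds readily available water (RAW = depletion fraction × TAW) and refills the root zone to field capacity. The function name, the taw_mm argument, the record keys, and the refill-to-field-capacity rule are all assumptions for illustration; all precipitation is treated as effective, and the actual AquaScope behaviour may differ.

```python
def auto_irrigate_sketch(etc_mm, precip_mm, taw_mm=120.0,
                         depletion_fraction=0.5, efficiency=0.7):
    """Return one record per day; hypothetical stand-in, not AquaScope code."""
    raw_mm = depletion_fraction * taw_mm  # readily available water
    depletion = 0.0                        # start at field capacity
    records = []
    for etc, rain in zip(etc_mm, precip_mm):
        # Update the deficit and keep it within [0, TAW].
        depletion = min(max(depletion + etc - rain, 0.0), taw_mm)
        net = gross = 0.0
        if depletion > raw_mm:
            net = depletion            # refill to field capacity
            gross = net / efficiency   # account for application losses
            depletion = 0.0
        records.append({
            "depletion_mm": depletion,
            "net_irrigation_mm": net,
            "gross_irrigation_mm": gross,
        })
    return records


# Twelve dry days with a constant 6 mm/day crop demand.
records = auto_irrigate_sketch([6.0] * 12, [0.0] * 12)
```

Dividing the net depth by the efficiency gives the gross depth that has to be delivered at the field inlet, which is why a lower efficiency increases the applied amount.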
benchmark_aquastat ¶
benchmark_aquastat(df: DataFrame, metric_id: str, *, year: int | None = None, countries: list[str] | None = None, latest_only: bool = True, top_n: int | None = None) -> AgricultureBenchmarkResult
Compute a country-scale benchmark from AQUASTAT data.
list_benchmark_metrics ¶
Return supported benchmark metric IDs.
crop_water_requirement ¶
crop_water_requirement(eto_series: Series, crop: str, planting_date: date, stage_lengths: dict[str, int] | None = None) -> pd.DataFrame
Compute daily crop water requirement over the growing season.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
eto_series
|
Series
|
Daily reference ET (mm/day) with a |
required |
crop
|
str
|
Crop name (key in |
required |
planting_date
|
date
|
Planting or sowing date. |
required |
stage_lengths
|
dict[str, int] | None
|
Days per stage. Defaults to |
None
|
Returns:
| Type | Description |
|---|---|
DataFrame
|
Columns: |
References
Allen et al. (1998), FAO-56 Ch. 6. ISBN 92-5-104219-5.
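The underlying FAO-56 single-coefficient calculation is ETc = Kc × ET₀, with Kc following a piecewise curve over the growth stages. The sketch below shows one way to build that curve; the helper names, the stage keys ("initial", "development", "mid", "late"), and the Kc values in the example are illustrative assumptions, not values taken from AquaScope or FAO-56 Table 12.

```python
def kc_for_day(day, stage_lengths, kc_ini, kc_mid, kc_end):
    """Crop coefficient for a zero-based day after planting (hypothetical helper)."""
    ini = stage_lengths["initial"]
    dev = stage_lengths["development"]
    mid = stage_lengths["mid"]
    late = stage_lengths["late"]
    if day < ini:
        return kc_ini                               # constant initial stage
    if day < ini + dev:
        frac = (day - ini) / dev                    # linear rise to Kc_mid
        return kc_ini + frac * (kc_mid - kc_ini)
    if day < ini + dev + mid:
        return kc_mid                               # constant mid-season
    frac = min((day - ini - dev - mid) / late, 1.0)  # linear decline to Kc_end
    return kc_mid + frac * (kc_end - kc_mid)


def crop_et(eto_mm, stage_lengths, kc_ini, kc_mid, kc_end):
    """Daily ETc (mm/day) as Kc times reference ET, one value per day."""
    return [kc_for_day(i, stage_lengths, kc_ini, kc_mid, kc_end) * eto
            for i, eto in enumerate(eto_mm)]


# Toy season with two-day stages and placeholder coefficients.
stages = {"initial": 2, "development": 2, "mid": 2, "late": 2}
etc = crop_et([5.0] * 8, stages, kc_ini=0.4, kc_mid=1.2, kc_end=0.6)
```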
get_kc ¶
Get crop coefficient(s) for a crop from FAO-56 Table 12.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
crop
|
str
|
Crop name (must be a key in |
required |
stage
|
str | None
|
Growth stage: |
None
|
Returns:
| Type | Description |
|---|---|
float | dict[str, float]
|
Kc value for the given stage, or a dict of all stages. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If crop or stage is unknown. |
References
Allen et al. (1998), Table 12. ISBN 92-5-104219-5.
irrigation_schedule ¶
irrigation_schedule(eto_series: Series, precip_series: Series, crop: str, planting_date: date, efficiency: float = 0.7, stage_lengths: dict[str, int] | None = None) -> pd.DataFrame
Full irrigation scheduling over the growing season.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
eto_series
|
Series
|
Daily reference ET (mm/day) with a |
required |
precip_series
|
Series
|
Daily precipitation (mm) with a |
required |
crop
|
str
|
Crop name. |
required |
planting_date
|
date
|
Planting date. |
required |
efficiency
|
float
|
Irrigation system efficiency (0–1). |
0.7
|
stage_lengths
|
dict[str, int] | None
|
Days per stage. |
None
|
Returns:
| Type | Description |
|---|---|
DataFrame
|
Columns: |
References
Allen et al. (1998), FAO-56 Ch. 7. ISBN 92-5-104219-5.
hargreaves ¶
Hargreaves reference ET₀ estimate (temperature-based).
A simpler alternative when only temperature data is available:
ET₀ = 0.0023 × (T_mean + 17.8) × (T_max − T_min)^0.5 × Ra
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
t_min
|
float
|
Minimum daily temperature (°C). |
required |
t_max
|
float
|
Maximum daily temperature (°C). |
required |
ra
|
float
|
Extraterrestrial radiation (MJ/m²/day). Use |
required |
Returns:
| Type | Description |
|---|---|
float
|
Reference evapotranspiration ET₀ in mm/day. |
References
Hargreaves, G. H. & Samani, Z. A. (1985). Reference crop evapotranspiration from temperature. Applied Engineering in Agriculture, 1(2), 96–99.
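A minimal sketch of the Hargreaves equation follows. One point deserves care: FAO-56 expresses Ra in this equation as equivalent evaporation (mm/day), which is obtained from MJ/m²/day by multiplying by 0.408. This reference does not show whether AquaScope applies that conversion internally, so the sketch makes it explicit as an assumption; the function name hargreaves_sketch is hypothetical.

```python
import math


def hargreaves_sketch(t_min, t_max, ra_mj):
    """Hargreaves ET0 in mm/day; illustrative only, not AquaScope's code.

    t_min, t_max are in degrees Celsius; ra_mj is extraterrestrial
    radiation in MJ/m2/day.
    """
    t_mean = (t_min + t_max) / 2.0
    # Assumed unit conversion: MJ/m2/day -> mm/day of equivalent evaporation.
    ra_mm = 0.408 * ra_mj
    # Guard against a negative range from bad input data.
    temp_range = max(t_max - t_min, 0.0)
    return 0.0023 * (t_mean + 17.8) * math.sqrt(temp_range) * ra_mm


et0 = hargreaves_sketch(t_min=15.0, t_max=30.0, ra_mj=40.0)
```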
penman_monteith_daily ¶
penman_monteith_daily(t_min: float, t_max: float, rh_min: float, rh_max: float, u2: float, rs: float, latitude: float, elevation: float, doy: int) -> float
FAO-56 Penman-Monteith daily reference evapotranspiration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
t_min
|
float
|
Minimum daily temperature (°C). |
required |
t_max
|
float
|
Maximum daily temperature (°C). |
required |
rh_min
|
float
|
Minimum relative humidity (%). |
required |
rh_max
|
float
|
Maximum relative humidity (%). |
required |
u2
|
float
|
Wind speed at 2 m height (m/s). |
required |
rs
|
float
|
Incoming solar radiation (MJ/m²/day). |
required |
latitude
|
float
|
Latitude in decimal degrees. |
required |
elevation
|
float
|
Station elevation above sea level (m). |
required |
doy
|
int
|
Day of the year (1–366). |
required |
Returns:
| Type | Description |
|---|---|
float
|
Reference evapotranspiration ET₀ in mm/day. |
References
Allen et al. (1998), Eq. 6. ISBN 92-5-104219-5.
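For orientation, Eq. 6 of FAO-56 is reproduced below. The intermediate quantities (net radiation, vapour pressures, and so on) are derived from the function's inputs by the procedures in FAO-56; how AquaScope computes them is not shown in this reference.

```latex
ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}
            {\Delta + \gamma\,(1 + 0.34\,u_2)}
```

Here ET₀ is in mm/day, Δ is the slope of the saturation vapour pressure curve (kPa/°C), R_n is net radiation and G is soil heat flux (both MJ/m²/day, with G usually taken as approximately zero for daily steps), γ is the psychrometric constant (kPa/°C), T is mean daily air temperature at 2 m (°C), u₂ is wind speed at 2 m (m/s), and e_s − e_a is the saturation vapour pressure deficit (kPa).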
default_season_end_date ¶
default_season_end_date(crop: str, planting_date: date, stage_lengths: dict[str, int] | None = None) -> date
Return the default end date for a crop season.
fetch_openmeteo_plan_inputs ¶
fetch_openmeteo_plan_inputs(latitude: float, longitude: float, start_date: str, end_date: str) -> tuple[pd.Series, pd.Series]
Fetch daily ET0 and precipitation from Open-Meteo for planning.
plan_irrigation ¶
plan_irrigation(crop: str, planting_date: date, eto_series: Series, precip_series: Series, soil: SoilProperties, *, efficiency: float = 0.7, depletion_fraction: float = 0.5, initial_depletion: float = 0.0, stage_lengths: dict[str, int] | None = None) -> IrrigationPlan
Build a full irrigation plan from daily ET and precipitation series.
estimate_wapor_productivity ¶
estimate_wapor_productivity(*, metric_id: str, aeti_df: DataFrame | None = None, npp_df: DataFrame | None = None, ret_df: DataFrame | None = None, aquastat_df: DataFrame | None = None, aquastat_metrics: list[str] | None = None, aquastat_year: int | None = None, aquastat_countries: list[str] | None = None, aquastat_top_n: int | None = 10) -> WaPORProductivityResult
Compute a WaPOR productivity or ET performance metric.
list_productivity_metrics ¶
Return supported productivity metric IDs.
Visualization¶
aquascope.viz ¶
AquaScope visualisation module.
Provides publication-quality plots for water-quality analysis, hydrology,
forecasting, and spatial data. All functions lazily import matplotlib
so the module can be imported even when the viz optional dependency
group is not installed — an ImportError is raised only when a plot
function is actually called.
Quick start:
from aquascope.viz import plot_timeseries, plot_forecast, plot_fdc
plot_timeseries(df, title="Daily Discharge")
plot_forecast(observed=train, forecast=pred, save_path="forecast.png")
plot_fdc(discharge_series, save_path="fdc.svg")
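The lazy-import behaviour described above can be sketched as follows. This shows the general pattern only, under the assumption that the import is deferred into the function body; the helper require_optional and the function plot_something are hypothetical names, not part of AquaScope.

```python
import importlib


def require_optional(module_name, extra):
    """Import module_name, or raise an ImportError naming the extra to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this function; "
            f"install the optional '{extra}' dependency group."
        ) from exc


def plot_something(values):
    """Hypothetical plot function: matplotlib is imported only when called."""
    plt = require_optional("matplotlib.pyplot", "viz")
    fig, ax = plt.subplots()
    ax.plot(values)
    return fig
```

Because the import sits inside the function body, defining or importing plot_something never touches matplotlib; a missing dependency surfaces only at call time.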
diagnostic_panel ¶
diagnostic_panel(observed, distribution: str, params: tuple, result: FloodFreqResult | None = None, *, save_path: str | None = None) -> Figure
4-panel diagnostic: Q-Q, P-P, return level, density comparison.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
observed
|
array-like
|
Observed data (e.g., annual maximum discharge). |
required |
distribution
|
str
|
Distribution name. |
required |
params
|
tuple
|
Distribution parameters from |
required |
result
|
FloodFreqResult
|
If provided, the return level panel is drawn using this result. Otherwise, the return level panel is replaced with a histogram. |
None
|
save_path
|
str
|
If given, save the composite figure to this path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib |
pp_plot ¶
pp_plot(observed, distribution: str, params: tuple, *, ax: Axes | None = None, save_path: str | None = None, title: str | None = None) -> Figure
Probability-Probability plot.
Compares the empirical CDF of the observed data with the theoretical CDF of the fitted distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
observed
|
array-like
|
Observed data (e.g., annual maximum discharge). |
required |
distribution
|
str
|
Distribution name: |
required |
params
|
tuple
|
Distribution parameters from |
required |
ax
|
Axes
|
Matplotlib axes to draw on. |
None
|
save_path
|
str
|
If given, save the figure to this path. |
None
|
title
|
str
|
Plot title. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib |
qq_plot ¶
qq_plot(observed, distribution: str, params: tuple, *, ax: Axes | None = None, save_path: str | None = None, title: str | None = None) -> Figure
Quantile-Quantile plot comparing observed data against a fitted distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
observed
|
array-like
|
Observed data (e.g., annual maximum discharge). |
required |
distribution
|
str
|
Distribution name: |
required |
params
|
tuple
|
Distribution parameters (shape, loc, scale) from |
required |
ax
|
Axes
|
Matplotlib axes to draw on. A new figure is created when |
None
|
save_path
|
str
|
If given, save the figure to this path. |
None
|
title
|
str
|
Plot title. Defaults to |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib |
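To show what a Q-Q plot compares, the sketch below computes the point pairs by hand for a Gumbel distribution, whose quantile function has a closed form. The Weibull plotting position i / (n + 1) and the choice of Gumbel are assumptions for illustration; this reference does not state which plotting position or parameter convention qq_plot uses, and the helper names are hypothetical.

```python
import math


def gumbel_quantile(p, loc, scale):
    """Inverse CDF of the Gumbel (EV1) distribution for 0 < p < 1."""
    return loc - scale * math.log(-math.log(p))


def qq_points(observed, loc, scale):
    """Return (theoretical, observed) quantile pairs, smallest first."""
    ordered = sorted(observed)
    n = len(ordered)
    points = []
    for rank, value in enumerate(ordered, start=1):
        p = rank / (n + 1)  # Weibull plotting position (assumed)
        points.append((gumbel_quantile(p, loc, scale), value))
    return points


pairs = qq_points([30.0, 10.0, 20.0], loc=0.0, scale=1.0)
```

If the fitted distribution describes the data well, these pairs fall close to the 1:1 line.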
return_level_plot ¶
return_level_plot(result: FloodFreqResult, *, ci: bool = True, ax: Axes | None = None, save_path: str | None = None) -> Figure
Return level plot with confidence intervals.
Shows return period (x-axis, log-scale) versus discharge (y-axis) with optional confidence-interval bands.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
result
|
FloodFreqResult
|
Result from a flood frequency fit (e.g., |
required |
ci
|
bool
|
Whether to draw the confidence-interval band (default |
True
|
ax
|
Axes
|
Matplotlib axes to draw on. |
None
|
save_path
|
str
|
If given, save the figure to this path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib |
plot_fdc ¶
plot_fdc(discharge: Series, *, title: str = 'Flow Duration Curve', ylabel: str = 'Discharge (m³/s)', log_scale: bool = True, percentiles: list[float] | None = None, figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure
Plot a flow duration curve.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
Series
|
Series of discharge values. |
required |
title
|
str
|
Plot title. |
'Flow Duration Curve'
|
ylabel
|
str
|
Y-axis label. |
'Discharge (m³/s)'
|
log_scale
|
bool
|
If |
True
|
percentiles
|
list[float] | None
|
Exceedance percentiles to annotate (e.g. |
None
|
figsize
|
tuple[float, float]
|
Figure size. |
DEFAULT_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
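The data behind a flow duration curve can be computed in a few lines: sort the flows in descending order and assign each an exceedance percentage. The sketch below assumes the Weibull plotting position rank / (n + 1); the helper name is hypothetical and the formula used by plot_fdc is not stated in this reference.

```python
def flow_duration_curve(discharge):
    """Return (exceedance_percent, flow) pairs, highest flow first."""
    # Drop NaN values (NaN is the only value not equal to itself).
    flows = sorted((q for q in discharge if q == q), reverse=True)
    n = len(flows)
    return [(100.0 * rank / (n + 1), q)
            for rank, q in enumerate(flows, start=1)]


fdc = flow_duration_curve([5.0, 1.0, float("nan"), 3.0])
```

Plotting flow against exceedance percentage on a logarithmic y-axis gives the familiar curve; the flow at 95 % exceedance (often written Q95) is a common low-flow index.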
plot_hydrograph ¶
plot_hydrograph(discharge: DataFrame, *, total_col: str = 'discharge', baseflow_col: str | None = 'baseflow', precip_col: str | None = None, title: str = 'Hydrograph', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure
Plot a hydrograph with optional baseflow and precipitation overlay.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
discharge
|
DataFrame
|
DataFrame with DatetimeIndex and at least a total discharge column. |
required |
total_col
|
str
|
Column name for total discharge. |
'discharge'
|
baseflow_col
|
str | None
|
Column name for baseflow (shaded underneath). |
'baseflow'
|
precip_col
|
str | None
|
Column for inverted precipitation bars on a secondary y-axis. |
None
|
title
|
str
|
Plot title. |
'Hydrograph'
|
figsize
|
tuple[float, float]
|
Figure size. |
WIDE_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_return_periods ¶
plot_return_periods(return_periods: dict[int, float], *, observed_max: float | None = None, title: str = 'Flood Return Periods', ylabel: str = 'Discharge (m³/s)', figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure
Plot return period estimates with optional observed maximum.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
return_periods
|
dict[int, float]
|
Mapping of return period (years) to estimated discharge. |
required |
observed_max
|
float | None
|
If given, draw a horizontal line at the observed maximum. |
None
|
title
|
str
|
Plot title. |
'Flood Return Periods'
|
ylabel
|
str
|
Y-axis label. |
'Discharge (m³/s)'
|
figsize
|
tuple[float, float]
|
Figure size. |
DEFAULT_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_spi_timeline ¶
plot_spi_timeline(spi_df: DataFrame, *, spi_col: str = 'spi_3', title: str = 'SPI Drought Timeline', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure
Plot SPI values as a bar chart coloured by drought severity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
spi_df
|
DataFrame
|
DataFrame with DatetimeIndex and SPI column(s). |
required |
spi_col
|
str
|
Which SPI column to plot. |
'spi_3'
|
title
|
str
|
Plot title. |
'SPI Drought Timeline'
|
figsize
|
tuple[float, float]
|
Figure size. |
WIDE_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
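Colouring bars by drought severity requires mapping each SPI value to a class. The sketch below uses the widely cited SPI thresholds (−1.0, −1.5, −2.0); the class labels, the hex colours, and the helper name are assumptions for illustration, and the palette AquaScope actually uses is not shown in this reference.

```python
# Hypothetical colours, chosen only for this example.
SEVERITY_COLOURS = {
    "extreme drought": "#7f0000",
    "severe drought": "#d7301f",
    "moderate drought": "#fc8d59",
    "near normal": "#bdbdbd",
    "wet": "#4292c6",
}


def spi_category(spi):
    """Classify a single SPI value using common drought thresholds."""
    if spi <= -2.0:
        return "extreme drought"
    if spi <= -1.5:
        return "severe drought"
    if spi <= -1.0:
        return "moderate drought"
    if spi < 1.0:
        return "near normal"
    return "wet"


spi_values = [-2.3, -1.6, -1.0, 0.2, 1.4]
bar_colours = [SEVERITY_COLOURS[spi_category(v)] for v in spi_values]
```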
plot_boxplot ¶
plot_boxplot(df: DataFrame, *, value_col: str = 'value', group_col: str = 'station_name', title: str = 'Distribution by Group', ylabel: str = 'Value', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure
Box plot of value_col grouped by group_col.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
Long-format DataFrame with at least value_col and group_col. |
required |
value_col
|
str
|
Column containing measurement values. |
'value'
|
group_col
|
str
|
Column to group by (station, parameter, etc.). |
'station_name'
|
title
|
str
|
Plot title. |
'Distribution by Group'
|
ylabel
|
str
|
Y-axis label. |
'Value'
|
figsize
|
tuple[float, float]
|
Figure size. |
WIDE_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_eda_summary ¶
plot_eda_summary(report, *, title: str = 'EDA Summary', figsize: tuple[float, float] = MULTI_FIGSIZE, save_path: str | None = None) -> Figure
Multi-panel summary of an EDAReport.
Panels: 1. Record count per parameter (bar) 2. Missing data (bar) 3. Outlier count (bar) 4. Value ranges (error bar: mean ± std)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
report
|
An |
required | |
title
|
str
|
Super-title. |
'EDA Summary'
|
figsize
|
tuple[float, float]
|
Figure size. |
MULTI_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_heatmap ¶
plot_heatmap(df: DataFrame, *, title: str = 'Correlation Heatmap', figsize: tuple[float, float] = (10, 8), cmap: str = 'RdYlBu_r', save_path: str | None = None) -> Figure
Heatmap of the correlation matrix of numeric columns.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
DataFrame with numeric columns. |
required |
title
|
str
|
Plot title. |
'Correlation Heatmap'
|
figsize
|
tuple[float, float]
|
Figure size. |
(10, 8)
|
cmap
|
str
|
Colour map name. |
'RdYlBu_r'
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_param_comparison ¶
plot_param_comparison(df: DataFrame, *, value_col: str = 'value', param_col: str = 'parameter', station_col: str = 'station_name', title: str = 'Parameter Comparison', figsize: tuple[float, float] = MULTI_FIGSIZE, save_path: str | None = None) -> Figure
Grid of box plots — one per parameter, grouped by station.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
Long-format DataFrame with measurements. |
required |
value_col
|
str
|
Column with measurement values. |
'value'
|
param_col
|
str
|
Column with parameter names. |
'parameter'
|
station_col
|
str
|
Column with station names. |
'station_name'
|
title
|
str
|
Super-title. |
'Parameter Comparison'
|
figsize
|
tuple[float, float]
|
Figure size. |
MULTI_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_who_exceedances ¶
plot_who_exceedances(who_df: DataFrame, *, title: str = 'WHO Guideline Exceedances', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure
Horizontal bar chart of WHO guideline exceedance percentages.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
who_df
|
DataFrame
|
DataFrame returned by |
required |
title
|
str
|
Plot title. |
'WHO Guideline Exceedances'
|
figsize
|
tuple[float, float]
|
Figure size. |
WIDE_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_station_map ¶
plot_station_map(stations: DataFrame, *, lat_col: str = 'latitude', lon_col: str = 'longitude', label_col: str = 'station_name', value_col: str | None = None, colour_col: str | None = None, title: str = 'Station Map', save_path: str | None = None) -> object
Create an interactive Folium map of monitoring stations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
stations
|
DataFrame
|
DataFrame with at least latitude/longitude columns. |
required |
lat_col
|
str
|
Column name for latitude. |
'latitude'
|
lon_col
|
str
|
Column name for longitude. |
'longitude'
|
label_col
|
str
|
Column used for popup labels. |
'station_name'
|
value_col
|
str | None
|
Optional column whose value is shown in tooltips. |
None
|
colour_col
|
str | None
|
Optional column for colour-coding markers (e.g. risk level).
Values are mapped through |
None
|
title
|
str
|
Map title (shown in popup header). |
'Station Map'
|
save_path
|
str | None
|
If provided, save the HTML map to this path. |
None
|
Returns:
| Type | Description |
|---|---|
object
|
A folium.Map object. |
plot_station_scatter ¶
plot_station_scatter(stations: DataFrame, *, lat_col: str = 'latitude', lon_col: str = 'longitude', label_col: str = 'station_name', value_col: str | None = None, title: str = 'Station Locations', cmap: str = 'YlOrRd', figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure
Static scatter plot of station locations coloured by value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
stations
|
DataFrame
|
DataFrame with latitude/longitude and optional value column. |
required |
lat_col
|
str
|
Column name for latitude. |
'latitude'
|
lon_col
|
str
|
Column name for longitude. |
'longitude'
|
label_col
|
str
|
Column for point labels. |
'station_name'
|
value_col
|
str | None
|
If provided, colour-code by this column and add a colour bar. |
None
|
title
|
str
|
Plot title. |
'Station Locations'
|
cmap
|
str
|
Matplotlib colour map name (used when value_col is set). |
'YlOrRd'
|
figsize
|
tuple[float, float]
|
Figure size. |
DEFAULT_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
apply_aqua_style ¶
Apply the AquaScope matplotlib style globally.
Sets a clean, publication-friendly style with the AquaScope colour palette. Safe to call multiple times.
plot_forecast ¶
plot_forecast(observed: DataFrame | None = None, forecast: DataFrame | None = None, *, obs_col: str = 'value', pred_col: str = 'yhat', lower_col: str = 'yhat_lower', upper_col: str = 'yhat_upper', title: str = 'Forecast', ylabel: str = 'Value', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure
Plot observed data with forecast and confidence interval bands.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
observed
|
DataFrame | None
|
Historical DataFrame (DatetimeIndex + value column). |
None
|
forecast
|
DataFrame | None
|
Forecast DataFrame with |
None
|
obs_col
|
str
|
Column name in observed. |
'value'
|
pred_col
|
str
|
Column name in forecast for the point prediction. |
'yhat'
|
lower_col
|
str
|
Column name in forecast for the lower confidence bound. |
'yhat_lower'
|
upper_col
|
str
|
Column name in forecast for the upper confidence bound. |
'yhat_upper'
|
title
|
str
|
Plot title. |
'Forecast'
|
ylabel
|
str
|
Y-axis label. |
'Value'
|
figsize
|
tuple[float, float]
|
Figure size. |
WIDE_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_multi_param ¶
plot_multi_param(df: DataFrame, *, columns: list[str] | None = None, title: str = 'Multi-Parameter Time Series', ylabel: str = 'Value', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure
Overlay multiple columns on the same axes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
DataFrame with DatetimeIndex and one or more numeric columns. |
required |
columns
|
list[str] | None
|
Subset of column names to plot. |
None
|
title
|
str
|
Plot title. |
'Multi-Parameter Time Series'
|
ylabel
|
str
|
Y-axis label. |
'Value'
|
figsize
|
tuple[float, float]
|
Figure size. |
WIDE_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_observed_vs_predicted ¶
plot_observed_vs_predicted(observed: Series, predicted: Series, *, metrics: dict | None = None, title: str = 'Observed vs Predicted', figsize: tuple[float, float] = (7, 7), save_path: str | None = None) -> Figure
Scatter plot of observed vs predicted values with 1:1 line.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
observed
|
Series
|
Observed values. |
required |
predicted
|
Series
|
Predicted values (same index as observed). |
required |
metrics
|
dict | None
|
Optional dict of evaluation metrics (NSE, KGE, …) to annotate. |
None
|
title
|
str
|
Plot title. |
'Observed vs Predicted'
|
figsize
|
tuple[float, float]
|
Figure size. |
(7, 7)
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
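The metrics argument expects a plain dict of precomputed scores. As an illustration of how such a dict might be built, the sketch below computes the Nash-Sutcliffe efficiency (NSE) by hand; the helper name is hypothetical, and AquaScope may provide its own metric functions that are not shown in this section.

```python
def nash_sutcliffe(observed, predicted):
    """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance_term = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_term


obs = [1.0, 2.0, 3.0]
metrics = {
    "NSE": nash_sutcliffe(obs, [1.0, 2.0, 3.0]),           # perfect match
    "NSE (mean model)": nash_sutcliffe(obs, [2.0, 2.0, 2.0]),
}
```

An NSE of 1 indicates a perfect fit, while 0 means the predictions are no better than using the observed mean.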
plot_residuals ¶
plot_residuals(observed: Series, predicted: Series, *, title: str = 'Residuals', figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure
Plot residuals (observed − predicted) over the index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
observed
|
Series
|
Observed values. |
required |
predicted
|
Series
|
Predicted values. |
required |
title
|
str
|
Plot title. |
'Residuals'
|
figsize
|
tuple[float, float]
|
Figure size. |
DEFAULT_FIGSIZE
|
save_path
|
str | None
|
Optional save path. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |
plot_timeseries ¶
plot_timeseries(df: DataFrame, *, value_col: str = 'value', title: str = 'Time Series', ylabel: str = 'Value', xlabel: str = 'Date', colour: str | None = None, ax: Axes | None = None, figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure
Plot a single time-series from a DatetimeIndex DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
DataFrame with a DatetimeIndex and at least one value column. |
required |
value_col
|
str
|
Column name containing the values to plot. |
'value'
|
title
|
str
|
Plot title. |
'Time Series'
|
ylabel
|
str
|
Y-axis label. |
'Value'
|
xlabel
|
str
|
X-axis label. |
'Date'
|
colour
|
str | None
|
Line colour. Defaults to AquaScope primary blue. |
None
|
ax
|
Axes | None
|
Optional pre-existing Axes to draw on. |
None
|
figsize
|
tuple[float, float]
|
Figure size if creating a new figure. |
DEFAULT_FIGSIZE
|
save_path
|
str | None
|
If provided, save figure to this path instead of showing. |
None
|
Returns:
| Type | Description |
|---|---|
Figure
|
The matplotlib Figure. |