API reference

Auto-generated from the AquaScope source. Every public function, class, and method appears below, with its docstring rendered in NumPy style.

If you're looking for a guided introduction, start with Getting started or Features. This page is the exhaustive reference.


High-level API

The most common entry points live in aquascope.api:

aquascope.api

High-level convenience API for common hydrological analyses.

Provides one-liner functions wrapping AquaScope's lower-level modules. Designed for quick analyses in Jupyter notebooks and scripts.

Examples:

>>> from aquascope.api import flood_analysis, baseflow_analysis
>>> result = flood_analysis(daily_discharge, method="gev", return_periods=[10, 50, 100])
>>> bf = baseflow_analysis(daily_discharge, method="eckhardt")

flood_analysis

flood_analysis(discharge: Series, method: str = 'gev', return_periods: list[int] | None = None, ci_level: float = 0.9, regional_skew: float | None = None, **kwargs) -> FloodFreqResult

Fit a flood-frequency distribution and estimate return-period quantiles.

Parameters:

Name Type Description Default
discharge Series

Daily (or sub-daily) discharge time-series with a pandas.DatetimeIndex.

required
method str

Distribution to fit. One of "gev", "lp3", "gumbel", "gev_lmoments", or "gpd".

'gev'
return_periods list[int] | None

List of return periods (years) for which to estimate quantiles. Defaults to [2, 5, 10, 25, 50, 100].

None
ci_level float

Confidence level for bootstrap confidence intervals (GEV only).

0.9
regional_skew float | None

Optional regional skew coefficient (LP3 only).

None
**kwargs

Forwarded to the underlying fitting function.

{}

Returns:

Type Description
FloodFreqResult

Fitted distribution parameters, quantile estimates, and confidence intervals.

Raises:

Type Description
ValueError

If method is not one of the supported methods.

Source code in aquascope/api.py
def flood_analysis(
    discharge: pd.Series,
    method: str = "gev",
    return_periods: list[int] | None = None,
    ci_level: float = 0.90,
    regional_skew: float | None = None,
    **kwargs,
) -> FloodFreqResult:
    """Fit a flood-frequency distribution and estimate return-period quantiles.

    Parameters
    ----------
    discharge:
        Daily (or sub-daily) discharge time-series with a
        :class:`~pandas.DatetimeIndex`.
    method:
        Distribution to fit.  One of ``"gev"``, ``"lp3"``, ``"gumbel"``,
        ``"gev_lmoments"``, or ``"gpd"``.
    return_periods:
        List of return periods (years) for which to estimate quantiles.
        Defaults to ``[2, 5, 10, 25, 50, 100]``.
    ci_level:
        Confidence level for bootstrap confidence intervals (GEV only).
    regional_skew:
        Optional regional skew coefficient (LP3 only).
    **kwargs:
        Forwarded to the underlying fitting function.

    Returns
    -------
    FloodFreqResult
        Fitted distribution parameters, quantile estimates, and
        confidence intervals.

    Raises
    ------
    ValueError
        If *method* is not one of the supported methods.
    """
    from aquascope.hydrology.flood_frequency import (
        fit_gev,
        fit_gev_lmoments,
        fit_gpd,
        fit_gumbel,
        fit_lp3,
    )

    if method not in _FLOOD_METHODS:
        msg = f"Unknown flood-frequency method {method!r}. Choose from {sorted(_FLOOD_METHODS)}."
        raise ValueError(msg)

    if return_periods is None:
        return_periods = [2, 5, 10, 25, 50, 100]

    if method == "gev":
        return fit_gev(discharge, return_periods=return_periods, ci_level=ci_level, **kwargs)
    if method == "lp3":
        lp3_kwargs: dict = {"return_periods": return_periods, **kwargs}
        if regional_skew is not None:
            lp3_kwargs["regional_skew"] = regional_skew
        return fit_lp3(discharge, **lp3_kwargs)
    if method == "gumbel":
        return fit_gumbel(discharge, return_periods=return_periods, **kwargs)
    if method == "gev_lmoments":
        return fit_gev_lmoments(discharge, return_periods=return_periods, **kwargs)
    # gpd
    return fit_gpd(discharge, return_periods=return_periods, **kwargs)
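
To make the idea of a return-period quantile concrete, the sketch below fits a Gumbel distribution to annual maxima by the method of moments and evaluates x_T = mu - beta * ln(-ln(1 - 1/T)). It is a plain-Python illustration under stated assumptions, not AquaScope's fit_gumbel: the helper names gumbel_moments and gumbel_quantile and the sample data are hypothetical, and the library may use a different estimator.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_moments(annual_maxima):
    """Method-of-moments estimates (location mu, scale beta) for a Gumbel distribution."""
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)  # sample standard deviation
    beta = std * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_quantile(mu, beta, return_period):
    """Magnitude exceeded on average once every `return_period` years."""
    non_exceedance = 1.0 - 1.0 / return_period
    return mu - beta * math.log(-math.log(non_exceedance))


# Hypothetical annual maximum discharges (m^3/s), for illustration only.
annual_maxima = [310.0, 420.0, 275.0, 510.0, 390.0, 460.0, 335.0, 605.0, 295.0, 440.0]
mu, beta = gumbel_moments(annual_maxima)
quantiles = {T: gumbel_quantile(mu, beta, T) for T in [2, 5, 10, 25, 50, 100]}
```

The return periods used here match the documented default of flood_analysis; longer return periods always map to larger quantiles.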

baseflow_analysis

baseflow_analysis(discharge: Series, method: str = 'lyne_hollick', **kwargs) -> BaseflowResult

Separate baseflow from quickflow using a digital filter.

Parameters:

Name Type Description Default
discharge Series

Daily discharge time-series.

required
method str

"lyne_hollick" (recursive filter, default) or "eckhardt" (two-parameter filter).

'lyne_hollick'
**kwargs

Forwarded to the filter function (e.g. alpha, n_passes).

{}

Returns:

Type Description
BaseflowResult

DataFrame of total / baseflow / quickflow plus BFI.

Raises:

Type Description
ValueError

If method is not supported.

Source code in aquascope/api.py
def baseflow_analysis(
    discharge: pd.Series,
    method: str = "lyne_hollick",
    **kwargs,
) -> BaseflowResult:
    """Separate baseflow from quickflow using a digital filter.

    Parameters
    ----------
    discharge:
        Daily discharge time-series.
    method:
        ``"lyne_hollick"`` (recursive filter, default) or ``"eckhardt"``
        (two-parameter filter).
    **kwargs:
        Forwarded to the filter function (e.g. ``alpha``, ``n_passes``).

    Returns
    -------
    BaseflowResult
        DataFrame of total / baseflow / quickflow plus BFI.

    Raises
    ------
    ValueError
        If *method* is not supported.
    """
    from aquascope.hydrology.baseflow import eckhardt, lyne_hollick

    if method not in _BASEFLOW_METHODS:
        msg = f"Unknown baseflow method {method!r}. Choose from {sorted(_BASEFLOW_METHODS)}."
        raise ValueError(msg)

    if method == "lyne_hollick":
        return lyne_hollick(discharge, **kwargs)
    return eckhardt(discharge, **kwargs)
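
As a rough illustration of what a recursive digital filter does, the sketch below runs one forward pass of the Lyne-Hollick filter in plain Python and derives the baseflow index (BFI). It assumes the commonly quoted filter parameter alpha = 0.925 and a single pass; AquaScope's lyne_hollick may use other defaults, several passes, and a different return type. The function name lyne_hollick_single_pass and the data are hypothetical.

```python
def lyne_hollick_single_pass(discharge, alpha=0.925):
    """One forward pass of the Lyne-Hollick recursive digital filter.

    Returns (baseflow, quickflow) lists with the same length as `discharge`.
    """
    quickflow = [0.0]  # assumption: the first value is entirely baseflow
    for t in range(1, len(discharge)):
        qf = alpha * quickflow[-1] + 0.5 * (1.0 + alpha) * (discharge[t] - discharge[t - 1])
        # Constrain quickflow to the physically possible range [0, Q].
        qf = min(max(qf, 0.0), discharge[t])
        quickflow.append(qf)
    baseflow = [q - qf for q, qf in zip(discharge, quickflow)]
    return baseflow, quickflow


# Hypothetical daily discharge with a single storm peak.
discharge = [10.0, 10.0, 30.0, 50.0, 35.0, 22.0, 15.0, 12.0, 10.0, 10.0]
baseflow, quickflow = lyne_hollick_single_pass(discharge)
bfi = sum(baseflow) / sum(discharge)  # baseflow index
```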

flow_duration

flow_duration(discharge: Series, **kwargs) -> FDCResult

Compute a flow-duration curve.

Parameters:

Name Type Description Default
discharge Series

Daily discharge time-series.

required
**kwargs

Forwarded to aquascope.hydrology.flow_duration.flow_duration_curve (e.g. percentiles).

{}

Returns:

Type Description
FDCResult

Exceedance probabilities, sorted discharges, and percentile values.

Source code in aquascope/api.py
def flow_duration(discharge: pd.Series, **kwargs) -> FDCResult:
    """Compute a flow-duration curve.

    Parameters
    ----------
    discharge:
        Daily discharge time-series.
    **kwargs:
        Forwarded to :func:`~aquascope.hydrology.flow_duration.flow_duration_curve`
        (e.g. ``percentiles``).

    Returns
    -------
    FDCResult
        Exceedance probabilities, sorted discharges, and percentile values.
    """
    from aquascope.hydrology.flow_duration import flow_duration_curve

    return flow_duration_curve(discharge, **kwargs)
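
A flow-duration curve can be sketched in a few lines: sort the flows from largest to smallest and assign each an exceedance probability. The version below assumes Weibull plotting positions, rank / (n + 1), and linear interpolation between points; whether flow_duration_curve uses the same plotting position is not stated in this reference. All names and data are hypothetical.

```python
def flow_duration_curve_sketch(discharge):
    """Return (exceedance_probability, sorted_discharge) using Weibull plotting positions."""
    flows = sorted(discharge, reverse=True)  # largest flow first
    n = len(flows)
    exceedance = [rank / (n + 1) for rank in range(1, n + 1)]
    return exceedance, flows


def flow_at_exceedance(exceedance, flows, probability):
    """Linearly interpolate the discharge exceeded `probability` of the time."""
    if probability <= exceedance[0]:
        return flows[0]
    if probability >= exceedance[-1]:
        return flows[-1]
    for i in range(1, len(flows)):
        if probability <= exceedance[i]:
            p0, p1 = exceedance[i - 1], exceedance[i]
            q0, q1 = flows[i - 1], flows[i]
            return q0 + (q1 - q0) * (probability - p0) / (p1 - p0)


# Hypothetical daily discharges.
discharge = [12.0, 3.0, 7.0, 20.0, 5.0, 9.0, 4.0, 15.0, 6.0]
exceedance, flows = flow_duration_curve_sketch(discharge)
q50 = flow_at_exceedance(exceedance, flows, 0.5)  # median flow
```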

compute_all_signatures

compute_all_signatures(discharge: Series, **kwargs) -> SignatureReport

Compute a comprehensive set of hydrological signatures.

Parameters:

Name Type Description Default
discharge Series

Daily discharge time-series.

required
**kwargs

Forwarded to aquascope.hydrology.signatures.compute_signatures (e.g. precipitation, area_km2).

{}

Returns:

Type Description
SignatureReport

Dataclass containing ~20 hydrological signature values.

Source code in aquascope/api.py
def compute_all_signatures(discharge: pd.Series, **kwargs) -> SignatureReport:
    """Compute a comprehensive set of hydrological signatures.

    Parameters
    ----------
    discharge:
        Daily discharge time-series.
    **kwargs:
        Forwarded to :func:`~aquascope.hydrology.signatures.compute_signatures`
        (e.g. ``precipitation``, ``area_km2``).

    Returns
    -------
    SignatureReport
        Dataclass containing ~20 hydrological signature values.
    """
    from aquascope.hydrology.signatures import compute_signatures

    return compute_signatures(discharge, **kwargs)

detect_changepoints

detect_changepoints(series: ndarray | Series, method: str = 'pelt', **kwargs) -> ChangePointResult

Detect abrupt shifts in a time-series.

Parameters:

Name Type Description Default
series ndarray | Series

One-dimensional numeric data.

required
method str

Detection algorithm. One of "pelt", "cusum", "binary_segmentation", or "pettitt".

'pelt'
**kwargs

Forwarded to the detection function.

{}

Returns:

Type Description
ChangePointResult

Detected change-points, segment summaries, and test statistics.

Raises:

Type Description
ValueError

If method is not supported.

Source code in aquascope/api.py
def detect_changepoints(
    series: np.ndarray | pd.Series,
    method: str = "pelt",
    **kwargs,
) -> ChangePointResult:
    """Detect abrupt shifts in a time-series.

    Parameters
    ----------
    series:
        One-dimensional numeric data.
    method:
        Detection algorithm.  One of ``"pelt"``, ``"cusum"``,
        ``"binary_segmentation"``, or ``"pettitt"``.
    **kwargs:
        Forwarded to the detection function.

    Returns
    -------
    ChangePointResult
        Detected change-points, segment summaries, and test statistics.

    Raises
    ------
    ValueError
        If *method* is not supported.
    """
    from aquascope.analysis.changepoint import (
        ChangePointResult as CPResult,
    )
    from aquascope.analysis.changepoint import (
        binary_segmentation,
        cusum,
        pelt,
        pettitt_test,
    )

    if method not in _CHANGEPOINT_METHODS:
        msg = f"Unknown changepoint method {method!r}. Choose from {sorted(_CHANGEPOINT_METHODS)}."
        raise ValueError(msg)

    if method == "pelt":
        return pelt(series, **kwargs)
    if method == "cusum":
        return cusum(series, **kwargs)
    if method == "binary_segmentation":
        return binary_segmentation(series, **kwargs)
    # pettitt — returns ChangePoint | None; wrap into ChangePointResult
    cp = pettitt_test(series, **kwargs)
    changepoints = [cp] if cp is not None else []
    return CPResult(
        changepoints=changepoints,
        n_changepoints=len(changepoints),
        method="pettitt",
        penalty=None,
        segments=[],
    )
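
The "pettitt" branch wraps a single change-point into a ChangePointResult. For readers unfamiliar with the test itself, here is a minimal plain-Python sketch of the Pettitt statistic with its usual approximate p-value, 2 * exp(-6 K^2 / (n^3 + n^2)). It is not AquaScope's pettitt_test; the function name, return tuple, and data are assumptions made for illustration.

```python
import math


def pettitt_sketch(series):
    """Return (index, K, p_value) for the most likely single change-point.

    `index` is the position of the first observation after the shift.
    """
    n = len(series)
    best_k, best_t = 0, 0
    u = 0
    for t in range(n - 1):
        # Recurrence: U_t = U_(t-1) + sum over all j of sign(x_t - x_j).
        u += sum((series[t] > x) - (series[t] < x) for x in series)
        if abs(u) > best_k:
            best_k, best_t = abs(u), t
    p_value = min(1.0, 2.0 * math.exp(-6.0 * best_k**2 / (n**3 + n**2)))
    return best_t + 1, best_k, p_value


# Hypothetical series with a clear upward shift after the tenth value.
series = [1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1, 1.0, 0.9, 1.2,
          5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.1, 4.7, 5.0, 5.2]
index, k_stat, p_value = pettitt_sketch(series)
```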

fit_copula

fit_copula(x: ndarray | Series, y: ndarray | Series, family: str = 'auto', **kwargs) -> CopulaResult

Fit a bivariate copula to paired observations.

Parameters:

Name Type Description Default
x ndarray | Series

Paired data arrays of equal length.

required
y ndarray | Series

Paired data arrays of equal length.

required
family str

Copula family — "auto" (best AIC), "gaussian", "clayton", "gumbel", or "frank".

'auto'
**kwargs

Forwarded to aquascope.analysis.copulas.fit_copula.

{}

Returns:

Type Description
CopulaResult

Fitted copula parameters, dependence measures, and AIC.

Raises:

Type Description
ValueError

If family is not supported.

Source code in aquascope/api.py
def fit_copula(
    x: np.ndarray | pd.Series,
    y: np.ndarray | pd.Series,
    family: str = "auto",
    **kwargs,
) -> CopulaResult:
    """Fit a bivariate copula to paired observations.

    Parameters
    ----------
    x, y:
        Paired data arrays of equal length.
    family:
        Copula family — ``"auto"`` (best AIC), ``"gaussian"``,
        ``"clayton"``, ``"gumbel"``, or ``"frank"``.
    **kwargs:
        Forwarded to :func:`~aquascope.analysis.copulas.fit_copula`.

    Returns
    -------
    CopulaResult
        Fitted copula parameters, dependence measures, and AIC.

    Raises
    ------
    ValueError
        If *family* is not supported.
    """
    from aquascope.analysis.copulas import (
        compare_copulas,
        to_pseudo_observations,
    )
    from aquascope.analysis.copulas import (
        fit_copula as _fit_copula,
    )

    if family not in _COPULA_FAMILIES:
        msg = f"Unknown copula family {family!r}. Choose from {sorted(_COPULA_FAMILIES)}."
        raise ValueError(msg)

    u, v = to_pseudo_observations(x, y)

    if family == "auto":
        results = compare_copulas(u, v)
        return results[0]  # best AIC

    return _fit_copula(u, v, family=family, **kwargs)
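
Before fitting, the function converts both inputs to pseudo-observations. The sketch below shows one common way to do that, rank / (n + 1), and then estimates a Clayton parameter by inverting Kendall's tau (theta = 2 tau / (1 - tau), valid for tau > 0). This is an assumption-laden illustration: to_pseudo_observations and the library's fitting routine may work differently, and the helper names and data here are hypothetical.

```python
def to_pseudo_obs(values):
    """Map values into (0, 1) via rank / (n + 1); ties receive their average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n + 1) for r in ranks]


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = (x[i] > x[j]) - (x[i] < x[j])
            dy = (y[i] > y[j]) - (y[i] < y[j])
            score += dx * dy
    return score / (n * (n - 1) / 2)


# Hypothetical paired flood peaks and volumes.
peak = [120.0, 340.0, 210.0, 560.0, 180.0, 430.0]
volume = [15.0, 45.0, 30.0, 61.0, 22.0, 41.0]
u, v = to_pseudo_obs(peak), to_pseudo_obs(volume)
tau = kendall_tau(u, v)
clayton_theta = 2.0 * tau / (1.0 - tau)  # moment-style estimate, requires 0 < tau < 1
```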

bayesian_regression

bayesian_regression(X: ndarray | DataFrame, y: ndarray | Series, degree: int = 1, **kwargs) -> PosteriorResult

Fit a Bayesian linear or polynomial regression.

Parameters:

Name Type Description Default
X ndarray | DataFrame

Feature matrix (degree=1) or 1-D predictor (degree>1).

required
y ndarray | Series

Response variable.

required
degree int

Polynomial degree. 1 uses aquascope.models.bayesian.BayesianLinearRegression; higher values use aquascope.models.bayesian.BayesianPolynomialRegression.

1
**kwargs

Forwarded to the model constructor (e.g. prior_precision).

{}

Returns:

Type Description
PosteriorResult

Posterior summaries, credible intervals, and diagnostics.

Source code in aquascope/api.py
def bayesian_regression(
    X: np.ndarray | pd.DataFrame,  # noqa: N803
    y: np.ndarray | pd.Series,
    degree: int = 1,
    **kwargs,
) -> PosteriorResult:
    """Fit a Bayesian linear or polynomial regression.

    Parameters
    ----------
    X:
        Feature matrix (degree=1) or 1-D predictor (degree>1).
    y:
        Response variable.
    degree:
        Polynomial degree.  ``1`` uses
        :class:`~aquascope.models.bayesian.BayesianLinearRegression`;
        higher values use
        :class:`~aquascope.models.bayesian.BayesianPolynomialRegression`.
    **kwargs:
        Forwarded to the model constructor (e.g. ``prior_precision``).

    Returns
    -------
    PosteriorResult
        Posterior summaries, credible intervals, and diagnostics.
    """
    from aquascope.models.bayesian import BayesianLinearRegression, BayesianPolynomialRegression

    if degree == 1:
        model = BayesianLinearRegression(**kwargs)
        return model.fit(X, y)
    model = BayesianPolynomialRegression(degree=degree, **kwargs)
    return model.fit(X, y)
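
To show what a prior precision does in practice, the following sketch computes the closed-form posterior for a one-predictor Bayesian linear regression with a zero-mean Gaussian prior and known noise precision. It is a textbook conjugate model written in plain Python, not the implementation behind BayesianLinearRegression; the noise_precision argument and the function name are assumptions.

```python
def bayes_linear_1d(x, y, prior_precision=1.0, noise_precision=1.0):
    """Posterior mean and covariance of (intercept, slope).

    Prior: coefficients ~ N(0, I / prior_precision); noise variance = 1 / noise_precision.
    """
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    # Posterior precision matrix A = prior_precision * I + noise_precision * Phi^T Phi.
    a11 = prior_precision + noise_precision * n
    a12 = noise_precision * sx
    a22 = prior_precision + noise_precision * sxx
    det = a11 * a22 - a12 * a12
    cov = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]
    # Posterior mean = noise_precision * cov @ Phi^T y.
    b1, b2 = noise_precision * sy, noise_precision * sxy
    mean = [cov[0][0] * b1 + cov[0][1] * b2, cov[1][0] * b1 + cov[1][1] * b2]
    return mean, cov


# Noise-free data generated from y = 2 + 3x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0, 5.0, 8.0, 11.0, 14.0]
vague_mean, _ = bayes_linear_1d(x, y, prior_precision=1e-9)   # nearly least squares
shrunk_mean, _ = bayes_linear_1d(x, y, prior_precision=10.0)  # pulled towards zero
```

A larger prior precision shrinks the coefficients towards zero, while a very small one recovers the least-squares fit.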

ensemble_forecast

ensemble_forecast(models: list[tuple[str, object]], X_train: DataFrame, y_train: Series, X_test: DataFrame, method: str = 'stacking', **kwargs) -> np.ndarray

Train an ensemble of models and return predictions on X_test.

Parameters:

Name Type Description Default
models list[tuple[str, object]]

List of (name, estimator) tuples.

required
X_train DataFrame

Training features.

required
y_train Series

Training target.

required
X_test DataFrame

Test features.

required
method str

Ensemble strategy — "weighted", "stacking", or "adaptive".

'stacking'
**kwargs

Forwarded to the ensemble constructor.

{}

Returns:

Type Description
ndarray

Predicted values for X_test.

Raises:

Type Description
ValueError

If method is not supported.

Source code in aquascope/api.py
def ensemble_forecast(
    models: list[tuple[str, object]],
    X_train: pd.DataFrame,  # noqa: N803
    y_train: pd.Series,
    X_test: pd.DataFrame,  # noqa: N803
    method: str = "stacking",
    **kwargs,
) -> np.ndarray:
    """Train an ensemble of models and return predictions on *X_test*.

    Parameters
    ----------
    models:
        List of ``(name, estimator)`` tuples.
    X_train:
        Training features.
    y_train:
        Training target.
    X_test:
        Test features.
    method:
        Ensemble strategy — ``"weighted"``, ``"stacking"``, or
        ``"adaptive"``.
    **kwargs:
        Forwarded to the ensemble constructor.

    Returns
    -------
    numpy.ndarray
        Predicted values for *X_test*.

    Raises
    ------
    ValueError
        If *method* is not supported.
    """
    from aquascope.models.ensemble import AdaptiveEnsemble, StackingEnsemble, WeightedEnsemble

    if method not in _ENSEMBLE_METHODS:
        msg = f"Unknown ensemble method {method!r}. Choose from {sorted(_ENSEMBLE_METHODS)}."
        raise ValueError(msg)

    if method == "weighted":
        ens = WeightedEnsemble(models, **kwargs)
        ens.fit(X_train, y_train)
        return ens.predict(X_test).predictions
    if method == "stacking":
        ens = StackingEnsemble(models, **kwargs)
        ens.fit(X_train, y_train)
        return ens.predict(X_test).predictions
    # adaptive
    ens = AdaptiveEnsemble(models, **kwargs)
    ens.fit(X_train, y_train)
    return ens.update_and_predict(X_test).predictions
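
The "weighted" strategy can be pictured as a weighted average of per-model predictions. The sketch below derives weights from inverse mean squared error on a validation set, which is one plausible scheme and only an assumption about how WeightedEnsemble behaves. The function names and numbers are hypothetical.

```python
def inverse_error_weights(predictions, y_true):
    """Weight each model by the inverse of its mean squared error, normalised to sum to 1."""
    inverse_mse = {}
    for name, preds in predictions.items():
        mse = sum((p - t) ** 2 for p, t in zip(preds, y_true)) / len(y_true)
        inverse_mse[name] = 1.0 / (mse + 1e-12)  # small constant avoids division by zero
    total = sum(inverse_mse.values())
    return {name: w / total for name, w in inverse_mse.items()}


def combine(predictions, weights):
    """Weighted average of the per-model predictions."""
    n = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * preds[i] for name, preds in predictions.items())
        for i in range(n)
    ]


# Hypothetical validation targets and per-model predictions.
y_valid = [10.0, 12.0, 14.0]
valid_preds = {"model_a": [11.0, 13.0, 15.0], "model_b": [12.0, 14.0, 16.0]}
weights = inverse_error_weights(valid_preds, y_valid)

test_preds = {"model_a": [20.0, 22.0], "model_b": [24.0, 26.0]}
forecast = combine(test_preds, weights)
```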

generate_report

generate_report(title: str, **kwargs) -> ReportBuilder

Create a pre-configured aquascope.reporting.builder.ReportBuilder.

Parameters:

Name Type Description Default
title str

Report title.

required
**kwargs

Forwarded to aquascope.reporting.builder.ReportBuilder (e.g. author, description).

{}

Returns:

Type Description
ReportBuilder

A builder instance ready for method-chaining.

Source code in aquascope/api.py
def generate_report(title: str, **kwargs) -> ReportBuilder:
    """Create a pre-configured :class:`~aquascope.reporting.builder.ReportBuilder`.

    Parameters
    ----------
    title:
        Report title.
    **kwargs:
        Forwarded to :class:`~aquascope.reporting.builder.ReportBuilder`
        (e.g. ``author``, ``description``).

    Returns
    -------
    ReportBuilder
        A builder instance ready for method-chaining.
    """
    from aquascope.reporting.builder import ReportBuilder

    return ReportBuilder(title, **kwargs)

groundwater_analysis

groundwater_analysis(levels: Series, method: str = 'trend', **kwargs) -> dict

Run a groundwater analysis on a water-level time series.

Parameters:

Name Type Description Default
levels Series

Water-level measurements with a pandas.DatetimeIndex.

required
method str

Analysis type — "trend" (Mann-Kendall trend detection), "recession" (aquifer recession), "seasonal" (decomposition), or "hydrograph" (full hydrograph summary).

'trend'
**kwargs

Forwarded to the underlying function.

{}

Returns:

Type Description
dict

Result of the chosen analysis, passed through unchanged from the underlying function. Depending on method, this is a dict or a result dataclass.

Raises:

Type Description
ValueError

If method is not supported.

Source code in aquascope/api.py
def groundwater_analysis(
    levels: pd.Series,
    method: str = "trend",
    **kwargs,
) -> dict:
    """Run a groundwater analysis on a water-level time series.

    Parameters
    ----------
    levels:
        Water-level measurements with :class:`~pandas.DatetimeIndex`.
    method:
        Analysis type — ``"trend"`` (Mann-Kendall trend detection),
        ``"recession"`` (aquifer recession), ``"seasonal"`` (decomposition),
        or ``"hydrograph"`` (full hydrograph summary).
    **kwargs:
        Forwarded to the underlying function.

    Returns
    -------
    dict
        Result dataclass from the chosen analysis, accessed as a dict
        or the original dataclass depending on method.

    Raises
    ------
    ValueError
        If *method* is not supported.
    """
    _gw_methods = {"trend", "recession", "seasonal", "hydrograph"}
    if method not in _gw_methods:
        msg = f"Unknown groundwater method {method!r}. Choose from {sorted(_gw_methods)}."
        raise ValueError(msg)

    if method == "trend":
        from aquascope.groundwater.wells import trend_detection
        return trend_detection(levels, **kwargs)
    if method == "recession":
        from aquascope.groundwater.wells import recession_analysis
        return recession_analysis(levels, **kwargs)
    if method == "seasonal":
        from aquascope.groundwater.wells import seasonal_decomposition
        return seasonal_decomposition(levels, **kwargs)
    # hydrograph
    from aquascope.groundwater.wells import well_hydrograph
    return well_hydrograph(levels, **kwargs)
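
The "trend" method is described as Mann-Kendall trend detection. A minimal plain-Python version of that test is sketched below, without a tie correction and with a two-sided p-value from the normal approximation. It is illustrative only: trend_detection may handle ties, seasonality, or slope estimation differently, and the function name and data are hypothetical.

```python
import math


def mann_kendall(values):
    """Return (S, Z, p_value) for the Mann-Kendall trend test (no tie correction)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (values[j] > values[i]) - (values[j] < values[i])
    variance = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(variance)
    elif s < 0:
        z = (s + 1) / math.sqrt(variance)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value


# Hypothetical water levels (m) falling steadily over ten readings.
levels = [25.0, 24.6, 24.1, 23.9, 23.2, 22.8, 22.5, 21.9, 21.4, 21.0]
s_stat, z_score, p_value = mann_kendall(levels)
```

A negative S with a small p-value indicates a statistically significant declining water level.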

climate_downscale

climate_downscale(obs: Series, gcm_hist: Series, gcm_future: Series, method: str = 'quantile_mapping', **kwargs) -> pd.Series

Downscale a GCM projection using statistical bias correction.

Parameters:

Name Type Description Default
obs Series

Observed station data.

required
gcm_hist Series

GCM historical simulation (overlapping period with obs).

required
gcm_future Series

GCM future projection to downscale.

required
method str

Downscaling method — "delta" (additive/multiplicative), "quantile_mapping", or "qdm" (Quantile Delta Mapping).

'quantile_mapping'
**kwargs

Forwarded to the underlying function.

{}

Returns:

Type Description
Series

Bias-corrected future projection.

Raises:

Type Description
ValueError

If method is not supported.

Source code in aquascope/api.py
def climate_downscale(
    obs: pd.Series,
    gcm_hist: pd.Series,
    gcm_future: pd.Series,
    method: str = "quantile_mapping",
    **kwargs,
) -> pd.Series:
    """Downscale a GCM projection using statistical bias correction.

    Parameters
    ----------
    obs:
        Observed station data.
    gcm_hist:
        GCM historical simulation (overlapping period with *obs*).
    gcm_future:
        GCM future projection to downscale.
    method:
        Downscaling method — ``"delta"`` (additive/multiplicative),
        ``"quantile_mapping"``, or ``"qdm"`` (Quantile Delta Mapping).
    **kwargs:
        Forwarded to the underlying function.

    Returns
    -------
    pandas.Series
        Bias-corrected future projection.

    Raises
    ------
    ValueError
        If *method* is not supported.
    """
    _ds_methods = {"delta", "quantile_mapping", "qdm"}
    if method not in _ds_methods:
        msg = f"Unknown downscaling method {method!r}. Choose from {sorted(_ds_methods)}."
        raise ValueError(msg)

    if method == "delta":
        from aquascope.climate.downscaling import delta_method
        return delta_method(obs, gcm_hist, gcm_future, **kwargs)
    if method == "quantile_mapping":
        from aquascope.climate.downscaling import quantile_mapping
        return quantile_mapping(obs, gcm_hist, gcm_future, **kwargs)
    # qdm
    from aquascope.climate.downscaling import quantile_delta_mapping
    return quantile_delta_mapping(obs, gcm_hist, gcm_future, **kwargs)
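
The default method, "quantile_mapping", corrects each future value by passing it through the historical model CDF and then the inverse observed CDF. The sketch below implements a simple empirical variant in plain Python. It assumes a stepwise empirical CDF for the model and clips values outside the historical range; the quantile_mapping function in AquaScope may interpolate or extrapolate differently. Names and data are hypothetical.

```python
import bisect


def empirical_quantile_mapping(obs, gcm_hist, gcm_future):
    """Map each future value through F_obs^-1(F_hist(x)) using empirical CDFs."""
    obs_sorted = sorted(obs)
    hist_sorted = sorted(gcm_hist)
    n_hist, n_obs = len(hist_sorted), len(obs_sorted)
    corrected = []
    for x in gcm_future:
        # Stepwise non-exceedance probability in the historical run, clipped to [0, 1].
        rank = bisect.bisect_left(hist_sorted, x)
        p = min(max(rank / (n_hist - 1), 0.0), 1.0)
        # Linear interpolation of the observed quantile at probability p.
        pos = p * (n_obs - 1)
        lo = int(pos)
        hi = min(lo + 1, n_obs - 1)
        corrected.append(obs_sorted[lo] + (obs_sorted[hi] - obs_sorted[lo]) * (pos - lo))
    return corrected


# Hypothetical data: the model runs 2 units too warm relative to observations.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
gcm_hist = [3.0, 4.0, 5.0, 6.0, 7.0]
gcm_future = [3.0, 5.0, 7.0, 9.0]
corrected = empirical_quantile_mapping(obs, gcm_hist, gcm_future)
```

Because values beyond the historical range are clipped, the last future value (9.0) maps to the largest observation. Quantile Delta Mapping ("qdm") is one way to avoid losing the projected change signal in such cases.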

climate_indices

climate_indices(precip: Series | None = None, temperature: Series | None = None, pet: Series | None = None, index: str = 'cdd', **kwargs) -> object

Compute a climate index from meteorological data.

Parameters:

Name Type Description Default
precip Series | None

Precipitation series (required for "cdd", "cwd", "pci", "aridity", "pdsi").

None
temperature Series | None

Maximum temperature series (required for "heat_wave").

None
pet Series | None

Potential evapotranspiration (required for "pdsi", "aridity").

None
index str

Index to compute — "cdd" (consecutive dry days), "cwd" (consecutive wet days), "pci" (precipitation concentration), "heat_wave", "aridity", "pdsi".

'cdd'
**kwargs

Forwarded to the underlying function.

{}

Returns:

Type Description
object

Result dataclass or value from the chosen index.

Raises:

Type Description
ValueError

If index is not supported or required input is missing.

Source code in aquascope/api.py
def climate_indices(
    precip: pd.Series | None = None,
    temperature: pd.Series | None = None,
    pet: pd.Series | None = None,
    index: str = "cdd",
    **kwargs,
) -> object:
    """Compute a climate index from meteorological data.

    Parameters
    ----------
    precip:
        Precipitation series (required for ``"cdd"``, ``"cwd"``, ``"pci"``,
        ``"aridity"``, ``"pdsi"``).
    temperature:
        Maximum temperature series (required for ``"heat_wave"``).
    pet:
        Potential evapotranspiration (required for ``"pdsi"``, ``"aridity"``).
    index:
        Index to compute — ``"cdd"`` (consecutive dry days),
        ``"cwd"`` (consecutive wet days), ``"pci"`` (precipitation
        concentration), ``"heat_wave"``, ``"aridity"``, ``"pdsi"``.
    **kwargs:
        Forwarded to the underlying function.

    Returns
    -------
    object
        Result dataclass or value from the chosen index.

    Raises
    ------
    ValueError
        If *index* is not supported or required input is missing.
    """
    _idx_names = {"cdd", "cwd", "pci", "heat_wave", "aridity", "pdsi"}
    if index not in _idx_names:
        msg = f"Unknown climate index {index!r}. Choose from {sorted(_idx_names)}."
        raise ValueError(msg)

    if index == "cdd":
        from aquascope.climate.indices import consecutive_dry_days
        if precip is None:
            raise ValueError("'precip' is required for CDD index.")
        return consecutive_dry_days(precip, **kwargs)
    if index == "cwd":
        from aquascope.climate.indices import consecutive_wet_days
        if precip is None:
            raise ValueError("'precip' is required for CWD index.")
        return consecutive_wet_days(precip, **kwargs)
    if index == "pci":
        from aquascope.climate.indices import precipitation_concentration_index
        if precip is None:
            raise ValueError("'precip' is required for PCI index.")
        return precipitation_concentration_index(precip, **kwargs)
    if index == "heat_wave":
        from aquascope.climate.indices import heat_wave_index
        if temperature is None:
            raise ValueError("'temperature' is required for heat_wave index.")
        return heat_wave_index(temperature, **kwargs)
    if index == "aridity":
        from aquascope.climate.indices import aridity_index
        if precip is None or pet is None:
            raise ValueError("'precip' and 'pet' are required for aridity index.")
        return aridity_index(float(precip.sum()), float(pet.sum()), **kwargs)
    # pdsi
    from aquascope.climate.indices import palmer_drought_severity_index
    if precip is None or pet is None:
        raise ValueError("'precip' and 'pet' are required for PDSI.")
    return palmer_drought_severity_index(precip, pet, **kwargs)
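
As an example of what the simplest of these indices measures, the sketch below computes consecutive dry days as the longest run of days with precipitation below a threshold. The 1 mm threshold follows the usual ETCCDI definition and is an assumption about consecutive_dry_days, whose actual default and return type are not shown in this reference. The function name and data are hypothetical.

```python
def longest_dry_spell(precip_mm, threshold_mm=1.0):
    """Length of the longest run of days with precipitation below `threshold_mm`."""
    longest = current = 0
    for value in precip_mm:
        # Extend the current spell on a dry day, otherwise reset it.
        current = current + 1 if value < threshold_mm else 0
        longest = max(longest, current)
    return longest


# Hypothetical daily precipitation totals in millimetres.
daily_precip = [0.0, 0.2, 5.1, 0.0, 0.0, 0.4, 0.0, 12.3, 0.0]
cdd = longest_dry_spell(daily_precip)
```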

Data collectors

12 unified collectors. Every collector returns records in the same Pydantic schema.

aquascope.collectors

Data collectors for Taiwan and global water data sources.

AquastatCollector

AquastatCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect country-level water data from FAO AQUASTAT.

Parameters:

Name Type Description Default
client CachedHTTPClient | None

HTTP client instance. A default is created if None.

None

References

FAO. (2023). AQUASTAT. https://www.fao.org/aquastat/

fetch_raw

fetch_raw(country_code: str = 'all', variable_ids: list[int] | None = None, start_year: int = 2000, end_year: int = 2023, **kwargs: Any) -> list[dict]

Fetch raw AQUASTAT data from the FAOSTAT API.

Parameters:

Name Type Description Default
country_code str

ISO3 country code, or 'all' for global data.

'all'
variable_ids list[int] | None

AQUASTAT variable IDs. Defaults to all key variables.

None
start_year int

Start year (default 2000).

2000
end_year int

End year (default 2023).

2023

Returns:

Type Description
list[dict]

Raw API response records.

normalise

normalise(raw: list[dict]) -> Sequence[AquastatRecord]

Convert raw FAOSTAT response into AquastatRecord objects.

Parameters:

Name Type Description Default
raw list[dict]

Records from fetch_raw.

required

Returns:

Type Description
Sequence[AquastatRecord]

Normalised AQUASTAT records.

BaseCollector

BaseCollector(client: CachedHTTPClient | None = None)

Bases: ABC

Every collector must implement fetch_raw and normalise.

The public entry-point is collect() which chains those two steps.

fetch_raw abstractmethod

fetch_raw(**kwargs) -> Any

Fetch raw data from the upstream API.

normalise abstractmethod

normalise(raw: Any) -> Sequence[BaseModel]

Convert raw API response into unified Pydantic records.

collect

collect(**kwargs) -> Sequence[BaseModel]

Fetch + normalise in one call.
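
The contract above is a template-method pattern: subclasses supply fetch_raw and normalise, and collect chains them. The sketch below mirrors that structure with stand-in classes so it runs without AquaScope installed. SketchCollector, InMemoryCollector, and DemoRecord are hypothetical, and a dataclass stands in for the Pydantic record models.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Sequence


@dataclass
class DemoRecord:
    """Stand-in for a Pydantic record model (hypothetical)."""

    station_id: str
    value: float


class SketchCollector(ABC):
    """Mirrors the documented contract: subclasses supply fetch_raw and normalise."""

    @abstractmethod
    def fetch_raw(self, **kwargs) -> Any: ...

    @abstractmethod
    def normalise(self, raw: Any) -> Sequence[DemoRecord]: ...

    def collect(self, **kwargs) -> Sequence[DemoRecord]:
        # Fetch + normalise in one call.
        return self.normalise(self.fetch_raw(**kwargs))


class InMemoryCollector(SketchCollector):
    """Hypothetical collector that reads a hard-coded list instead of calling an API."""

    def fetch_raw(self, **kwargs) -> list[dict]:
        rows = [{"id": "A1", "val": "3.5"}, {"id": "B2", "val": "4.0"}]
        station = kwargs.get("station_id")
        return [r for r in rows if station is None or r["id"] == station]

    def normalise(self, raw: list[dict]) -> list[DemoRecord]:
        return [DemoRecord(station_id=r["id"], value=float(r["val"])) for r in raw]


records = InMemoryCollector().collect(station_id="B2")
```

Keeping the two steps separate means raw responses can be cached or inspected before normalisation.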

CopernicusCollector

CopernicusCollector(dataset: str | None = None, **kwargs)

Bases: BaseCollector

Fetch GloFAS river-discharge data from Copernicus CDS.

Parameters:

Name Type Description Default
dataset str | None

CDS dataset ID. Default is the GloFAS historical dataset.

None

Example

>>> collector = CopernicusCollector()
>>> records = collector.collect(
...     latitude=48.85, longitude=2.35,
...     year="2023", month=["01", "02", "03"],
... )

fetch_raw

fetch_raw(*, latitude: float, longitude: float, year: str | list[str] = '2023', month: str | list[str] = '01', day: str | list[str] | None = None, variable: str = 'river_discharge_in_the_last_24_hours', product_type: str = 'consolidated', system_version: str = 'version_4_0') -> list[dict]

Fetch data via CDS API and return parsed records.

Parameters:

Name Type Description Default
latitude float

Site coordinates.

required
longitude float

Site coordinates.

required
year str | list[str]

Temporal selection.

'2023'
month str | list[str]

Temporal selection.

'01'
day str | list[str] | None

Temporal selection.

None
variable str

CDS variable name.

'river_discharge_in_the_last_24_hours'
product_type str

CDS-specific dataset options.

'consolidated'
system_version str

CDS-specific dataset options.

'version_4_0'

normalise

normalise(raw: list[dict]) -> Sequence[BaseModel]

Convert parsed GRIB records into WaterQualitySample objects.

EUWFDCollector

EUWFDCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect water quality data from the EEA WISE SoE Waterbase.

Uses the DiscoData SQL API published by the European Environment Agency.

fetch_raw

fetch_raw(country: str | None = None, water_body_type: str = 'river', year: int | None = None, **kwargs) -> list[dict]

Fetch raw water quality records from the EEA DiscoData API.

Parameters:

Name Type Description Default
country str | None

ISO-2 country code (e.g. "DE", "FR").

None
water_body_type str

"river", "lake", or "groundwater".

'river'
year int | None

Calendar year to filter on.

None

Returns:

Type Description
list[dict]

List of raw record dicts from the API response.

normalise

normalise(raw: list[dict]) -> Sequence[WaterQualitySample]

Convert raw EEA records into unified WaterQualitySample objects.

Parameters:

Name Type Description Default
raw list[dict]

List of dicts from fetch_raw.

required

Returns:

Type Description
Sequence[WaterQualitySample]

Sequence of WaterQualitySample instances.

GEMStatCollector

GEMStatCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect water quality data from the GEMStat Zenodo archive.

Downloads the CSV file, parses rows, and normalises into WaterQualitySample records. Supports filtering by country.

fetch_raw

fetch_raw(country: str | None = None, max_records: int = 5000, parameters: list[str] | None = None, start_date: str | None = None, end_date: str | None = None, **kwargs) -> list[dict]

Download the GEMStat Zenodo archive (once), then join station metadata with observation rows and return filtered results.

The ZIP (~200 MB) is cached to data/cache/ on first call; subsequent calls load from the local file and are fast.

Parameters:

Name Type Description Default
country str | None

Full or partial country name (e.g. "Germany", "Canada"). Case-insensitive substring match against the station metadata. GEMStat covers ~42 countries — Taiwan is not included.

None
max_records int

Hard cap on returned rows across all parameters (default 5000).

5000
parameters list[str] | None

Parameter CSV names without .csv (e.g. ["pH", "Temperature"]). Defaults to DEFAULT_PARAMETERS.

None
start_date str | None

ISO date "YYYY-MM-DD" — only include rows on or after this date.

None
end_date str | None

ISO date "YYYY-MM-DD" — only include rows on or before this date.

None
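
The filter semantics described in the table above (case-insensitive substring match on country, inclusive date bounds, and a hard cap on rows) can be sketched as follows. This is a plain-Python illustration, not the collector's code: the row keys "country" and "date" and the function name filter_rows are assumptions.

```python
def filter_rows(rows, country=None, start_date=None, end_date=None, max_records=5000):
    """Apply the documented filters to an iterable of row dicts."""
    selected = []
    for row in rows:
        if country is not None and country.lower() not in row["country"].lower():
            continue  # case-insensitive substring match
        # ISO "YYYY-MM-DD" strings sort chronologically, so string comparison suffices.
        if start_date is not None and row["date"] < start_date:
            continue
        if end_date is not None and row["date"] > end_date:
            continue
        selected.append(row)
        if len(selected) >= max_records:
            break  # hard cap on returned rows
    return selected


# Hypothetical rows after joining station metadata with observations.
rows = [
    {"country": "Germany", "date": "2019-05-01", "value": 7.8},
    {"country": "Canada", "date": "2020-01-15", "value": 7.1},
    {"country": "Germany", "date": "2020-03-10", "value": 8.0},
    {"country": "Germany", "date": "2021-07-04", "value": 7.9},
]
german_2020 = filter_rows(rows, country="german", start_date="2020-01-01", end_date="2020-12-31")
capped = filter_rows(rows, max_records=2)
```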

parse_gemstat_csv staticmethod

parse_gemstat_csv(csv_content: str, max_records: int = 10000) -> list[WaterQualitySample]

Parse a GEMStat CSV string into WaterQualitySample records.

Expected columns: GEMS Station Number, Sample Date, Parameter, Analysis Result, Unit, Latitude, Longitude, Country Code, etc.

JapanMLITCollector

JapanMLITCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect water data from Japan MLIT Water Information System.

Supports water-level, discharge, water-quality, and rainfall observations. Results are normalised to WaterQualitySample records with source = DataSource.JAPAN_MLIT.

fetch_raw

fetch_raw(station_id: str | None = None, prefecture: str | None = None, parameter: str = 'water_level', start_date: str | None = None, end_date: str | None = None, **kwargs: Any) -> list[dict]

Fetch raw observation data from the MLIT Water Information System.

Parameters:

Name Type Description Default
station_id str | None

MLIT station code (e.g. "305041281005030").

None
prefecture str | None

Prefecture name in English (e.g. "Tokyo"). Mapped to a numeric code via PREFECTURE_CODES.

None
parameter str

One of water_level, discharge, water_quality, rainfall.

'water_level'
start_date str | None

Start date in ISO format (YYYY-MM-DD).

None
end_date str | None

End date in ISO format (YYYY-MM-DD).

None
**kwargs Any

Additional keyword arguments forwarded to the HTTP request.

{}

Returns:

Type Description
list[dict]

Raw observation records. Returns an empty list when the upstream API is unreachable or no data matches.

Raises:

Type Description
ValueError

If parameter is not one of the supported types.

normalise

normalise(raw: list[dict]) -> list[WaterQualitySample]

Normalise raw MLIT data into WaterQualitySample records.

Parameters:

Name Type Description Default
raw list[dict]

Raw records as returned by fetch_raw.

required

Returns:

Type Description
list[WaterQualitySample]

Unified water-quality sample records.

KoreaWAMISCollector

KoreaWAMISCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect water data from Korea WAMIS Open API.

Supports water-level, discharge, water-quality, and dam-storage observations. Results are normalised to WaterQualitySample records with source = DataSource.KOREA_WAMIS.

fetch_raw

fetch_raw(station_id: str | None = None, basin: str | None = None, parameter: str = 'water_level', start_date: str | None = None, end_date: str | None = None, **kwargs: Any) -> list[dict]

Fetch raw observation data from the WAMIS Open API.

Parameters:

Name Type Description Default
station_id str | None

WAMIS station code.

None
basin str | None

Basin name in English (e.g. "Han"). Mapped to the Korean name via KOREA_MAJOR_BASINS.

None
parameter str

One of water_level, discharge, water_quality, dam_storage.

'water_level'
start_date str | None

Start date in ISO format (YYYY-MM-DD).

None
end_date str | None

End date in ISO format (YYYY-MM-DD).

None
**kwargs Any

Additional keyword arguments forwarded to the HTTP request.

{}

Returns:

Type Description
list[dict]

Raw observation records. Returns an empty list when the upstream API is unreachable or no data matches.

Raises:

Type Description
ValueError

If parameter is not one of the supported types.

normalise

normalise(raw: list[dict]) -> list[WaterQualitySample]

Normalise raw WAMIS data into WaterQualitySample records.

Parameters:

Name Type Description Default
raw list[dict]

Raw records as returned by fetch_raw.

required

Returns:

Type Description
list[WaterQualitySample]

Unified water-quality sample records.

OpenMeteoCollector

OpenMeteoCollector(mode: str = 'weather', **kwargs)

Bases: BaseCollector

Fetch weather, climate reanalysis, and river-discharge data from Open-Meteo.

Parameters:

Name Type Description Default
mode str

"weather" (default), "forecast", or "flood" (GloFAS discharge).

'weather'
Example

>>> collector = OpenMeteoCollector(mode="weather")
>>> records = collector.collect(
...     latitude=25.03, longitude=121.57,
...     start_date="2023-01-01", end_date="2023-12-31",
...     daily=["temperature_2m_mean", "precipitation_sum"],
... )

fetch_raw

fetch_raw(*, latitude: float, longitude: float, start_date: str | None = None, end_date: str | None = None, daily: list[str] | None = None, hourly: list[str] | None = None, forecast_days: int = 7) -> dict

Call the Open-Meteo API and return the raw JSON response.

Parameters:

Name Type Description Default
latitude float

Site coordinates.

required
longitude float

Site coordinates.

required
start_date str | None

ISO-8601 date strings (required for archive mode).

None
end_date str | None

ISO-8601 date strings (required for archive mode).

None
daily list[str] | None

Daily variables to request (e.g. ["precipitation_sum"]).

None
hourly list[str] | None

Hourly variables (e.g. ["river_discharge"]).

None
forecast_days int

Number of forecast days (only for mode="forecast").

7
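As a rough illustration of how the arguments above might translate into a request, the sketch below builds a query URL for Open-Meteo's public historical archive endpoint. The helper `build_archive_url` is hypothetical, and the way the collector maps its `mode` to an endpoint is not documented here, so treat the endpoint choice as an assumption.

```python
from urllib.parse import urlencode

# Public Open-Meteo historical archive endpoint (assumed to be the one used for dated requests).
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_archive_url(latitude, longitude, start_date, end_date, daily=None, hourly=None):
    """Hypothetical helper: assemble the query string without sending a request."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,
        "end_date": end_date,
    }
    # Open-Meteo accepts variable lists as comma-separated strings.
    if daily:
        params["daily"] = ",".join(daily)
    if hourly:
        params["hourly"] = ",".join(hourly)
    return f"{ARCHIVE_URL}?{urlencode(params)}"


url = build_archive_url(
    25.03, 121.57, "2023-01-01", "2023-12-31",
    daily=["temperature_2m_mean", "precipitation_sum"],
)
print(url)
```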

normalise

normalise(raw: dict) -> Sequence[BaseModel]

Convert Open-Meteo JSON into unified WaterQualitySample records.

Weather / climate variables are stored as WaterQualitySample with parameter set to the variable name (e.g. precipitation_sum).
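The mapping from variable name to `parameter` can be sketched as follows. This is a simplified stand-in that emits plain dictionaries rather than WaterQualitySample objects; the output keys and the decision to skip missing values are assumptions, while the `daily` / `daily_units` layout follows the Open-Meteo JSON response format.

```python
def flatten_daily(raw):
    """Hypothetical helper: turn the 'daily' block into one record per variable and day."""
    daily = raw.get("daily", {})
    units = raw.get("daily_units", {})
    times = daily.get("time", [])

    records = []
    for name, values in daily.items():
        if name == "time":
            continue  # the time axis is shared by all variables
        for timestamp, value in zip(times, values):
            if value is None:
                continue  # assumption: gaps in the series are skipped
            records.append(
                {
                    "parameter": name,  # variable name becomes the parameter
                    "sample_datetime": timestamp,
                    "value": value,
                    "unit": units.get(name, ""),
                }
            )
    return records


# Made-up response fragment, for illustration only.
raw = {
    "daily_units": {"time": "iso8601", "precipitation_sum": "mm"},
    "daily": {
        "time": ["2023-01-01", "2023-01-02"],
        "precipitation_sum": [0.0, None],
    },
}
records = flatten_daily(raw)
print(records)
```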

SDG6Collector

SDG6Collector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect SDG 6 indicator data per country/year from the UN-Stats API.

fetch_raw

fetch_raw(indicator_codes: list[str] | None = None, country_codes: str | None = None, page_size: int = 200, **kwargs) -> list[dict]

Fetch SDG 6 indicator records.

Parameters:

Name Type Description Default
indicator_codes list[str] | None

e.g. ["6.4.2", "6.4.1"]. Defaults to all 9 SDG 6 indicators.

None
country_codes str | None

Comma-separated ISO3 or M49 numeric codes, e.g. "USA,DEU,IND". Omit to return data for all countries.

None
page_size int

Records per API page (max 5000 per UN docs).

200

TaiwanCivilIoTCollector

TaiwanCivilIoTCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect real-time water resource data from Taiwan's Civil IoT SensorThings API.

The entity parameter is consumed by fetch_raw, not by __init__; see that method's docstring for valid values.

fetch_raw

fetch_raw(entity: str = 'Datastreams', top: int = 100, expand: str | None = None, start_date: str | None = None, end_date: str | None = None, **kwargs) -> list[dict]

Fetch SensorThings entities.

Parameters:

Name Type Description Default
entity str

"Things" | "Datastreams" | "Observations"

'Datastreams'
top int

Max items per page.

100
expand str | None

OData $expand clause. If not given, one is built from start_date / end_date.

None
start_date str | None

ISO date string "YYYY-MM-DD" — filters phenomenonTime ge.

None
end_date str | None

ISO date string "YYYY-MM-DD" — filters phenomenonTime le.

None
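To make the relationship between the date arguments and the OData clause concrete, here is a minimal sketch of how an `$expand` value could be assembled from `start_date` and `end_date`. The helper `build_expand` and the exact clause layout (time-of-day bounds, ordering, `$top=1`) are assumptions; only the `phenomenonTime ge` / `le` filters come from the table above.

```python
def build_expand(start_date=None, end_date=None):
    """Hypothetical helper: build a SensorThings $expand clause for Observations."""
    filters = []
    if start_date:
        filters.append(f"phenomenonTime ge {start_date}T00:00:00Z")
    if end_date:
        filters.append(f"phenomenonTime le {end_date}T23:59:59Z")

    # Assumption: request only the most recent observation per datastream.
    options = ["$orderby=phenomenonTime desc", "$top=1"]
    if filters:
        options.insert(0, "$filter=" + " and ".join(filters))

    # Nested query options inside $expand are separated by semicolons.
    return "Observations(" + ";".join(options) + ")"


print(build_expand("2024-01-01", "2024-01-31"))
```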

normalise

normalise(raw: list[dict]) -> Sequence[WaterQualitySample]

Normalise SensorThings Datastreams with their latest Observation.

TaiwanDataGovCollector

TaiwanDataGovCollector(dataset_id: str = DATASET_WATER_LEVEL, client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect real-time water level data from Taiwan's open government data platform (data.gov.tw).

Parameters:

Name Type Description Default
dataset_id str

Dataset identifier. Use "25768" (default) for real-time river water level or "161082" for real-time groundwater level.

DATASET_WATER_LEVEL

fetch_raw

fetch_raw(limit: int = 1000, offset: int = 0, **kwargs) -> list[dict]

Page through the data.gov.tw API for the configured dataset.

Parameters:

Name Type Description Default
limit int

Records per page (max 1000).

1000
offset int

Starting record offset.

0
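The limit/offset paging described above follows a common pattern, sketched here against an in-memory stand-in for the remote API. Both `fetch_all` and `make_source` are hypothetical names; the stopping rule (a short page ends the loop) is an assumption about how such paging is typically done.

```python
def fetch_all(fetch_page, limit=1000, offset=0):
    """Hypothetical helper: request pages until a short (or empty) page is returned."""
    records = []
    while True:
        page = fetch_page(limit=limit, offset=offset)
        records.extend(page)
        if len(page) < limit:
            break  # last page reached
        offset += limit
    return records


def make_source(items):
    """In-memory stand-in for the remote endpoint; records which offsets were requested."""
    calls = []

    def fetch_page(limit, offset):
        calls.append(offset)
        return items[offset:offset + limit]

    return fetch_page, calls


fetch_page, calls = make_source(list(range(2500)))
all_records = fetch_all(fetch_page, limit=1000)
print(len(all_records), calls)
```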

TaiwanMOENVCollector

TaiwanMOENVCollector(api_key: str = '', dataset_id: str = RIVER_WQ_DATASET, client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect river water-quality monitoring data from Taiwan MOENV.

Parameters:

Name Type Description Default
api_key str

Free key obtained at https://data.moenv.gov.tw/en/apikey

''
dataset_id str

Dataset identifier (default: river water quality AQX_P_07).

RIVER_WQ_DATASET

fetch_raw

fetch_raw(limit: int = 1000, offset: int = 0, **kwargs) -> list[dict]

Page through MOENV open-data endpoint and return raw records.

TaiwanWRAReservoirCollector

TaiwanWRAReservoirCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect daily reservoir operation data from WRA.

TaiwanWRAWaterLevelCollector

TaiwanWRAWaterLevelCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect real-time water-level readings from WRA river stations. Updated every 10 minutes at source.

TaiwanWRAFhyCollector

TaiwanWRAFhyCollector(data_type: str = 'water', client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect real-time hydrological data from the WRA 防災資訊網 (Fhy) API.

Parameters:

Name Type Description Default
data_type str

One of "water" (water level), "rainfall", or "flow" (river discharge). Defaults to "water".

'water'

fetch_raw

fetch_raw(**kwargs) -> list[dict]

Fetch real-time data for the configured data type.

TaiwanWRAIoTCollector

TaiwanWRAIoTCollector(data_type: str = 'groundwater', client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect real-time hydrological data from the WRA IoT open-data API.

Parameters:

Name Type Description Default
data_type str

One of "groundwater" (地下水位) or "rainfall" (累積雨量). Defaults to "groundwater".

'groundwater'

fetch_raw

fetch_raw(**kwargs) -> list[dict]

Fetch real-time data for the configured data type.

Tries each candidate path in order with an Accept: application/json header. CachedHTTPClient.get_json strips any BOM / leading whitespace and checks Content-Type before parsing, so non-JSON bodies surface as ValueError rather than an opaque JSONDecodeError.

On failure the first 200 chars of the response body are logged to help diagnose whether the server returned HTML, XML, or malformed JSON.

USGSCollector

USGSCollector(api_key: str = 'DEMO_KEY', client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect daily-value water data from USGS via OGC API.

Parameters:

Name Type Description Default
api_key str

Optional USGS API key for higher rate limits (get one at https://api.waterdata.usgs.gov/docs/ogcapi/#api-keys).

'DEMO_KEY'

fetch_raw

fetch_raw(collection: str = 'daily', datetime_range: str | None = None, days: int | None = None, limit: int = 10000, bbox: str | None = None, max_items: int | None = 2000, **kwargs) -> list[dict]

Fetch features from a USGS OGC collection.

Parameters:

Name Type Description Default
collection str

"daily" | "sta" | "discrete"

'daily'
datetime_range str | None

Explicit ISO 8601 interval "<start>/<end>" (USGS does NOT accept ISO durations like P7D). If omitted, an interval is built from days.

None
days int | None

Last N days from now (UTC). Defaults to 30 when datetime_range is not supplied.

None
limit int

Max features per page. Larger values mean fewer round-trips.

10000
bbox str | None

Bounding box filter "minLon,minLat,maxLon,maxLat" (WGS84). Without this the API returns data for every US monitoring location, which can require hundreds of paginated requests.

None
max_items int | None

Hard cap on total records fetched (across all pages). Keeps response times predictable. None means no cap.

2000
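Because ISO durations are rejected, a relative window such as "the last 30 days" has to be expanded into an explicit "<start>/<end>" interval. The sketch below shows one way to do that; `build_interval` is a hypothetical helper and the second-resolution UTC timestamp format is an assumption.

```python
from datetime import datetime, timedelta, timezone


def build_interval(days=30, now=None):
    """Hypothetical helper: expand 'last N days' into an explicit ISO 8601 interval."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # assumption: UTC timestamps at second resolution
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


# A fixed reference time keeps the example reproducible.
reference = datetime(2024, 3, 31, 12, 0, 0, tzinfo=timezone.utc)
interval = build_interval(days=30, now=reference)
print(interval)
```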

WaPORCollector

WaPORCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect satellite-based ET data from FAO WaPOR v3.

Parameters:

Name Type Description Default
client CachedHTTPClient | None

HTTP client instance. A default is created if None.

None
References

FAO. (2024). WaPOR v3. https://www.fao.org/in-action/remote-sensing-for-water-productivity/

fetch_raw

fetch_raw(bbox: tuple[float, float, float, float] | None = None, start_date: str = '2020-01-01', end_date: str = '2020-12-31', variable: str = 'RET', **kwargs: Any) -> list[dict]

Fetch raw WaPOR raster catalogue/summary data.

Parameters:

Name Type Description Default
bbox tuple[float, float, float, float] | None

Bounding box as (west, south, east, north) in decimal degrees.

None
start_date str

ISO date string for start of period.

'2020-01-01'
end_date str

ISO date string for end of period.

'2020-12-31'
variable str

WaPOR variable code ('AETI', 'NPP', 'RET').

'RET'

Returns:

Type Description
list[dict]

Raw API response records.

normalise

normalise(raw: list[dict]) -> Sequence[WaPORObservation]

Convert raw WaPOR response into WaPORObservation records.

Parameters:

Name Type Description Default
raw list[dict]

Records from fetch_raw.

required

Returns:

Type Description
Sequence[WaPORObservation]

Normalised WaPOR observations.

WQPCollector

WQPCollector(client: CachedHTTPClient | None = None)

Bases: BaseCollector

Collect discrete water quality data from the US Water Quality Portal.

Supports filtering by state, county, characteristic (parameter), date range, and bounding box.

fetch_raw

fetch_raw(state_code: str | None = None, characteristic_name: str | None = None, start_date: str | None = None, end_date: str | None = None, bbox: str | None = None, max_results: int = 1000, **kwargs) -> list[dict]

Fetch water quality results from WQP.

Parameters:

Name Type Description Default
state_code str | None

e.g. "US:06" for California.

None
characteristic_name str | None

e.g. "Dissolved oxygen (DO)", "pH"

None
start_date str | None

"MM-DD-YYYY" format.

None
end_date str | None

"MM-DD-YYYY" format.

None
bbox str | None

Bounding box: "west,south,east,north" in decimal degrees.

None
max_results int

Limit number of results (WQP default returns CSV).

1000
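Unlike most collectors on this page, the date arguments here use "MM-DD-YYYY" rather than ISO format. A small conversion helper, shown below as a hypothetical convenience that is not part of the documented API, avoids mixing the two conventions.

```python
from datetime import date


def to_wqp_date(iso_date):
    """Hypothetical helper: convert 'YYYY-MM-DD' into the 'MM-DD-YYYY' form used by WQP."""
    return date.fromisoformat(iso_date).strftime("%m-%d-%Y")


print(to_wqp_date("2023-01-31"))
```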

AI engine: methodology recommender

aquascope.ai_engine

AI-powered research methodology recommendation engine.

AgentResult dataclass

AgentResult(challenge_spec: ChallengeSpec, model_used: str = '', forecast: DataFrame | None = None, risk_assessment: dict | None = None, status: dict | None = None, anomalies: DataFrame | None = None, explanation: str = '', steps: list[str] = list())

Result of an end-to-end agent run.

HydroAgent

HydroAgent(default_model: str | None = None)

Orchestrator that parses a natural-language query and executes the corresponding challenge.

Parameters:

Name Type Description Default
default_model str | None

Override the auto-recommended model. If None, the ModelRecommender picks the top model for the challenge type.

None
Example

>>> agent = HydroAgent()
>>> result = agent.solve("Drought monitoring near Sahel at lat 15, lon 0")
>>> print(result.explanation)

solve

solve(query: str, data: DataFrame | None = None, extra_data: dict[str, DataFrame] | None = None) -> AgentResult

Parse query, load data, pick model, and run the challenge.

Parameters:

Name Type Description Default
query str

Natural-language challenge description.

required
data DataFrame | None

Primary data (discharge / precipitation). If None and coordinates are detected, the agent tries to fetch data from Open-Meteo.

None
extra_data dict[str, DataFrame] | None

Additional named DataFrames (e.g. {"et": et_df}).

None

Returns:

Type Description
AgentResult

explain

explain(result: AgentResult) -> str

Produce a human-readable summary of an AgentResult.

ResearchMethodology dataclass

ResearchMethodology(id: str, name: str, category: str, description: str, applicable_parameters: list[str] = list(), data_requirements: list[str] = list(), typical_scale: str = '', complexity: str = '', references: list[str] = list(), tags: list[str] = list())

Describes a single research methodology applicable to water studies.

ModelRecommendation dataclass

ModelRecommendation(model_id: str, rank: int, challenge_type: str, task_type: str, rationale: str)

A single model recommendation.

ModelRecommender

Recommend predictive models for a given challenge type and task.

Example

>>> rec = ModelRecommender()
>>> picks = rec.recommend("flood", "forecast")
>>> picks[0].model_id
'prophet'

recommend

recommend(challenge_type: str, task_type: str = 'forecast', top_k: int = 3) -> list[ModelRecommendation]

Return ranked model recommendations.

Parameters:

Name Type Description Default
challenge_type str

One of flood, drought, water_quality.

required
task_type str

Task within the challenge: forecast, anomaly, analysis, index.

'forecast'
top_k int

Maximum number of recommendations to return.

3

Returns:

Type Description
list[ModelRecommendation]

ChallengePlanner

Parse natural-language descriptions into structured challenge specs.

Example

>>> planner = ChallengePlanner()
>>> spec = planner.parse("Forecast flooding on the Niger River at lat 13.5, lon 2.1")
>>> spec.challenge_type
'flood'

parse

parse(query: str) -> ChallengeSpec

Parse a natural-language query into a ChallengeSpec.

Parameters:

Name Type Description Default
query str

Free-text description of the challenge.

required

Returns:

Type Description
ChallengeSpec

Structured challenge with type, variables, and location.
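How the planner locates coordinates in free text is not documented here. As one plausible approach, the sketch below pulls "lat ..." / "lon ..." pairs out of a query with regular expressions; the patterns and the helper name `extract_coordinates` are assumptions, not the planner's actual logic.

```python
import re

# Accept "lat 13.5", "latitude: -4", "lon=2.1" and similar spellings.
LAT_RE = re.compile(r"\blat(?:itude)?\s*[:=]?\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)
LON_RE = re.compile(r"\blon(?:gitude)?\s*[:=]?\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)


def extract_coordinates(query):
    """Hypothetical helper: return (latitude, longitude) or None if either is missing."""
    lat = LAT_RE.search(query)
    lon = LON_RE.search(query)
    if not (lat and lon):
        return None
    return float(lat.group(1)), float(lon.group(1))


print(extract_coordinates("Forecast flooding on the Niger River at lat 13.5, lon 2.1"))
```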

ChallengeSpec dataclass

ChallengeSpec(challenge_type: str = 'unknown', variables: list[str] = list(), latitude: float | None = None, longitude: float | None = None, location_name: str | None = None, forecast_days: int = 7, raw_query: str = '', confidence: float = 0.0)

Structured representation of a parsed challenge request.

DatasetProfile dataclass

DatasetProfile(parameters: list[str] = list(), n_records: int = 0, n_stations: int = 0, time_span_years: float = 0.0, geographic_scope: str = '', data_sources: list[str] = list(), research_goal: str = '', keywords: list[str] = list())

Summary of a collected dataset, used as input to the recommender.

Recommendation dataclass

Recommendation(methodology: ResearchMethodology, score: float, rationale: str = '')

A single methodology recommendation with a relevance score.

get_methodology

get_methodology(method_id: str) -> ResearchMethodology | None

Look up a methodology by ID.

search_methodologies

search_methodologies(parameters: list[str] | None = None, category: str | None = None, tags: list[str] | None = None, scale: str | None = None) -> list[ResearchMethodology]

Filter the catalogue by parameters, category, tags, or scale.

recommend

recommend(profile: DatasetProfile, top_k: int = 5, min_score: float = 20.0) -> list[Recommendation]

Return the top-k methodology recommendations for the given dataset profile.

Parameters:

Name Type Description Default
profile DatasetProfile
required
top_k int
5
min_score float

Minimum relevance score to include.

20.0

Returns:

Type Description
list[Recommendation]

Recommendations sorted by descending score.
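The interplay of `min_score`, `top_k`, and the descending sort can be sketched with plain tuples. This illustrates only the selection step; how scores are computed is not shown, and the helper name `rank` and the methodology IDs below are hypothetical.

```python
def rank(scored, top_k=5, min_score=20.0):
    """Hypothetical helper: keep items at or above min_score, best first, at most top_k."""
    kept = [item for item in scored if item[1] >= min_score]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


# (methodology id, relevance score) pairs, made up for illustration.
scored = [("trend_analysis", 72.0), ("wqi", 18.5), ("pca", 55.0), ("arima", 40.0)]
print(rank(scored, top_k=2))
```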

recommend_with_llm

recommend_with_llm(profile: DatasetProfile, top_k: int = 5, model: str = 'gpt-4o-mini', api_key: str | None = None, base_url: str | None = None, timeout: float = 120.0) -> list[Recommendation]

Use an LLM to provide more nuanced methodology recommendations.

Falls back to the rule-based recommendations if the LLM call fails.

Supported providers (detected from base_url):

  • OpenAI (base_url=None or api.openai.com)
  • HuggingFace Inference API (api-inference.huggingface.co), free tier
  • Groq (api.groq.com), free tier
  • Ollama local (localhost), which uses the native /api/chat endpoint to avoid openai-client quirks
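Provider detection from `base_url` could look roughly like the following. The function `detect_provider`, the returned labels, and the fallback for unknown hosts are assumptions; only the host names come from the provider list above.

```python
from urllib.parse import urlparse


def detect_provider(base_url):
    """Hypothetical helper: map a base URL to a provider label."""
    if base_url is None:
        return "openai"
    host = urlparse(base_url).hostname or ""
    if host in ("localhost", "127.0.0.1"):
        return "ollama"
    if host.endswith("huggingface.co"):
        return "huggingface"
    if host.endswith("groq.com"):
        return "groq"
    # Assumption: any other host is treated as an OpenAI-compatible endpoint.
    return "openai"


print(detect_provider("http://localhost:11434"))
```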


Hydrological analysis

Flood-frequency analysis, baseflow separation, rating curves, signatures.

aquascope.hydrology

AquaScope hydrology module.

Standard hydrological analysis tools:

  • Flow duration curves and low-flow statistics (Q95, 7Q10, 30Q5)
  • Baseflow separation (Lyne–Hollick, Eckhardt digital filters)
  • Recession analysis (segment identification + MRC fitting)
  • Flood frequency analysis (GEV, Log-Pearson Type III)
  • Stage-discharge rating curves (power-law fit, segmented curves, shift detection)

Quick start::

from aquascope.hydrology import flow_duration_curve, lyne_hollick, recession_analysis, fit_gev

fdc = flow_duration_curve(discharge_series)
print(f"Q95 = {fdc.percentiles[95]:.2f} m³/s")

bf = lyne_hollick(discharge_series)
print(f"BFI = {bf.bfi:.2f}")

rec = recession_analysis(discharge_series)
print(f"Recession constant = {rec.recession_constant:.1f} days")

ffa = fit_gev(discharge_series)
print(f"100-year flood = {ffa.return_periods[100]:.1f} m³/s")

BaseflowResult dataclass

BaseflowResult(df: DataFrame, bfi: float, method: str)

Result of baseflow separation.

Attributes:

Name Type Description
df DataFrame

DataFrame with total, baseflow, quickflow columns and the original DatetimeIndex.

bfi float

Baseflow Index — ratio of total baseflow to total discharge.

method str

Name of the filter used.

EMAResult dataclass

EMAResult(return_periods: dict[int, float] = dict(), distribution: str = '', params: tuple = (), annual_max: Series | None = None, confidence_intervals: dict[int, tuple[float, float]] = dict(), n_censored: int = 0, n_observed: int = 0, weighted_skew: float | None = None, low_outlier_threshold: float | None = None)

Bases: FloodFreqResult

Result of Expected Moments Algorithm flood frequency analysis.

Extends FloodFreqResult with censoring and EMA-specific fields.

Attributes:

Name Type Description
n_censored int

Number of censored (zero-flow / low-outlier) observations.

n_observed int

Number of non-censored observations.

weighted_skew float | None

Weighted skew coefficient (station + regional). None when no regional skew was supplied.

low_outlier_threshold float | None

MGB low-outlier threshold (real-space). None when no outliers were detected.

FloodFreqResult dataclass

FloodFreqResult(return_periods: dict[int, float] = dict(), distribution: str = '', params: tuple = (), annual_max: Series | None = None, confidence_intervals: dict[int, tuple[float, float]] = dict())

Result of flood frequency analysis.

Attributes:

Name Type Description
return_periods dict[int, float]

Mapping of return period (years) → estimated discharge.

distribution str

Name of the fitted distribution.

params tuple

Distribution parameters (shape, loc, scale).

annual_max Series | None

The annual maximum series used for fitting.

confidence_intervals dict[int, tuple[float, float]]

Optional mapping of return period → (lower, upper) 90 % CI.

GoodnessOfFitResult dataclass

GoodnessOfFitResult(statistic: float = 0.0, p_value: float = 1.0, test_name: str = '', distribution: str = '', reject_h0: bool = False)

Result of a goodness-of-fit test.

Attributes:

Name Type Description
statistic float

Test statistic value.

p_value float

Associated p-value.

test_name str

Name of the test.

distribution str

Distribution that was tested.

reject_h0 bool

True if H₀ (data follows distribution) is rejected at α = 0.05.

NonStationaryGEVResult dataclass

NonStationaryGEVResult(loc_intercept: float = 0.0, loc_trend: float = 0.0, scale: float = 1.0, shape: float = 0.0, return_levels: dict[float, ndarray] = dict(), years: ndarray = (lambda: np.array([]))(), aic: float = 0.0, bic: float = 0.0, trend_significant: bool = False)

Result of non-stationary GEV fit.

Attributes:

Name Type Description
loc_intercept float

Intercept of the linear location model.

loc_trend float

Trend in the location parameter (per year).

scale float

Scale parameter.

shape float

Shape parameter (scipy sign convention).

return_levels dict[float, ndarray]

Mapping of return period → array of return levels over time.

years ndarray

The year values used for fitting.

aic float

Akaike information criterion.

bic float

Bayesian information criterion.

trend_significant bool

True if the trend p-value < 0.05 (likelihood-ratio test).

RegionalResult dataclass

RegionalResult(growth_curve: dict[float, float] = dict(), index_flood: dict[str, float] = dict(), regional_return_levels: dict[str, dict[float, float]] = dict(), discordancy: dict[str, float] = dict(), heterogeneity: float = 0.0)

Result of regional frequency analysis.

Attributes:

Name Type Description
growth_curve dict[float, float]

Mapping of return period → growth factor.

index_flood dict[str, float]

Mapping of site id → index flood (mean annual max).

regional_return_levels dict[str, dict[float, float]]

Mapping of site id → {return period → level}.

discordancy dict[str, float]

Mapping of site id → discordancy statistic Dᵢ.

heterogeneity float

Heterogeneity H statistic.
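In the index-flood approach, a site's return level is its index flood scaled by the dimensionless regional growth factor. The sketch below shows how the three mappings listed above relate under that assumption; the site names and values are made up, and `site_return_levels` is a hypothetical helper.

```python
def site_return_levels(growth_curve, index_flood):
    """Hypothetical helper: scale the regional growth curve by each site's index flood."""
    return {
        site: {period: flood * factor for period, factor in growth_curve.items()}
        for site, flood in index_flood.items()
    }


growth_curve = {10: 1.6, 100: 2.4}                 # return period -> growth factor
index_flood = {"site_a": 120.0, "site_b": 45.0}    # site id -> mean annual maximum
levels = site_return_levels(growth_curve, index_flood)
print(levels["site_a"][100])
```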

FDCResult dataclass

FDCResult(exceedance: ndarray, discharge: ndarray, percentiles: dict[float, float] = dict())

Result of a flow duration curve analysis.

Attributes:

Name Type Description
exceedance ndarray

Exceedance probability array (0–100 %).

discharge ndarray

Sorted discharge values (descending).

percentiles dict[float, float]

Mapping of exceedance % → discharge value (e.g. {95: 1.23}).
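A flow duration curve pairs each discharge value, sorted in descending order, with the percentage of time it is equalled or exceeded. The sketch below uses the Weibull plotting position and linear interpolation; both choices, and the helper names, are assumptions and may differ from what `flow_duration_curve` does internally.

```python
def flow_duration(discharge):
    """Hypothetical helper: return (exceedance %, descending flows)."""
    flows = sorted(discharge, reverse=True)
    n = len(flows)
    # Weibull plotting position: rank / (n + 1), expressed in percent.
    exceedance = [100.0 * rank / (n + 1) for rank in range(1, n + 1)]
    return exceedance, flows


def flow_at_exceedance(exceedance, flows, pct):
    """Linearly interpolate the flow exceeded pct percent of the time."""
    if pct <= exceedance[0]:
        return flows[0]
    if pct >= exceedance[-1]:
        return flows[-1]
    for i in range(1, len(flows)):
        if exceedance[i] >= pct:
            x0, x1 = exceedance[i - 1], exceedance[i]
            weight = (pct - x0) / (x1 - x0)
            return flows[i - 1] + weight * (flows[i] - flows[i - 1])


# Synthetic record: flows of 1, 2, ..., 99 in arbitrary units.
exceedance, flows = flow_duration(range(1, 100))
q95 = flow_at_exceedance(exceedance, flows, 95)
q50 = flow_at_exceedance(exceedance, flows, 50)
print(q95, q50)
```

Q95 is a low-flow statistic: it is the discharge exceeded 95 % of the time, so it sits near the bottom of the sorted record.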

RatingCurveResult dataclass

RatingCurveResult(a: float, b: float, h0: float, r_squared: float, rmse: float, n_points: int, residuals: ndarray, stage_range: tuple[float, float], segments: list[RatingSegment] | None = None)

Result of rating curve fitting.

Attributes:

Name Type Description
a float

Power-law coefficient.

b float

Power-law exponent.

h0 float

Stage offset (datum correction).

r_squared float

Coefficient of determination.

rmse float

Root mean squared error.

n_points int

Number of stage-discharge pairs used.

residuals ndarray

Array of residuals (observed - predicted).

stage_range tuple[float, float]

Tuple of (min_stage, max_stage).

segments list[RatingSegment] | None

Segment parameters for segmented curves, or None.

RatingSegment dataclass

RatingSegment(stage_min: float, stage_max: float, a: float, b: float, h0: float, r_squared: float)

A segment of a segmented rating curve.

Attributes:

Name Type Description
stage_min float

Lower bound of the segment stage range.

stage_max float

Upper bound of the segment stage range.

a float

Power-law coefficient.

b float

Power-law exponent.

h0 float

Stage offset (datum correction).

r_squared float

Coefficient of determination for the segment.
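The coefficient, exponent, and stage offset are the parameters of the usual power-law rating Q = a·(h − h0)^b. Assuming that form, a segmented curve can be evaluated by picking the segment whose stage range contains the reading. The `Segment` class below is a hypothetical stand-in for RatingSegment and the parameter values are made up.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in carrying the fields needed for evaluation."""
    stage_min: float
    stage_max: float
    a: float
    b: float
    h0: float


def discharge_from_stage(stage, segments):
    """Evaluate Q = a * (h - h0) ** b on the segment that covers the stage."""
    for seg in segments:
        if seg.stage_min <= stage <= seg.stage_max:
            return seg.a * (stage - seg.h0) ** seg.b
    raise ValueError("stage lies outside the rated range")


segments = [
    Segment(stage_min=0.2, stage_max=1.0, a=5.0, b=2.0, h0=0.2),
    Segment(stage_min=1.0, stage_max=3.0, a=4.0, b=1.5, h0=0.0),
]
print(discharge_from_stage(0.7, segments))
print(discharge_from_stage(2.25, segments))
```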

RecessionResult dataclass

RecessionResult(segments: list[RecessionSegment] = list(), recession_constant: float = 0.0, r_squared: float = 0.0, half_life_days: float = 0.0)

Result of recession analysis.

Attributes:

Name Type Description
segments list[RecessionSegment]

List of identified recession segments.

recession_constant float

Fitted exponential decay constant k in Q(t) = Q₀·e^(−t/k).

r_squared float

R² of the master recession curve fit.

half_life_days float

Time in days for discharge to halve (k·ln(2)).
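The relationship between the recession constant and the half-life follows directly from the exponential model above: setting Q(t) = Q₀/2 gives t = k·ln(2). A short numerical check, using hypothetical helper names:

```python
import math


def recession_discharge(q0, t, k):
    """Discharge after t days of recession: Q(t) = Q0 * exp(-t / k)."""
    return q0 * math.exp(-t / k)


def half_life(k):
    """Days needed for discharge to halve, given recession constant k in days."""
    return k * math.log(2)


k = 20.0  # illustrative recession constant in days
t_half = half_life(k)
print(t_half, recession_discharge(10.0, t_half, k))
```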

RecessionSegment dataclass

RecessionSegment(start: Timestamp, end: Timestamp, discharge: ndarray)

A single recession segment.

Attributes:

Name Type Description
start Timestamp

Start timestamp.

end Timestamp

End timestamp.

discharge ndarray

Discharge values during the recession.

SignatureReport dataclass

SignatureReport(mean_flow: float, median_flow: float, q5: float, q95: float, q5_q95_ratio: float, cv: float, iqr: float, high_flow_frequency: float, high_flow_duration: float, q_peak_mean: float, low_flow_frequency: float, low_flow_duration: float, baseflow_index: float, zero_flow_fraction: float, peak_month: int, seasonality_index: float, rising_limb_density: float, flashiness_index: float, mean_recession_constant: float, runoff_ratio: float | None, elasticity: float | None)

Complete set of hydrological signatures for a streamflow record.

Attributes are grouped by the aspect of the flow regime they describe. Fields set to None require optional inputs (e.g. precipitation).

eckhardt

eckhardt(discharge: Series, *, alpha: float = 0.98, bfi_max: float = 0.8) -> BaseflowResult

Separate baseflow using the Eckhardt two-parameter digital filter.

Parameters:

Name Type Description Default
discharge Series

Daily discharge series with a DatetimeIndex.

required
alpha float

Recession constant (typically 0.95–0.99).

0.98
bfi_max float

Maximum baseflow index, which depends on aquifer type:

  • 0.80 for perennial streams with porous aquifers
  • 0.50 for ephemeral streams with porous aquifers
  • 0.25 for perennial streams with hard-rock aquifers

0.8

Returns:

Type Description
BaseflowResult

Separated flow components and BFI.
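For readers who want to see the recursion itself, here is a minimal pure-Python sketch of the Eckhardt two-parameter filter operating on a plain list. It is not the library's implementation; in particular, seeding the baseflow with the first discharge value is an assumption.

```python
def eckhardt_filter(discharge, alpha=0.98, bfi_max=0.8):
    """Sketch of the Eckhardt recursion; returns the baseflow series."""
    baseflow = [discharge[0]]  # assumption: start from the first observed flow
    for q in discharge[1:]:
        b = (
            (1.0 - bfi_max) * alpha * baseflow[-1] + (1.0 - alpha) * bfi_max * q
        ) / (1.0 - alpha * bfi_max)
        baseflow.append(min(b, q))  # baseflow can never exceed total flow
    return baseflow


# Under constant flow the filter settles at bfi_max times the discharge.
steady = [10.0] * 500
steady_base = eckhardt_filter(steady)
bfi = sum(steady_base) / sum(steady)
print(round(steady_base[-1], 6), round(bfi, 3))
```

The steady-state behaviour shows why `bfi_max` acts as an upper bound on the long-term baseflow index.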

lyne_hollick

lyne_hollick(discharge: Series, *, alpha: float = 0.925, n_passes: int = 3) -> BaseflowResult

Separate baseflow using the Lyne–Hollick recursive digital filter.

Parameters:

Name Type Description Default
discharge Series

Daily discharge series with a DatetimeIndex.

required
alpha float

Filter parameter (0 < α < 1). Higher values yield less baseflow. Default 0.925 is the value recommended by Nathan & McMahon (1990).

0.925
n_passes int

Number of forward/backward filter passes (typically 3).

3

Returns:

Type Description
BaseflowResult

Separated flow components and BFI.
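The recursion behind the filter can be sketched in pure Python as below. This follows one common formulation (quickflow filtered forward, backward, then forward again, with quickflow clipped to the range 0 to Q); initial values and edge handling are assumptions and may differ from the library.

```python
def _single_pass(series, alpha):
    """One filter pass: returns the baseflow left after removing quickflow."""
    quick = 0.0  # assumption: quickflow starts at zero
    base = [series[0]]
    for prev, cur in zip(series, series[1:]):
        quick = alpha * quick + 0.5 * (1.0 + alpha) * (cur - prev)
        quick = min(max(quick, 0.0), cur)  # keep 0 <= quickflow <= flow
        base.append(cur - quick)
    return base


def lyne_hollick_filter(discharge, alpha=0.925, n_passes=3):
    """Sketch of the multi-pass filter; odd-numbered passes run backward in time."""
    base = list(discharge)
    for i in range(n_passes):
        if i % 2 == 1:
            base = _single_pass(base[::-1], alpha)[::-1]
        else:
            base = _single_pass(base, alpha)
    return base


# Synthetic storm hydrograph, for illustration only.
flows = [1.0, 1.0, 5.0, 9.0, 6.0, 4.0, 3.0, 2.0, 1.5, 1.2, 1.0]
base = lyne_hollick_filter(flows)
bfi = sum(base) / sum(flows)
print([round(b, 2) for b in base], round(bfi, 2))
```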

anderson_darling_test

anderson_darling_test(data: ndarray, distribution: str, params: tuple) -> GoodnessOfFitResult

Anderson-Darling goodness-of-fit test for a fitted distribution.

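Parameters:

Name Type Description Default
data ndarray

Observed sample.

required
distribution str

Distribution name (e.g. "gev", "gumbel").

required
params tuple

Distribution parameters as accepted by the scipy distribution.

required

Returns:

Type Description
GoodnessOfFitResult

Result of the goodness-of-fit test.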

coverage_probability

coverage_probability(discharge: Series, distribution: str = 'gev', ci_level: float = 0.9, n_splits: int = 10, n_boot: int = 200) -> float

Estimate the coverage probability of confidence intervals.

Split data into n_splits folds, compute bootstrap CIs on each training set, and check what fraction of test observations fall within those CIs.

Parameters:

Name Type Description Default
discharge Series

Annual maximum discharge series.

required
distribution str

Distribution key.

'gev'
ci_level float

Nominal confidence level (default 0.90).

0.9
n_splits int

Number of cross-validation folds (default 10).

10
n_boot int

Number of bootstrap samples per fold (default 200).

200

Returns:

Type Description
float

Observed coverage probability (0–1).

cramer_von_mises_test

cramer_von_mises_test(data: ndarray, distribution: str, params: tuple) -> GoodnessOfFitResult

Cramér–von Mises goodness-of-fit test.

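Parameters:

Name Type Description Default
data ndarray

Observed sample.

required
distribution str

Distribution name.

required
params tuple

Distribution parameters.

required

Returns:

Type Description
GoodnessOfFitResult

Result of the goodness-of-fit test.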

expected_moments_algorithm

expected_moments_algorithm(annual_max: Series | ndarray, *, perception_thresholds: list[tuple[float, float]] | None = None, zero_threshold: float = 0.0, regional_skew: float | None = None, regional_skew_mse: float = 0.302, return_periods: list[int] | None = None) -> EMAResult

Expected Moments Algorithm (EMA) for LP3 flood frequency analysis.

Implements the EMA procedure of Cohn et al. (1997) for incorporating censored observations (zero-flow years, low outliers, historical floods) into the LP3 parameter estimation. This is the preferred method in USGS Bulletin 17C (England et al., 2018).

Parameters:

Name Type Description Default
annual_max Series | ndarray

Annual maximum series. May contain zero / negative values, which will be treated as censored.

required
perception_thresholds list[tuple[float, float]] | None

Optional list of (lower, upper) pairs defining the perception interval for each observation. When None, the algorithm automatically treats observations ≤ zero_threshold as left-censored and the remainder as exactly observed.

None
zero_threshold float

Values ≤ this are treated as censored (default 0.0).

0.0
regional_skew float | None

Regional / generalised skew used to compute the weighted skew.

None
regional_skew_mse float

MSE of the regional skew estimate.

0.302
return_periods list[int] | None

Return periods to estimate. Defaults to standard set.

None

Returns:

Type Description
EMAResult

LP3 quantile estimates, confidence intervals, and censoring metadata.

Raises:

Type Description
ValueError

If fewer than 5 total observations are available.

References:

England, J. F. Jr., Cohn, T. A., Faber, B. A., Stedinger, J. R., Thomas, W. O. Jr., Veilleux, A. G., Kiang, J. E., & Mason, R. R. Jr. (2018). Guidelines for determining flood flow frequency: Bulletin 17C. USGS TM 4-B5. https://doi.org/10.3133/tm4B5

Cohn, T. A., Lane, W. L., & Baier, W. G. (1997). An algorithm for computing moments-based flood quantile estimates when historical flood information is available. Water Resources Research, 33(9), 2089-2096. https://doi.org/10.1029/96WR03706

fit_gev

fit_gev(discharge: Series, *, return_periods: list[int] | None = None, ci_level: float = 0.9) -> FloodFreqResult

Fit a GEV distribution to the annual maximum series.

Parameters:

Name Type Description Default
discharge Series

Daily discharge series with a DatetimeIndex.

required
return_periods list[int] | None

Return periods in years to estimate. Defaults to [2, 5, 10, 25, 50, 100, 200, 500].

None
ci_level float

Confidence level for bootstrap CIs (default 0.90).

0.9

Returns:

Type Description
FloodFreqResult

Result with quantile estimates.

Raises:

Type Description
ValueError

If fewer than 5 annual maxima are available.

fit_gev_lmoments

fit_gev_lmoments(annual_maxima: ndarray | Series, return_periods: list[float] | None = None) -> FloodFreqResult

Fit GEV distribution using L-moments method.

More robust than MLE for small samples (n < 50). The shape parameter k is estimated from L-skewness using the Hosking (1997) approximation:

c = 2 / (3 + t3) − ln2 / ln3
k ≈ 7.8590 c + 2.9554 c²

Parameters:

Name Type Description Default
annual_maxima ndarray | Series

Array of annual maximum values.

required
return_periods list[float] | None

Return periods in years. Defaults to standard set.

None

Returns:

Type Description
FloodFreqResult

Fitted parameters and return levels.

Raises:

Type Description
ValueError

If fewer than 5 values are provided.
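The shape approximation quoted in the description above is easy to check numerically. The sketch below evaluates it for a given L-skewness t3; the function name is hypothetical, and the estimation of location and scale is omitted.

```python
import math


def gev_shape_from_lskew(t3):
    """Approximate GEV shape k from L-skewness t3 using the polynomial quoted above."""
    c = 2.0 / (3.0 + t3) - math.log(2) / math.log(3)
    return 7.8590 * c + 2.9554 * c ** 2


# The Gumbel distribution has L-skewness of about 0.1699, so k should be close to zero.
k_gumbel = gev_shape_from_lskew(0.1699)
# Stronger positive skew gives a negative k in this sign convention (heavier upper tail).
k_skewed = gev_shape_from_lskew(0.30)
print(k_gumbel, k_skewed)
```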

fit_gpd

fit_gpd(exceedances: ndarray | Series, threshold: float, return_periods: list[float] | None = None, total_observations: int | None = None) -> FloodFreqResult

Fit Generalised Pareto Distribution using Peaks-Over-Threshold method.

Parameters:

exceedances: Values above the threshold (already filtered).
threshold: The threshold used for POT selection.
return_periods: Return periods in years.
total_observations: Total number of observations, used to compute the exceedance rate. If None, assumed equal to len(exceedances).

Returns: A :class:FloodFreqResult with fitted parameters and return levels.

Raises: ValueError: If fewer than 10 exceedances are provided.

fit_gumbel

fit_gumbel(annual_maxima: ndarray | Series, return_periods: list[float] | None = None) -> FloodFreqResult

Fit Gumbel (Type I) extreme value distribution.

Special case of GEV with shape=0. Uses scipy.stats.gumbel_r (MLE).

Gumbel CDF: F(x) = exp(-exp(-(x - loc) / scale))

Parameters: annual_maxima: Array of annual maximum values. return_periods: Return periods in years. Defaults to standard set.

Returns: A :class:FloodFreqResult with fitted parameters and return levels.

Raises: ValueError: If fewer than 5 values are provided.
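
Inverting the Gumbel CDF above gives the return level for a return period T (non-exceedance probability 1 − 1/T). The following is a minimal sketch with made-up `loc` and `scale` values; the function names are illustrative and are not AquaScope's API.

```python
import math


def gumbel_cdf(x: float, loc: float, scale: float) -> float:
    """Gumbel CDF: F(x) = exp(-exp(-(x - loc) / scale))."""
    return math.exp(-math.exp(-(x - loc) / scale))


def gumbel_return_level(T: float, loc: float, scale: float) -> float:
    """Solve F(x) = 1 - 1/T for x."""
    p = 1.0 - 1.0 / T
    return loc - scale * math.log(-math.log(p))


loc, scale = 100.0, 25.0  # hypothetical fitted parameters
levels = {T: gumbel_return_level(T, loc, scale) for T in (2, 10, 100)}
```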

fit_lp3

fit_lp3(discharge: Series, *, return_periods: list[int] | None = None, regional_skew: float | None = None, regional_skew_mse: float = 0.302, ci_level: float = 0.9, zero_threshold: float = 0.0) -> FloodFreqResult

Fit a Log-Pearson Type III distribution (Bulletin 17C approach).

When regional_skew is provided the station skew is adjusted using the inverse-variance weighted average described in Bulletin 17C §5.2.4 (England et al., 2018). Confidence intervals are computed via the variance-of-estimate approach (Bulletin 17C §6).

Parameters:

discharge: Daily discharge series with a DatetimeIndex.
return_periods: Return periods to estimate. Defaults to standard set.
regional_skew: Generalised / regional skew coefficient. When None (default), the station skew is used unmodified for backward compatibility.
regional_skew_mse: Mean-square error of the regional skew estimate. Default 0.302 is the USGS nationwide value.
ci_level: Confidence level for return-period intervals (0 < ci_level < 1).
zero_threshold: Values ≤ this are excluded before fitting.

Returns: A :class:FloodFreqResult with quantile estimates and optional CIs.

Raises: ValueError: If fewer than 5 annual maxima or all values ≤ zero_threshold.

References: England, J. F. Jr., Cohn, T. A., Faber, B. A., Stedinger, J. R., Thomas, W. O. Jr., Veilleux, A. G., Kiang, J. E., & Mason, R. R. Jr. (2018). Guidelines for determining flood flow frequency — Bulletin 17C. U.S. Geological Survey Techniques and Methods 4-B5. https://doi.org/10.3133/tm4B5

fit_nonstationary_gev

fit_nonstationary_gev(annual_maxima: ndarray | Series, years: ndarray | Series, return_periods: list[float] | None = None) -> NonStationaryGEVResult

Fit GEV with time-varying location: loc(t) = mu0 + mu1 * (t − t̄).

The trend significance is assessed via a likelihood-ratio test comparing the non-stationary model to a stationary GEV.

Parameters: annual_maxima: Array of annual maximum values. years: Corresponding year values (same length as annual_maxima). return_periods: Return periods in years. Defaults to standard set.

Returns: A :class:NonStationaryGEVResult.

Raises: ValueError: If input arrays differ in length or have fewer than 10 values.
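
A likelihood-ratio test of this kind can be sketched as follows. The example assumes the non-stationary model adds exactly one parameter (mu1), so the deviance is compared against a chi-squared distribution with one degree of freedom; the log-likelihood values are invented for illustration, and the function is not AquaScope's implementation.

```python
import math


def likelihood_ratio_test(loglik_stationary: float, loglik_nonstationary: float):
    """Return (deviance, p_value) for nested models differing by one parameter."""
    deviance = max(2.0 * (loglik_nonstationary - loglik_stationary), 0.0)
    # Survival function of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(deviance / 2.0))
    return deviance, p_value


# Hypothetical maximised log-likelihoods of the two fits.
deviance, p_value = likelihood_ratio_test(-250.0, -247.0)
```

A small p-value suggests that the trend term improves the fit by more than chance alone would explain.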

fit_weibull_min

fit_weibull_min(annual_minima: ndarray | Series, return_periods: list[float] | None = None) -> FloodFreqResult

Fit Weibull distribution to annual minima for low-flow frequency analysis.

Uses scipy.stats.weibull_min (MLE). Return periods relate to the probability of flows below a given level.

Parameters: annual_minima: Array of annual minimum values. return_periods: Return periods in years. Defaults to standard set.

Returns: A :class:FloodFreqResult with fitted parameters and return levels.

Raises: ValueError: If fewer than 5 values are provided.

grubbs_beck_test

grubbs_beck_test(annual_max: ndarray, *, alpha: float = 0.1) -> tuple[float, np.ndarray]

Multiple Grubbs-Beck (MGB) test for low-outlier detection.

Implements the iterative procedure described in Bulletin 17C Appendix 6 (Cohn et al., 2013). The test repeatedly identifies the smallest observation that is significantly low relative to the remaining sample.

Parameters: annual_max: 1-D array of annual maximum values (all positive). alpha: Significance level for the test.

Returns: A 2-tuple (threshold, mask) where threshold is the low-outlier cutoff, back-transformed from log10 space to the original units, and mask is a boolean array with True for observations identified as low outliers.

References: Cohn, T. A., England, J. F. Jr., Berenbrock, C. E., Mason, R. R., Stedinger, J. R., & Lamontagne, J. R. (2013). A generalized Grubbs-Beck test statistic for detecting multiple potentially influential low outliers in flood series. Water Resources Research, 49(8), 5047-5058.

Grubbs, F. E. & Beck, G. (1972). Extension of sample sizes and percentage points for significance tests of outlying observations. Technometrics, 14(4), 847-854.

leave_one_out_cv

leave_one_out_cv(discharge: Series, distribution: str = 'gev', return_periods: list[int] | None = None) -> dict

Leave-one-out cross-validation for flood frequency fits.

For each held-out year, the distribution is fitted on the remaining years and the held-out observation is compared against the fitted median (T = 2-year return level).

Parameters:

Name Type Description Default
discharge Series

Annual maximum discharge series (with a DatetimeIndex).

required
distribution str

Distribution key: "gev", "lp3", "gumbel", "gpd", "weibull".

'gev'
return_periods list[int] | None

Return periods used internally. Defaults to [2] (median).

None

Returns:

Type Description
dict

Keys: 'rmse', 'bias', 'mae', 'predictions', 'observations'.

lmoments_from_sample

lmoments_from_sample(data: ndarray) -> dict[str, float]

Compute L-moments (L1–L4) and L-moment ratios (t3, t4) from a sample.

L-moments are linear combinations of probability weighted moments (PWMs).

Parameters: data: 1-D array of observations.

Returns: Dictionary with keys L1, L2, L3, L4, t3, t4.

Raises: ValueError: If fewer than 4 observations are provided.
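
To make the link between PWMs and L-moments concrete, here is a minimal pure-Python sketch using the standard unbiased PWM estimators b0 to b3. It mirrors the documented return keys but is an illustration only, not the library's code; the name `sample_lmoments` is hypothetical.

```python
def sample_lmoments(data):
    """Sample L-moments L1-L4 and ratios t3, t4 from unbiased PWMs."""
    x = sorted(float(v) for v in data)
    n = len(x)
    if n < 4:
        raise ValueError("need at least 4 observations")

    # b[r] = (1/n) * sum_i [ i(i-1)...(i-r+1) / ((n-1)...(n-r)) ] * x_(i),
    # with i the 0-based rank of the sorted sample.
    b = [0.0, 0.0, 0.0, 0.0]
    for i, xi in enumerate(x):
        w = 1.0
        for r in range(4):
            b[r] += w * xi
            if r < 3:
                w *= (i - r) / (n - 1 - r)
    b0, b1, b2, b3 = (v / n for v in b)

    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return {"L1": l1, "L2": l2, "L3": l3, "L4": l4, "t3": l3 / l2, "t4": l4 / l2}


lm = sample_lmoments([1, 2, 3, 4, 5])  # symmetric sample, so t3 is 0
```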

probability_plot_correlation

probability_plot_correlation(data: ndarray, distribution: str, params: tuple) -> float

Probability Plot Correlation Coefficient (PPCC).

Computes the Pearson correlation between the sorted observations and the corresponding theoretical quantiles. Values close to 1 indicate a good fit.

Parameters: data: Observed sample. distribution: Distribution name. params: Distribution parameters.

Returns: PPCC value (between 0 and 1 for reasonable fits).
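
As an illustration of the PPCC idea, the sketch below pairs sorted observations with Gumbel quantiles and computes their Pearson correlation. It assumes Weibull plotting positions i / (n + 1) and a Gumbel distribution only; the plotting position and distribution handling used by AquaScope are not shown on this page, and `ppcc_gumbel` is a hypothetical name.

```python
import math


def gumbel_quantile(p: float, loc: float, scale: float) -> float:
    return loc - scale * math.log(-math.log(p))


def ppcc_gumbel(data, loc: float, scale: float) -> float:
    """Pearson correlation between sorted data and Gumbel quantiles."""
    x = sorted(data)
    n = len(x)
    # Assumed plotting position: i / (n + 1) for ranks i = 1..n.
    q = [gumbel_quantile((i + 1) / (n + 1), loc, scale) for i in range(n)]
    mx, mq = sum(x) / n, sum(q) / n
    cov = sum((a - mx) * (b - mq) for a, b in zip(x, q))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sq = math.sqrt(sum((b - mq) ** 2 for b in q))
    return cov / (sx * sq)


# Data lying exactly on the theoretical quantiles gives a PPCC of 1.
perfect = [gumbel_quantile((i + 1) / 11, 50.0, 10.0) for i in range(10)]
r_perfect = ppcc_gumbel(perfect, 50.0, 10.0)
r_other = ppcc_gumbel([3, 4, 4, 5, 9, 30, 31, 32, 33, 90], 50.0, 10.0)
```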

regional_frequency_analysis

regional_frequency_analysis(sites: dict[str, ndarray], return_periods: list[float] | None = None) -> RegionalResult

L-moment based regional frequency analysis (Hosking & Wallis method).

Steps:

1. Compute L-moments for each site.
2. Discordancy test (flag sites with unusual L-moments).
3. Heterogeneity measure (H < 1 → acceptably homogeneous).
4. Fit regional growth curve using weighted regional L-moments.
5. Combine with site-specific index flood for return levels.

Parameters: sites: Mapping of site identifier → annual maxima array. return_periods: Return periods in years. Defaults to standard set.

Returns: A :class:RegionalResult.

Raises: ValueError: If fewer than 2 sites are provided.

select_pot_threshold

select_pot_threshold(data: ndarray | Series, method: str = 'mean_residual') -> float

Select optimal threshold for Peaks-Over-Threshold analysis.

Parameters: data: Array of observations. method: Selection method — "mean_residual", "percentile" (95th percentile), or "sqrt_rule" (mean + 1.5 × std).

Returns: Optimal threshold value.

Raises: ValueError: If method is not recognised.

weighted_skew

weighted_skew(station_skew: float, regional_skew: float, n: int, regional_mse: float = 0.302) -> float

Compute weighted skew per Bulletin 17C §5.2.4.

Combines the station skew G_s with a generalised / regional skew G_r using inverse-variance weights:

.. math:: G_w = w_1 G_s + w_2 G_r

where w_1 = MSE_r / (MSE_s + MSE_r) and w_2 = 1 − w_1, with MSE_s and MSE_r the mean-square errors of the station and regional skew estimates.

Parameters:

station_skew: Station skew coefficient G_s.
regional_skew: Regional / generalised skew G_r.
n: Number of annual maximum observations.
regional_mse: Mean-square error of the regional skew estimate. Default 0.302 is the USGS nationwide value from Bulletin 17C.

Returns: Weighted skew coefficient G_w.

References: England, J. F. Jr. et al. (2018). Guidelines for determining flood flow frequency — Bulletin 17C. USGS TM 4-B5. https://doi.org/10.3133/tm4B5
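
The weighting itself is a short calculation. In the sketch below the station-skew mean-square error is passed in directly as `station_mse`; this is a simplification for illustration, since the documented function takes the record length n and how it derives the station MSE from n is not shown here. The function name is hypothetical.

```python
def weighted_skew_sketch(station_skew: float, regional_skew: float,
                         station_mse: float, regional_mse: float = 0.302) -> float:
    """Inverse-variance weighted average of station and regional skew."""
    w_station = regional_mse / (station_mse + regional_mse)
    w_regional = 1.0 - w_station
    return w_station * station_skew + w_regional * regional_skew


# Equal uncertainty: the result is the plain average of the two skews.
g_equal = weighted_skew_sketch(0.6, 0.0, station_mse=0.302)

# A very uncertain station skew is pulled strongly towards the regional value.
g_short_record = weighted_skew_sketch(0.6, 0.0, station_mse=3.02)
```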

flow_duration_curve

flow_duration_curve(discharge: Series, *, percentiles: list[float] | None = None) -> FDCResult

Compute a flow duration curve.

Parameters:

Name Type Description Default
discharge Series

Time-series of discharge values (any DatetimeIndex).

required
percentiles list[float] | None

Exceedance percentiles to extract. Defaults to [5, 10, 25, 50, 75, 90, 95, 99].

None

Returns:

Type Description
FDCResult

An FDCResult containing sorted discharges and extracted percentile values.
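
A flow duration curve ranks flows from highest to lowest and assigns each an exceedance percentage. The sketch below assumes the Weibull plotting position rank / (n + 1); the exact formula used by `flow_duration_curve` is not stated on this page, and `flow_duration_points` is a hypothetical helper.

```python
def flow_duration_points(values):
    """Return (exceedance_percent, flow) pairs, highest flow first."""
    flows = sorted(values, reverse=True)
    n = len(flows)
    # Assumed plotting position: rank / (n + 1), with rank 1 for the largest flow.
    return [(100.0 * (rank + 1) / (n + 1), q) for rank, q in enumerate(flows)]


points = flow_duration_points([5.0, 1.0, 3.0])
```

With three observations the largest flow is exceeded 25 % of the time under this convention, the median flow 50 %, and the smallest 75 %.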

low_flow_stat

low_flow_stat(discharge: Series, *, n_day: int = 7, return_period: int = 10) -> float

Compute nQm low-flow statistic (e.g. 7Q10).

The nQm is the minimum n-day rolling average that occurs with a return period of m years, estimated using the Weibull plotting position.

Parameters:

Name Type Description Default
discharge Series

Daily discharge series with a DatetimeIndex.

required
n_day int

Rolling window size in days.

7
return_period int

Return period in years.

10

Returns:

Type Description
float

The estimated nQm value in the same units as the input discharge.

Raises:

Type Description
ValueError

If there are fewer than 3 complete water years.

cross_validate_rating

cross_validate_rating(stage: ndarray, discharge: ndarray, k_folds: int = 5) -> dict

K-fold cross-validation of a rating curve fit.

Parameters: stage: Stage measurements. discharge: Discharge measurements. k_folds: Number of folds (default 5).

Returns: Dict with 'mean_rmse', 'std_rmse', 'mean_r2', and 'fold_results' (list of per-fold dicts).

Raises: ValueError: If inputs are invalid.

detect_rating_shift

detect_rating_shift(stage: ndarray, discharge: ndarray, timestamps: ndarray | DatetimeIndex, window_size: int = 20) -> list[dict]

Detect temporal shifts in the stage-discharge relationship.

Fits rolling-window rating curves and compares the residual variance of successive windows using a chi-squared test. Significant changes are flagged as rating shifts.

Parameters: stage: Stage measurements. discharge: Corresponding discharge measurements. timestamps: Observation timestamps. window_size: Number of observations per rolling window.

Returns: List of dicts, each containing 'timestamp', 'shift_magnitude', and 'p_value'.

Raises: ValueError: If inputs are invalid.

export_hec_ras

export_hec_ras(result: RatingCurveResult, filepath: str | Path) -> None

Export a rating curve to HEC-RAS compatible format.

Writes a simple table of stage-discharge pairs at regular intervals spanning the fitted stage range.

Parameters: result: Fitted :class:RatingCurveResult. filepath: Output file path.

fit_rating_curve

fit_rating_curve(stage: ndarray, discharge: ndarray, h0: float | None = None) -> RatingCurveResult

Fit a power-law rating curve Q = a * (H - H₀)^b.

If h0 is None, it is optimised together with a and b using :func:scipy.optimize.curve_fit.

Parameters: stage: Water level (stage) measurements. discharge: Corresponding discharge measurements. h0: Optional fixed stage offset. If None, estimated from data.

Returns: :class:RatingCurveResult with fitted parameters and diagnostics.

Raises: ValueError: If fewer than 5 stage-discharge pairs are provided, discharge contains negative values, or arrays contain NaN.
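
When H₀ is fixed, the power law becomes linear in log space: ln Q = ln a + b·ln(H − H₀). The sketch below fits a and b by ordinary least squares on that form. It is an alternative shown for illustration, not the `curve_fit`-based procedure described above, and it requires every stage to exceed H₀ and every discharge to be positive.

```python
import math


def fit_power_law_fixed_h0(stage, discharge, h0: float):
    """Least-squares fit of ln Q = ln a + b * ln(H - h0); returns (a, b)."""
    xs = [math.log(h - h0) for h in stage]
    ys = [math.log(q) for q in discharge]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - b * mx)
    return a, b


# Synthetic gaugings generated from Q = 2.0 * (H - 0.5) ** 1.5.
stage = [1.0, 1.5, 2.0, 3.0, 4.0]
discharge = [2.0 * (h - 0.5) ** 1.5 for h in stage]
a_hat, b_hat = fit_power_law_fixed_h0(stage, discharge, h0=0.5)
```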

fit_segmented_rating_curve

fit_segmented_rating_curve(stage: ndarray, discharge: ndarray, n_segments: int = 2, breakpoints: list[float] | None = None) -> RatingCurveResult

Fit a multi-segment rating curve with breakpoints.

Each segment is fitted independently as a power-law. If breakpoints are not provided, optimal breakpoints are found by minimising total RMSE over a grid of candidate values.

Parameters: stage: Water level (stage) measurements. discharge: Corresponding discharge measurements. n_segments: Number of segments (default 2). breakpoints: Explicit breakpoint stage values. Length must equal n_segments - 1. If None, breakpoints are optimised.

Returns: :class:RatingCurveResult with per-segment parameters in segments.

Raises: ValueError: If inputs are invalid or segments cannot be fitted.

predict_discharge

predict_discharge(result: RatingCurveResult, stage: ndarray | Series) -> np.ndarray

Predict discharge from stage using a fitted rating curve.

For segmented curves the appropriate segment is selected for each stage value.

Parameters: result: Fitted :class:RatingCurveResult. stage: Stage values to predict at.

Returns: Predicted discharge array.

predict_stage

predict_stage(result: RatingCurveResult, discharge: ndarray | Series) -> np.ndarray

Inverse prediction: compute stage from discharge.

Solves H = (Q / a)^(1/b) + H₀ for the primary (or first) segment. For segmented curves the first segment whose discharge range contains the query value is used.

Parameters: result: Fitted :class:RatingCurveResult. discharge: Discharge values.

Returns: Predicted stage array.

rating_curve_uncertainty

rating_curve_uncertainty(result: RatingCurveResult, stage: ndarray, confidence: float = 0.95) -> tuple[np.ndarray, np.ndarray]

Compute prediction intervals for discharge estimates.

Uses a residual-based approach: the standard error of the residuals is scaled by the appropriate t-distribution quantile.

Parameters: result: Fitted :class:RatingCurveResult. stage: Stage values at which to compute intervals. confidence: Confidence level (default 0.95).

Returns: Tuple of (lower_bound, upper_bound) discharge arrays.

fit_master_recession

fit_master_recession(segments: list[RecessionSegment]) -> RecessionResult

Fit a master recession curve to the identified segments.

Uses least-squares fitting of ln(Q/Q₀) vs time to estimate the recession constant k in the exponential model Q(t) = Q₀·e^(−t/k).

Parameters:

Name Type Description Default
segments list[RecessionSegment]

Recession segments from :func:identify_recessions.

required

Returns:

Type Description
RecessionResult

A RecessionResult with the fitted recession constant and goodness-of-fit metrics.

Raises:

Type Description
ValueError

If no segments are provided.
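
For the model Q(t) = Q₀·e^(−t/k), ln(Q/Q₀) is a straight line through the origin with slope −1/k. The sketch below pools all segments and fits that slope by least squares. Pooling and forcing the line through the origin are assumptions made for this illustration, and segments are represented as plain lists of daily discharges rather than RecessionSegment objects.

```python
import math


def fit_recession_constant(segments) -> float:
    """Estimate k (days) from ln(Q/Q0) = -t / k, pooled over all segments."""
    sum_ty = 0.0
    sum_tt = 0.0
    for seg in segments:
        q0 = seg[0]
        for t, q in enumerate(seg):  # t is days since the segment start
            sum_ty += t * math.log(q / q0)
            sum_tt += t * t
    slope = sum_ty / sum_tt  # least-squares slope through the origin
    return -1.0 / slope


# Two synthetic recessions that both decay with k = 20 days.
segments = [
    [10.0 * math.exp(-t / 20.0) for t in range(8)],
    [4.0 * math.exp(-t / 20.0) for t in range(6)],
]
k_hat = fit_recession_constant(segments)
```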

identify_recessions

identify_recessions(discharge: Series, *, min_length: int = 5, min_decline_pct: float = 0.05) -> list[RecessionSegment]

Identify recession segments in a daily discharge series.

A recession is a continuous period where each day's discharge is less than the previous day's. Very short segments or those with negligible total decline are excluded.

Parameters:

Name Type Description Default
discharge Series

Daily discharge series with a DatetimeIndex.

required
min_length int

Minimum segment length in days.

5
min_decline_pct float

Minimum total decline as a fraction of the starting value.

0.05

Returns:

Type Description
list[RecessionSegment]

List of RecessionSegment instances.
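
The selection rule can be sketched on a plain list of daily values. The example returns (start, end) index pairs instead of RecessionSegment objects and counts the starting day as part of the segment length; both choices are assumptions made for illustration.

```python
def find_recessions(q, min_length: int = 5, min_decline_pct: float = 0.05):
    """Return inclusive (start, end) index pairs of strictly falling runs."""
    segments = []
    start = 0
    for i in range(1, len(q) + 1):
        falling = i < len(q) and q[i] < q[i - 1]
        if falling:
            continue
        end = i - 1  # last day of the current run
        length = end - start + 1
        if length >= min_length and q[start] > 0:
            decline = (q[start] - q[end]) / q[start]
            if decline >= min_decline_pct:
                segments.append((start, end))
        start = i  # the next run begins where the decline stopped
    return segments


flows = [10, 9, 8, 7, 6, 5, 6, 5.9, 5.8, 7]
found = find_recessions(flows)
```

Here the first six days form one recession, while the later three-day decline is discarded as too short.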

recession_analysis

recession_analysis(discharge: Series, *, min_length: int = 5, min_decline_pct: float = 0.05) -> RecessionResult

Run full recession analysis: identify segments + fit MRC.

Convenience function combining :func:identify_recessions and :func:fit_master_recession.

Parameters:

Name Type Description Default
discharge Series

Daily discharge series with a DatetimeIndex.

required
min_length int

Minimum recession segment length in days.

5
min_decline_pct float

Minimum total decline fraction.

0.05

Returns:

Type Description
RecessionResult

A RecessionResult.

baseflow_index_simple

baseflow_index_simple(discharge: Series) -> float

Quick BFI using the Lyne–Hollick 1-pass digital filter.

Uses alpha=0.925 and a single forward pass for speed. For a more robust estimate use :func:aquascope.hydrology.baseflow.lyne_hollick directly with multiple passes.

Parameters: discharge: Daily discharge time series with DatetimeIndex.

Returns: Baseflow index (0–1).

compare_signatures

compare_signatures(sig1: SignatureReport, sig2: SignatureReport) -> dict[str, float]

Compare two signature reports field-by-field.

Parameters: sig1: First :class:SignatureReport. sig2: Second :class:SignatureReport.

Returns: Dictionary of {field_name: absolute_percent_difference} for every numeric field. Fields that are None in either report are skipped.

compute_signatures

compute_signatures(discharge: Series, precipitation: Series | None = None, area_km2: float | None = None) -> SignatureReport

Compute comprehensive hydrological signatures from daily streamflow.

Parameters:

discharge: Daily discharge time series (pd.Series with DatetimeIndex). Must contain at least 365 non-NaN values.
precipitation: Optional daily precipitation series aligned with discharge. When provided, runoff ratio and elasticity are calculated.
area_km2: Optional catchment area in km². Reserved for future unit conversion; no signature currently requires it.

Returns: :class:SignatureReport with all computed signatures.

Raises: ValueError: If discharge has fewer than 365 non-NaN values.

flashiness_index

flashiness_index(discharge: Series) -> float

Richards-Baker Flashiness Index.

.. math:: FI = \frac{\sum |Q_i - Q_{i-1}|}{\sum Q_i}

Higher values indicate a more flashy/responsive catchment.

Parameters: discharge: Daily discharge time series with DatetimeIndex.

Returns: Flashiness index (dimensionless, ≥ 0).
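
The index is the total day-to-day change in flow divided by the total flow. A minimal sketch on a plain list follows; summing all values in the denominator is an assumption about the convention, and the helper name is hypothetical.

```python
def richards_baker_index(q) -> float:
    """Sum of absolute day-to-day changes divided by the sum of flows."""
    path_length = sum(abs(b - a) for a, b in zip(q, q[1:]))
    return path_length / sum(q)


fi_flashy = richards_baker_index([1.0, 2.0, 1.0, 2.0])  # 3 / 6
fi_steady = richards_baker_index([2.0, 2.0, 2.0, 2.0])  # no change at all
```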

flow_elasticity

flow_elasticity(discharge: Series, precipitation: Series) -> float

Sankarasubramanian precipitation-streamflow elasticity.

Computed year-by-year as:

.. math:: E = \text{median}\left(\frac{dQ / \bar{Q}}{dP / \bar{P}}\right)

where dQ and dP are annual departures from the long-term mean. An elasticity > 1 means streamflow is proportionally more variable than precipitation.

Parameters: discharge: Daily discharge time series with DatetimeIndex. precipitation: Daily precipitation with the same DatetimeIndex.

Returns: Elasticity coefficient (dimensionless).

Raises: ValueError: If fewer than 2 complete years are available.

recession_constant

recession_constant(discharge: Series, min_length: int = 5) -> float

Median recession constant from all recession segments.

A recession segment is a run of consecutive days where discharge decreases. For each segment of at least min_length days the exponential decay rate k is estimated by linear regression of ln(Q) against time. The median k across all segments is returned.

Parameters: discharge: Daily discharge time series with DatetimeIndex. min_length: Minimum number of consecutive falling days to qualify as a recession segment.

Returns: Median recession constant k (day⁻¹, positive).

seasonality_index

seasonality_index(discharge: Series) -> tuple[float, int]

Markham's seasonality index and concentration month.

Monthly mean flows are treated as vectors with direction equal to the month's angular position on a unit circle. The resultant's magnitude (normalised) gives the seasonality index and its direction gives the peak month.

Parameters: discharge: Daily discharge time series with DatetimeIndex.

Returns: Tuple of (index, peak_month) where index ranges from 0 (uniform flow throughout the year) to 1 (all flow concentrated in a single month), and peak_month is 1–12.
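
The vector construction can be sketched from twelve monthly mean flows. The example places each month at the middle of its 30-degree sector, which is an assumption; the angular convention used by `seasonality_index` is not stated here.

```python
import math


def markham_seasonality(monthly_means):
    """Return (index, peak_month) from 12 monthly mean flows (Jan..Dec)."""
    total = sum(monthly_means)
    c = s = 0.0
    for month, flow in enumerate(monthly_means, start=1):
        angle = 2.0 * math.pi * (month - 0.5) / 12.0  # assumed mid-month angle
        c += flow * math.cos(angle)
        s += flow * math.sin(angle)
    index = math.hypot(c, s) / total
    direction = math.atan2(s, c) % (2.0 * math.pi)
    peak_month = min(12, int(direction / (2.0 * math.pi / 12.0)) + 1)
    return index, peak_month


# All flow in July: fully seasonal, peaking in month 7.
july_only = [0.0] * 12
july_only[6] = 10.0
idx_july, peak_july = markham_seasonality(july_only)

# Identical flow every month: the monthly vectors cancel out.
idx_uniform, _ = markham_seasonality([3.0] * 12)
```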

similarity_score

similarity_score(sig1: SignatureReport, sig2: SignatureReport, weights: dict[str, float] | None = None) -> float

Compute overall similarity between two catchments.

The score is a weighted Euclidean distance in normalised signature space. A score of 0 means the reports are identical; higher values indicate greater dissimilarity.

Parameters: sig1: First :class:SignatureReport. sig2: Second :class:SignatureReport. weights: Optional mapping of field name → weight. Fields not in the dict receive a weight of 1.0. Defaults emphasise BFI, flashiness, seasonality, and runoff ratio.

Returns: Weighted Euclidean distance (≥ 0).


Agricultural water management

FAO-56 Penman–Monteith, crop water requirements, soil water balance.

aquascope.agri

Agricultural water management module.

Implements FAO-56 Penman-Monteith reference evapotranspiration, crop water requirements, and soil water balance modeling.

AgricultureBenchmarkResult dataclass

AgricultureBenchmarkResult(metric_id: str, metric_name: str, output_unit: str, summary: str, table: DataFrame)

Structured result from an AQUASTAT benchmarking workflow.

to_dict

to_dict() -> dict[str, object]

Return a JSON-serializable representation.

IrrigationPlan dataclass

IrrigationPlan(crop: str, planting_date: date, season_end_date: date, efficiency: float, total_eto_mm: float, total_precipitation_mm: float, total_effective_rain_mm: float, total_etc_mm: float, total_net_irrigation_mm: float, total_gross_irrigation_mm: float, total_applied_irrigation_mm: float, irrigation_trigger_days: int, schedule: DataFrame, balance: DataFrame)

Structured result from a crop irrigation planning workflow.

to_dict

to_dict() -> dict[str, object]

Convert the plan to a JSON-serializable dictionary.

WaPORProductivityResult dataclass

WaPORProductivityResult(metric_id: str, metric_name: str, output_unit: str, aggregate_value: float, summary: str, table: DataFrame, aquastat_context: list[AgricultureBenchmarkResult] | None = None)

Structured result from a WaPOR productivity workflow.

to_dict

to_dict() -> dict[str, object]

Return a JSON-serializable representation.

SoilWaterBalance

SoilWaterBalance(soil: SoilProperties, depletion_fraction: float = 0.5, initial_depletion: float = 0.0)

Daily soil water balance tracker.

Parameters:

Name Type Description Default
soil SoilProperties

Soil hydraulic properties.

required
depletion_fraction float

Fraction of TAW that can be depleted before stress (p, default 0.5).

0.5
initial_depletion float

Starting depletion in mm (default 0.0 = field capacity).

0.0

References

Allen et al. (1998), FAO-56 Ch. 8. ISBN 92-5-104219-5.

step

step(etc: float, precipitation: float = 0.0, irrigation: float = 0.0, runoff: float = 0.0) -> SoilWaterStatus

Advance the water balance by one day.

Parameters:

Name Type Description Default
etc float

Crop evapotranspiration (mm/day).

required
precipitation float

Daily precipitation (mm).

0.0
irrigation float

Applied irrigation (mm).

0.0
runoff float

Surface runoff (mm).

0.0

Returns:

Type Description
SoilWaterStatus

Updated soil water status.
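
A daily root-zone depletion update in the spirit of the FAO-56 water balance can be sketched as below. The function works on plain numbers, takes total available water `taw` directly, and ignores capillary rise and stress-reduced ET; these simplifications, the names, and the return values are assumptions for illustration and do not describe the actual `step` implementation or the SoilWaterStatus fields.

```python
def step_depletion(depletion: float, etc: float, precipitation: float = 0.0,
                   irrigation: float = 0.0, runoff: float = 0.0,
                   taw: float = 120.0):
    """Advance depletion (mm) by one day; returns (depletion, deep_percolation)."""
    new = depletion - (precipitation - runoff) - irrigation + etc
    # Water in excess of field capacity drains as deep percolation.
    deep_percolation = max(0.0, -new)
    new = min(max(new, 0.0), taw)
    return new, deep_percolation


taw = 120.0
raw = 0.5 * taw  # readily available water for depletion_fraction p = 0.5

dry_day = step_depletion(0.0, etc=5.0, taw=taw)
wet_day = step_depletion(10.0, etc=2.0, precipitation=20.0, taw=taw)
needs_irrigation = step_depletion(58.0, etc=6.0, taw=taw)[0] > raw
```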

run

run(etc_series: Series, precip_series: Series, irrigation_series: Series | None = None) -> pd.DataFrame

Run the water balance over a time series.

Parameters:

Name Type Description Default
etc_series Series

Daily ETc (mm/day) with DatetimeIndex.

required
precip_series Series

Daily precipitation (mm) with DatetimeIndex.

required
irrigation_series Series | None

Daily irrigation (mm). If None, no irrigation is applied.

None

Returns:

Type Description
DataFrame

Daily soil water status records.

auto_irrigate

auto_irrigate(etc_series: Series, precip_series: Series, efficiency: float = 0.7) -> pd.DataFrame

Run balance with automatic irrigation when depletion exceeds RAW.

Parameters:

Name Type Description Default
etc_series Series

Daily ETc (mm/day).

required
precip_series Series

Daily precipitation (mm).

required
efficiency float

Irrigation system efficiency (0–1).

0.7

Returns:

Type Description
DataFrame

Daily water balance with irrigation_mm column.

benchmark_aquastat

benchmark_aquastat(df: DataFrame, metric_id: str, *, year: int | None = None, countries: list[str] | None = None, latest_only: bool = True, top_n: int | None = None) -> AgricultureBenchmarkResult

Compute a country-scale benchmark from AQUASTAT data.

list_benchmark_metrics

list_benchmark_metrics() -> list[str]

Return supported benchmark metric IDs.

crop_water_requirement

crop_water_requirement(eto_series: Series, crop: str, planting_date: date, stage_lengths: dict[str, int] | None = None) -> pd.DataFrame

Compute daily crop water requirement over the growing season.

Parameters:

Name Type Description Default
eto_series Series

Daily reference ET (mm/day) with a DatetimeIndex.

required
crop str

Crop name (key in KC_TABLE).

required
planting_date date

Planting or sowing date.

required
stage_lengths dict[str, int] | None

Days per stage. Defaults to DEFAULT_STAGE_LENGTHS[crop].

None

Returns:

Type Description
DataFrame

Columns: date, stage, kc, eto, etc.

References

Allen et al. (1998), FAO-56 Ch. 6. ISBN 92-5-104219-5.

get_kc

get_kc(crop: str, stage: str | None = None) -> float | dict[str, float]

Get crop coefficient(s) for a crop from FAO-56 Table 12.

Parameters:

Name Type Description Default
crop str

Crop name (must be a key in KC_TABLE).

required
stage str | None

Growth stage: "initial", "mid", or "late". If None, returns all coefficients as a dict.

None

Returns:

Type Description
float | dict[str, float]

Kc value for the given stage, or a dict of all stages.

Raises:

Type Description
ValueError

If crop or stage is unknown.

References

Allen et al. (1998), Table 12. ISBN 92-5-104219-5.

irrigation_schedule

irrigation_schedule(eto_series: Series, precip_series: Series, crop: str, planting_date: date, efficiency: float = 0.7, stage_lengths: dict[str, int] | None = None) -> pd.DataFrame

Full irrigation scheduling over the growing season.

Parameters:

Name Type Description Default
eto_series Series

Daily reference ET (mm/day) with a DatetimeIndex.

required
precip_series Series

Daily precipitation (mm) with a DatetimeIndex.

required
crop str

Crop name.

required
planting_date date

Planting date.

required
efficiency float

Irrigation system efficiency (0–1).

0.7
stage_lengths dict[str, int] | None

Days per stage.

None

Returns:

Type Description
DataFrame

Columns: date, stage, kc, eto, etc, effective_rain, net_irrigation, gross_irrigation.

References

Allen et al. (1998), FAO-56 Ch. 7. ISBN 92-5-104219-5.

hargreaves

hargreaves(t_min: float, t_max: float, ra: float) -> float

Hargreaves reference ET₀ estimate (temperature-based).

A simpler alternative when only temperature data is available:

ET₀ = 0.0023 × (T_mean + 17.8) × (T_max − T_min)^0.5 × Ra

Parameters:

Name Type Description Default
t_min float

Minimum daily temperature (°C).

required
t_max float

Maximum daily temperature (°C).

required
ra float

Extraterrestrial radiation (MJ/m²/day). Use extraterrestrial_radiation to compute this.

required

Returns:

Type Description
float

Reference evapotranspiration ET₀ in mm/day.

References

Hargreaves, G. H. & Samani, Z. A. (1985). Reference crop evapotranspiration from temperature. Applied Engineering in Agriculture, 1(2), 96–99.
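
The formula can be sketched in a few lines. FAO-56 states the Hargreaves equation with Ra expressed as equivalent evaporation in mm/day, so the example converts from MJ/m²/day with the factor 0.408; whether `hargreaves` applies this conversion internally is not stated on this page, so treat that step as an assumption. The function name is hypothetical.

```python
import math


def hargreaves_et0(t_min: float, t_max: float, ra_mj: float) -> float:
    """Hargreaves ET0 in mm/day from temperatures (deg C) and Ra (MJ/m2/day)."""
    t_mean = (t_min + t_max) / 2.0
    ra_mm = 0.408 * ra_mj  # assumed conversion to mm/day of evaporated water
    return 0.0023 * (t_mean + 17.8) * math.sqrt(t_max - t_min) * ra_mm


et0 = hargreaves_et0(t_min=10.0, t_max=30.0, ra_mj=40.0)
```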

penman_monteith_daily

penman_monteith_daily(t_min: float, t_max: float, rh_min: float, rh_max: float, u2: float, rs: float, latitude: float, elevation: float, doy: int) -> float

FAO-56 Penman-Monteith daily reference evapotranspiration.

Parameters:

Name Type Description Default
t_min float

Minimum daily temperature (°C).

required
t_max float

Maximum daily temperature (°C).

required
rh_min float

Minimum relative humidity (%).

required
rh_max float

Maximum relative humidity (%).

required
u2 float

Wind speed at 2 m height (m/s).

required
rs float

Incoming solar radiation (MJ/m²/day).

required
latitude float

Latitude in decimal degrees.

required
elevation float

Station elevation above sea level (m).

required
doy int

Day of the year (1–366).

required

Returns:

Type Description
float

Reference evapotranspiration ET₀ in mm/day.

References

Allen et al. (1998), Eq. 6. ISBN 92-5-104219-5.

default_season_end_date

default_season_end_date(crop: str, planting_date: date, stage_lengths: dict[str, int] | None = None) -> date

Return the default end date for a crop season.

fetch_openmeteo_plan_inputs

fetch_openmeteo_plan_inputs(latitude: float, longitude: float, start_date: str, end_date: str) -> tuple[pd.Series, pd.Series]

Fetch daily ET0 and precipitation from Open-Meteo for planning.

plan_irrigation

plan_irrigation(crop: str, planting_date: date, eto_series: Series, precip_series: Series, soil: SoilProperties, *, efficiency: float = 0.7, depletion_fraction: float = 0.5, initial_depletion: float = 0.0, stage_lengths: dict[str, int] | None = None) -> IrrigationPlan

Build a full irrigation plan from daily ET and precipitation series.

estimate_wapor_productivity

estimate_wapor_productivity(*, metric_id: str, aeti_df: DataFrame | None = None, npp_df: DataFrame | None = None, ret_df: DataFrame | None = None, aquastat_df: DataFrame | None = None, aquastat_metrics: list[str] | None = None, aquastat_year: int | None = None, aquastat_countries: list[str] | None = None, aquastat_top_n: int | None = 10) -> WaPORProductivityResult

Compute a WaPOR productivity or ET performance metric.

list_productivity_metrics

list_productivity_metrics() -> list[str]

Return supported productivity metric IDs.


Visualization

aquascope.viz

AquaScope visualisation module.

Provides publication-quality plots for water-quality analysis, hydrology, forecasting, and spatial data. All functions lazily import matplotlib so the module can be imported even when the viz optional dependency group is not installed — an ImportError is raised only when a plot function is actually called.

Quick start:

from aquascope.viz import plot_timeseries, plot_forecast, plot_fdc

plot_timeseries(df, title="Daily Discharge")
plot_forecast(observed=train, forecast=pred, save_path="forecast.png")
plot_fdc(discharge_series, save_path="fdc.svg")

diagnostic_panel

diagnostic_panel(observed, distribution: str, params: tuple, result: FloodFreqResult | None = None, *, save_path: str | None = None) -> Figure

4-panel diagnostic: Q-Q, P-P, return level, density comparison.

Parameters:

Name Type Description Default
observed array-like

Observed data (e.g., annual maximum discharge).

required
distribution str

Distribution name.

required
params tuple

Distribution parameters from scipy.stats fit.

required
result FloodFreqResult

If provided, the return level panel is drawn using this result. Otherwise, the return level panel is replaced with a histogram.

None
save_path str

If given, save the composite figure to this path.

None

Returns:

Type Description
Figure

The matplotlib Figure with four subplots.

pp_plot

pp_plot(observed, distribution: str, params: tuple, *, ax: Axes | None = None, save_path: str | None = None, title: str | None = None) -> Figure

Probability-Probability plot.

Compares empirical CDFs with the theoretical CDF of the fitted distribution.

Parameters:

Name Type Description Default
observed array-like

Observed data (e.g., annual maximum discharge).

required
distribution str

Distribution name: "gev", "lp3", "gumbel", "weibull", "gpd".

required
params tuple

Distribution parameters from scipy.stats fit.

required
ax Axes

Matplotlib axes to draw on.

None
save_path str

If given, save the figure to this path.

None
title str

Plot title.

None

Returns:

Type Description
Figure

The matplotlib Figure containing the P-P plot.

qq_plot

qq_plot(observed, distribution: str, params: tuple, *, ax: Axes | None = None, save_path: str | None = None, title: str | None = None) -> Figure

Quantile-Quantile plot comparing observed data against a fitted distribution.

Parameters:

Name Type Description Default
observed array-like

Observed data (e.g., annual maximum discharge).

required
distribution str

Distribution name: "gev", "lp3", "gumbel", "weibull", "gpd".

required
params tuple

Distribution parameters (shape, loc, scale) from scipy.stats fit.

required
ax Axes

Matplotlib axes to draw on. A new figure is created when None.

None
save_path str

If given, save the figure to this path.

None
title str

Plot title. Defaults to "Q-Q Plot (<dist>)".

None

Returns:

Type Description
Figure

The matplotlib Figure containing the Q-Q plot.
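A Q-Q plot pairs each sorted observation with the quantile the fitted distribution predicts at the same probability. The sketch below shows that pairing in plain Python, assuming a two-parameter Gumbel distribution and a Weibull plotting position; `gumbel_ppf` and `qq_points` are hypothetical helper names, not part of AquaScope.

```python
import math


def gumbel_ppf(p, loc, scale):
    """Inverse CDF (quantile function) of a Gumbel distribution."""
    return loc - scale * math.log(-math.log(p))


def qq_points(observed, loc, scale):
    """Return (theoretical quantile, observed value) pairs."""
    data = sorted(observed)
    n = len(data)
    # Plotting position i / (n + 1) is an assumption of this sketch.
    probs = [i / (n + 1) for i in range(1, n + 1)]
    return [(gumbel_ppf(p, loc, scale), x) for p, x in zip(probs, data)]


# Made-up annual maxima and made-up Gumbel parameters.
annual_max = [120.0, 95.0, 210.0, 150.0, 180.0]
qq = qq_points(annual_max, loc=130.0, scale=40.0)
```

Systematic departures from the 1:1 line at the upper end suggest the fit misrepresents the largest floods.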

return_level_plot

return_level_plot(result: FloodFreqResult, *, ci: bool = True, ax: Axes | None = None, save_path: str | None = None) -> Figure

Return level plot with confidence intervals.

Shows return period (x-axis, log-scale) versus discharge (y-axis) with optional confidence-interval bands.

Parameters:

Name Type Description Default
result FloodFreqResult

Result from a flood frequency fit (e.g., fit_gev).

required
ci bool

Whether to draw the confidence-interval band (default True).

True
ax Axes

Matplotlib axes to draw on.

None
save_path str

If given, save the figure to this path.

None

Returns:

Type Description
Figure

The matplotlib Figure containing the return level plot.
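The relationship drawn on a return level plot can be sketched as follows: a return period of T years corresponds to an annual non-exceedance probability of 1 - 1/T, and the return level is the fitted quantile at that probability. The example assumes a Gumbel fit with made-up parameters; `gumbel_return_level` is a hypothetical helper, not an AquaScope function.

```python
import math


def gumbel_return_level(T, loc, scale):
    """Discharge exceeded on average once every T years under a Gumbel fit."""
    p = 1.0 - 1.0 / T  # annual non-exceedance probability
    return loc - scale * math.log(-math.log(p))


# Return periods in years; the parameters below are made up for illustration.
return_periods = [2, 5, 10, 25, 50, 100]
levels = {T: gumbel_return_level(T, loc=130.0, scale=40.0) for T in return_periods}
```

Because the return level grows roughly with the logarithm of T for a Gumbel fit, a log-scaled x-axis makes the curve close to a straight line.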

plot_fdc

plot_fdc(discharge: Series, *, title: str = 'Flow Duration Curve', ylabel: str = 'Discharge (m³/s)', log_scale: bool = True, percentiles: list[float] | None = None, figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure

Plot a flow duration curve.

Parameters:

Name Type Description Default
discharge Series

Series of discharge values.

required
title str

Plot title.

'Flow Duration Curve'
ylabel str

Y-axis label.

'Discharge (m³/s)'
log_scale bool

If True, use a log scale on the y-axis.

True
percentiles list[float] | None

Exceedance percentiles to annotate (e.g. [5, 50, 95]).

None
figsize tuple[float, float]

Figure size.

DEFAULT_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.
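A flow duration curve plots each discharge against the percentage of time it is equalled or exceeded. The sketch below computes those pairs in plain Python; the Weibull plotting position and the helper name `flow_duration_curve` are assumptions of this example, not AquaScope's implementation.

```python
def flow_duration_curve(discharge):
    """Return (exceedance percent, discharge) pairs, largest flow first."""
    flows = sorted(discharge, reverse=True)
    n = len(flows)
    # Weibull plotting position: rank / (n + 1), expressed as a percentage.
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(flows, start=1)]


# Made-up discharges in m³/s.
fdc = flow_duration_curve([3.0, 12.0, 7.5, 1.2])
```

Low exceedance percentages correspond to high flows, which is why annotating values such as 5, 50, and 95 summarises flood, median, and low-flow conditions.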

plot_hydrograph

plot_hydrograph(discharge: DataFrame, *, total_col: str = 'discharge', baseflow_col: str | None = 'baseflow', precip_col: str | None = None, title: str = 'Hydrograph', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure

Plot a hydrograph with optional baseflow and precipitation overlay.

Parameters:

Name Type Description Default
discharge DataFrame

DataFrame with DatetimeIndex and at least a total discharge column.

required
total_col str

Column name for total discharge.

'discharge'
baseflow_col str | None

Column name for baseflow (shaded underneath). None to skip.

'baseflow'
precip_col str | None

Column for inverted precipitation bars on a secondary y-axis.

None
title str

Plot title.

'Hydrograph'
figsize tuple[float, float]

Figure size.

WIDE_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.

plot_return_periods

plot_return_periods(return_periods: dict[int, float], *, observed_max: float | None = None, title: str = 'Flood Return Periods', ylabel: str = 'Discharge (m³/s)', figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure

Plot return period estimates with optional observed maximum.

Parameters:

Name Type Description Default
return_periods dict[int, float]

Mapping of return period (years) to estimated discharge.

required
observed_max float | None

If given, draw a horizontal line at the observed maximum.

None
title str

Plot title.

'Flood Return Periods'
ylabel str

Y-axis label.

'Discharge (m³/s)'
figsize tuple[float, float]

Figure size.

DEFAULT_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.

plot_spi_timeline

plot_spi_timeline(spi_df: DataFrame, *, spi_col: str = 'spi_3', title: str = 'SPI Drought Timeline', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure

Plot SPI values as a bar chart coloured by drought severity.

Parameters:

Name Type Description Default
spi_df DataFrame

DataFrame with DatetimeIndex and SPI column(s).

required
spi_col str

Which SPI column to plot.

'spi_3'
title str

Plot title.

'SPI Drought Timeline'
figsize tuple[float, float]

Figure size.

WIDE_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.
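Colouring bars by drought severity requires mapping each SPI value to a class. The sketch below uses the widely cited SPI thresholds of -1.0, -1.5, and -2.0; the class names and the helper `drought_class` are assumptions of this example, and the thresholds AquaScope applies may differ.

```python
def drought_class(spi):
    """Map an SPI value to a drought severity class (assumed thresholds)."""
    if spi <= -2.0:
        return "extreme"
    if spi <= -1.5:
        return "severe"
    if spi <= -1.0:
        return "moderate"
    return "none"


# Made-up 3-month SPI values.
spi_values = [0.4, -1.0, -1.6, -2.3]
classes = [drought_class(v) for v in spi_values]
```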

plot_boxplot

plot_boxplot(df: DataFrame, *, value_col: str = 'value', group_col: str = 'station_name', title: str = 'Distribution by Group', ylabel: str = 'Value', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure

Box plot of value_col grouped by group_col.

Parameters:

Name Type Description Default
df DataFrame

Long-format DataFrame with at least value_col and group_col.

required
value_col str

Column containing measurement values.

'value'
group_col str

Column to group by (station, parameter, etc.).

'station_name'
title str

Plot title.

'Distribution by Group'
ylabel str

Y-axis label.

'Value'
figsize tuple[float, float]

Figure size.

WIDE_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.
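"Long format" means one row per measurement, with the group stored in its own column. The sketch below groups such rows and computes the quartiles a box plot summarises, using only the standard library; the sample rows and the helper `group_quartiles` are made up for illustration and do not reflect AquaScope internals.

```python
import statistics
from collections import defaultdict


def group_quartiles(rows, value_col="value", group_col="station_name"):
    """Return {group: [Q1, median, Q3]} for long-format rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    return {
        name: statistics.quantiles(values, n=4, method="inclusive")
        for name, values in groups.items()
    }


# Made-up long-format records: one row per measurement.
rows = [
    {"station_name": "A", "value": v} for v in [1.0, 2.0, 3.0, 4.0, 5.0]
] + [
    {"station_name": "B", "value": v} for v in [10.0, 20.0, 30.0]
]
quartiles = group_quartiles(rows)
```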

plot_eda_summary

plot_eda_summary(report, *, title: str = 'EDA Summary', figsize: tuple[float, float] = MULTI_FIGSIZE, save_path: str | None = None) -> Figure

Multi-panel summary of an EDAReport.

Panels:

1. Record count per parameter (bar)
2. Missing data (bar)
3. Outlier count (bar)
4. Value ranges (error bar: mean ± std)

Parameters:

Name Type Description Default
report

An EDAReport instance from aquascope.analysis.eda.

required
title str

Super-title.

'EDA Summary'
figsize tuple[float, float]

Figure size.

MULTI_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.

plot_heatmap

plot_heatmap(df: DataFrame, *, title: str = 'Correlation Heatmap', figsize: tuple[float, float] = (10, 8), cmap: str = 'RdYlBu_r', save_path: str | None = None) -> Figure

Heatmap of the correlation matrix of numeric columns.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with numeric columns.

required
title str

Plot title.

'Correlation Heatmap'
figsize tuple[float, float]

Figure size.

(10, 8)
cmap str

Colour map name.

'RdYlBu_r'
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.
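The matrix behind such a heatmap can be sketched in plain Python. The example assumes Pearson correlation, which this reference does not confirm as the method AquaScope uses; `pearson` and `correlation_matrix` are hypothetical helper names and the data are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return a nested dict {col_a: {col_b: r}} over all column pairs."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up numeric columns: temperature and dissolved oxygen.
matrix = correlation_matrix({
    "temp": [10.0, 12.0, 14.0, 16.0],
    "do": [9.0, 8.0, 7.0, 6.0],
})
```

Values near +1 or -1 land at the two ends of a diverging colour map, while values near 0 fall in the middle.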

plot_param_comparison

plot_param_comparison(df: DataFrame, *, value_col: str = 'value', param_col: str = 'parameter', station_col: str = 'station_name', title: str = 'Parameter Comparison', figsize: tuple[float, float] = MULTI_FIGSIZE, save_path: str | None = None) -> Figure

Grid of box plots — one per parameter, grouped by station.

Parameters:

Name Type Description Default
df DataFrame

Long-format DataFrame with measurements.

required
value_col str

Column with measurement values.

'value'
param_col str

Column with parameter names.

'parameter'
station_col str

Column with station names.

'station_name'
title str

Super-title.

'Parameter Comparison'
figsize tuple[float, float]

Figure size.

MULTI_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.

plot_who_exceedances

plot_who_exceedances(who_df: DataFrame, *, title: str = 'WHO Guideline Exceedances', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure

Horizontal bar chart of WHO guideline exceedance percentages.

Parameters:

Name Type Description Default
who_df DataFrame

DataFrame returned by WaterQualityChallenge.check_who_guidelines(). Expected columns: variable, pct_exceedances, status.

required
title str

Plot title.

'WHO Guideline Exceedances'
figsize tuple[float, float]

Figure size.

WIDE_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.
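The quantity on the bar axis, the exceedance percentage, can be sketched as the share of samples above a limit. The limit, the sample values, and the helper `pct_exceedances` below are illustrative assumptions; the sketch does not reproduce `WaterQualityChallenge.check_who_guidelines()` and omits the `status` column because its values are not documented here.

```python
def pct_exceedances(values, limit):
    """Percentage of samples strictly above the given limit."""
    over = sum(1 for v in values if v > limit)
    return 100.0 * over / len(values)


# Made-up nitrate samples (mg/L) and an illustrative limit of 50 mg/L.
row = {
    "variable": "nitrate",
    "pct_exceedances": pct_exceedances([40.0, 55.0, 60.0, 30.0], limit=50.0),
}
```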

plot_station_map

plot_station_map(stations: DataFrame, *, lat_col: str = 'latitude', lon_col: str = 'longitude', label_col: str = 'station_name', value_col: str | None = None, colour_col: str | None = None, title: str = 'Station Map', save_path: str | None = None) -> object

Create an interactive Folium map of monitoring stations.

Parameters:

Name Type Description Default
stations DataFrame

DataFrame with at least latitude/longitude columns.

required
lat_col str

Column name for latitude.

'latitude'
lon_col str

Column name for longitude.

'longitude'
label_col str

Column used for popup labels.

'station_name'
value_col str | None

Optional column whose value is shown in tooltips.

None
colour_col str | None

Optional column for colour-coding markers (e.g. risk level). Values are mapped through RISK_COLOURS; unrecognised values use AquaScope primary blue.

None
title str

Map title (shown in popup header).

'Station Map'
save_path str | None

If provided, save the HTML map to this path.

None

Returns:

Type Description
object

A folium.Map object.
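The fallback behaviour described for colour_col amounts to a dictionary lookup with a default. The sketch below illustrates that pattern only: `EXAMPLE_RISK_COLOURS` and `EXAMPLE_PRIMARY_BLUE` are hypothetical stand-ins, since the actual contents of RISK_COLOURS and the primary blue value are not shown in this reference.

```python
# Hypothetical stand-ins for AquaScope's RISK_COLOURS and primary blue.
EXAMPLE_RISK_COLOURS = {"low": "green", "medium": "orange", "high": "red"}
EXAMPLE_PRIMARY_BLUE = "blue"


def marker_colour(value):
    """Return the mapped colour, falling back to blue for unknown values."""
    return EXAMPLE_RISK_COLOURS.get(value, EXAMPLE_PRIMARY_BLUE)


colours = [marker_colour(v) for v in ["high", "low", "unclassified"]]
```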

plot_station_scatter

plot_station_scatter(stations: DataFrame, *, lat_col: str = 'latitude', lon_col: str = 'longitude', label_col: str = 'station_name', value_col: str | None = None, title: str = 'Station Locations', cmap: str = 'YlOrRd', figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure

Static scatter plot of station locations coloured by value.

Parameters:

Name Type Description Default
stations DataFrame

DataFrame with latitude/longitude and optional value column.

required
lat_col str

Column name for latitude.

'latitude'
lon_col str

Column name for longitude.

'longitude'
label_col str

Column for point labels.

'station_name'
value_col str | None

If provided, colour-code by this column and add a colour bar.

None
title str

Plot title.

'Station Locations'
cmap str

Matplotlib colour map name (used when value_col is set).

'YlOrRd'
figsize tuple[float, float]

Figure size.

DEFAULT_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.

apply_aqua_style

apply_aqua_style() -> None

Apply the AquaScope matplotlib style globally.

Sets a clean, publication-friendly style with the AquaScope colour palette. Safe to call multiple times.

plot_forecast

plot_forecast(observed: DataFrame | None = None, forecast: DataFrame | None = None, *, obs_col: str = 'value', pred_col: str = 'yhat', lower_col: str = 'yhat_lower', upper_col: str = 'yhat_upper', title: str = 'Forecast', ylabel: str = 'Value', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure

Plot observed data with forecast and confidence interval bands.

Parameters:

Name Type Description Default
observed DataFrame | None

Historical DataFrame (DatetimeIndex + value column).

None
forecast DataFrame | None

Forecast DataFrame with yhat, yhat_lower, yhat_upper.

None
obs_col str

Column name in observed.

'value'
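pred_col str

Column in forecast holding the point forecast.

'yhat'
lower_col str

Column in forecast holding the lower bound of the confidence interval.

'yhat_lower'
upper_col str

Column in forecast holding the upper bound of the confidence interval.

'yhat_upper'
title str

Plot title.

'Forecast'
ylabel str

Y-axis label.

'Value'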
figsize tuple[float, float]

Figure size.

WIDE_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.

plot_multi_param

plot_multi_param(df: DataFrame, *, columns: list[str] | None = None, title: str = 'Multi-Parameter Time Series', ylabel: str = 'Value', figsize: tuple[float, float] = WIDE_FIGSIZE, save_path: str | None = None) -> Figure

Overlay multiple columns on the same axes.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with DatetimeIndex and one or more numeric columns.

required
columns list[str] | None

Subset of column names to plot. None plots all numeric columns.

None
title str

Plot title.

'Multi-Parameter Time Series'
ylabel str

Y-axis label.

'Value'
figsize tuple[float, float]

Figure size.

WIDE_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.

plot_observed_vs_predicted

plot_observed_vs_predicted(observed: Series, predicted: Series, *, metrics: dict | None = None, title: str = 'Observed vs Predicted', figsize: tuple[float, float] = (7, 7), save_path: str | None = None) -> Figure

Scatter plot of observed vs predicted values with 1:1 line.

Parameters:

Name Type Description Default
observed Series

Observed values.

required
predicted Series

Predicted values (same index as observed).

required
metrics dict | None

Optional dict of evaluation metrics (NSE, KGE, …) to annotate.

None
title str

Plot title.

'Observed vs Predicted'
figsize tuple[float, float]

Figure size.

(7, 7)
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.
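As an illustration of what a metrics dict might contain, the sketch below computes the Nash-Sutcliffe efficiency (NSE) in plain Python. The key name "NSE", the helper `nash_sutcliffe`, and the data are assumptions of this example; the reference does not specify which keys the annotation expects.

```python
def nash_sutcliffe(observed, predicted):
    """NSE = 1 - sum((o - p)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_sum


# Made-up observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = {"NSE": nash_sutcliffe(observed, predicted)}
```

An NSE of 1 means every point lies on the 1:1 line; an NSE of 0 means the predictions are no better than the observed mean.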

plot_residuals

plot_residuals(observed: Series, predicted: Series, *, title: str = 'Residuals', figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure

Plot residuals (observed − predicted) over the index.

Parameters:

Name Type Description Default
observed Series

Observed values.

required
predicted Series

Predicted values.

required
title str

Plot title.

'Residuals'
figsize tuple[float, float]

Figure size.

DEFAULT_FIGSIZE
save_path str | None

Optional save path.

None

Returns:

Type Description
Figure

The matplotlib Figure.
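The sign convention stated above (observed minus predicted) can be sketched as follows; a positive residual means the model under-predicted. The helper `residuals` and the sample values are made up for illustration.

```python
def residuals(observed, predicted):
    """Element-wise observed minus predicted."""
    return [o - p for o, p in zip(observed, predicted)]


# Made-up values: the model under-predicts first, then over-predicts.
res = residuals([3.0, 5.0, 4.0], [2.5, 5.5, 4.0])
mean_bias = sum(res) / len(res)
```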

plot_timeseries

plot_timeseries(df: DataFrame, *, value_col: str = 'value', title: str = 'Time Series', ylabel: str = 'Value', xlabel: str = 'Date', colour: str | None = None, ax: Axes | None = None, figsize: tuple[float, float] = DEFAULT_FIGSIZE, save_path: str | None = None) -> Figure

Plot a single time-series from a DatetimeIndex DataFrame.

Parameters:

Name Type Description Default
df DataFrame

DataFrame with a DatetimeIndex and at least one value column.

required
value_col str

Column name containing the values to plot.

'value'
title str

Plot title.

'Time Series'
ylabel str

Y-axis label.

'Value'
xlabel str

X-axis label.

'Date'
colour str | None

Line colour. Defaults to AquaScope primary blue.

None
ax Axes | None

Optional pre-existing Axes to draw on.

None
figsize tuple[float, float]

Figure size if creating a new figure.

DEFAULT_FIGSIZE
save_path str | None

If provided, save figure to this path instead of showing.

None

Returns:

Type Description
Figure

The matplotlib Figure.