Adding a New Research Methodology

AquaScope's AI recommender draws from a curated knowledge base of research methodologies. Adding a new one requires a knowledge base entry and, optionally, a pipeline implementation, followed by tests and a verification run.

Step 1: Add to the Knowledge Base

Edit aquascope/ai_engine/knowledge_base.py and append a new ResearchMethodology to the METHODOLOGIES list:

ResearchMethodology(
    id="your_method_id",                     # Unique snake_case identifier
    name="Your Method Name",                  # Human-readable name
    category="statistical",                   # One of: statistical, machine_learning,
                                             #   process_engineering, remote_sensing,
                                             #   hydrological_modelling, policy_analysis
    description="A brief description of what this method does and when to use it.",
    applicable_parameters=[                   # Water quality parameters this applies to
        "DO", "BOD5", "COD", "pH", "SS",
    ],
    data_requirements=[                       # What data is needed
        "time-series > 2 years",
        "multiple stations",
    ],
    typical_scale="regional",                 # lab / pilot / field / regional / global
    complexity="medium",                      # low / medium / high
    references=[                              # Key academic references
        "Author et al. (2020). Title. Journal, 1(2), 3-4. DOI: ...",
    ],
    tags=[                                    # Search tags for matching
        "keyword1", "keyword2", "keyword3",
    ],
),
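Before committing a new entry, it can help to check it against the constraints noted in the template comments. The sketch below is not part of AquaScope: `validate_methodology`, the `ALLOWED_*` sets, and the stand-in `ResearchMethodology` dataclass are hypothetical names used only for illustration, and the real class in knowledge_base.py may define its fields differently.

```python
import re
from dataclasses import dataclass, field

# Allowed values copied from the comments in the entry template.
ALLOWED_CATEGORIES = {
    "statistical", "machine_learning", "process_engineering",
    "remote_sensing", "hydrological_modelling", "policy_analysis",
}
ALLOWED_SCALES = {"lab", "pilot", "field", "regional", "global"}
ALLOWED_COMPLEXITY = {"low", "medium", "high"}
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass
class ResearchMethodology:
    """Stand-in for illustration only; the real class lives in knowledge_base.py."""
    id: str
    name: str
    category: str
    description: str
    applicable_parameters: list[str] = field(default_factory=list)
    data_requirements: list[str] = field(default_factory=list)
    typical_scale: str = "regional"
    complexity: str = "medium"
    references: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)


def validate_methodology(entry, existing_ids):
    """Return a list of problems; an empty list means the entry looks fine."""
    problems = []
    if not SNAKE_CASE.match(entry.id):
        problems.append(f"id {entry.id!r} is not snake_case")
    if entry.id in existing_ids:
        problems.append(f"id {entry.id!r} is already used")
    if entry.category not in ALLOWED_CATEGORIES:
        problems.append(f"unknown category {entry.category!r}")
    if entry.typical_scale not in ALLOWED_SCALES:
        problems.append(f"unknown typical_scale {entry.typical_scale!r}")
    if entry.complexity not in ALLOWED_COMPLEXITY:
        problems.append(f"unknown complexity {entry.complexity!r}")
    return problems


entry = ResearchMethodology(
    id="your_method_id",
    name="Your Method Name",
    category="statistical",
    description="A brief description.",
)
print(validate_methodology(entry, existing_ids=set()))  # prints: []
```

Passing the ids already present in METHODOLOGIES as `existing_ids` lets the same helper catch duplicate identifiers.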

Field Guidelines

Field	Purpose	Tips
`id`	Used by pipelines and CLI	Must be unique, use snake_case
`applicable_parameters`	Helps the recommender match datasets	Use standard abbreviations (DO, BOD5, COD, etc.)
`data_requirements`	Shown in recommendations	Be specific about what the method needs
`tags`	Used for keyword matching in recommendations	Include synonyms, related terms
`complexity`	Helps researchers choose appropriate methods	"low" = basic stats, "high" = advanced ML/numerical
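To see why synonyms in `tags` and complete `applicable_parameters` lists matter, consider a toy overlap score. This is only an assumption about how keyword matching could work, not AquaScope's actual recommender logic; `match_score` and its arguments are hypothetical.

```python
def match_score(tags, applicable_parameters, query_keywords, dataset_parameters):
    """Count overlaps between an entry and a query (illustrative only)."""
    # Tags are compared case-insensitively; parameter codes are compared exactly.
    tag_set = {t.lower() for t in tags}
    keyword_hits = len(tag_set & {k.lower() for k in query_keywords})
    param_hits = len(set(applicable_parameters) & set(dataset_parameters))
    return keyword_hits + param_hits


query = ["Trend", "seasonality"]
dataset = ["DO", "pH"]

# An entry with a single tag and a short parameter list.
narrow = match_score(["trend"], ["DO", "BOD5"], query, dataset)

# The same method described with related terms and one more parameter.
broad = match_score(
    ["trend", "seasonality", "time-series"], ["DO", "BOD5", "pH"], query, dataset
)

print(narrow, broad)  # prints: 2 4
```

Under this simplified scoring, the entry with richer tags and parameters ranks higher for the same query, which is the effect the tips in the table aim for.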

Step 2: (Optional) Add a Pipeline Implementation

To let users run your methodology automatically, add a pipeline:

2a. Create the pipeline function

Add to aquascope/pipelines/model_builder.py:

def run_your_method(df: pd.DataFrame, config: dict | None = None) -> PipelineResult:
    """Your method description."""
    config = config or {}
    # ... implementation: compute `value` and `raw_results` from df ...

    return PipelineResult(
        method_id="your_method_id",           # same id as the knowledge base entry
        method_name="Your Method Name",
        summary="Human-readable summary of results.",
        metrics={"key_metric": value},
        details={"raw_results": raw_results},
    )

2b. Register in the pipeline dispatcher

Add your function to PIPELINE_REGISTRY, keyed by the same id you used in the knowledge base entry:

from collections.abc import Callable  # if not already imported in the module

PIPELINE_REGISTRY: dict[str, Callable[..., PipelineResult]] = {
    # ... existing pipelines ...
    "your_method_id": run_your_method,
}
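The registry pattern can be illustrated with a small dispatcher that looks up a pipeline by id and fails clearly when the id is unknown. This is a minimal sketch under stated assumptions: `run_pipeline`, `run_mean_demo`, and the `mean_demo` id are hypothetical, the `PipelineResult` dataclass is a stand-in carrying only the fields shown in this guide, and a plain list of numbers replaces the DataFrame so the example has no third-party dependencies. AquaScope's real dispatcher may behave differently.

```python
from collections.abc import Callable
from dataclasses import dataclass, field
from typing import Any


@dataclass
class PipelineResult:
    """Stand-in with the fields used in this guide (illustration only)."""
    method_id: str
    method_name: str
    summary: str
    metrics: dict[str, Any] = field(default_factory=dict)
    details: dict[str, Any] = field(default_factory=dict)


def run_mean_demo(values, config=None):
    """Hypothetical pipeline: report the mean of a list of numbers."""
    config = config or {}
    mean = sum(values) / len(values)
    return PipelineResult(
        method_id="mean_demo",
        method_name="Mean (demo)",
        summary=f"Mean of {len(values)} values is {mean:.2f}.",
        metrics={"mean": mean},
    )


# Maps a methodology id to the function that runs it.
PIPELINE_REGISTRY: dict[str, Callable[..., PipelineResult]] = {
    "mean_demo": run_mean_demo,
}


def run_pipeline(method_id, data, config=None):
    """Look up a pipeline by id and run it, failing clearly on unknown ids."""
    try:
        pipeline = PIPELINE_REGISTRY[method_id]
    except KeyError:
        known = ", ".join(sorted(PIPELINE_REGISTRY))
        raise KeyError(
            f"No pipeline registered for {method_id!r}; known ids: {known}"
        ) from None
    return pipeline(data, config)


result = run_pipeline("mean_demo", [6.0, 8.0])
print(result.summary)  # prints: Mean of 2 values is 7.00.
```

Keying the registry by methodology id means a new method needs only one extra dictionary entry, and a mistyped id surfaces as an explicit error rather than a silent no-op.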

Step 3: Write Tests

Add tests in tests/test_pipelines/test_model_builder.py or a new test file:

def test_your_method():
    df = _make_sample_data()
    result = run_your_method(df)
    assert isinstance(result, PipelineResult)
    assert result.method_id == "your_method_id"
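Beyond testing the pipeline itself, a consistency test can catch a registry key that does not match any knowledge base id, for example after a typo. The sketch below uses stand-in data in place of the real METHODOLOGIES and PIPELINE_REGISTRY objects, and `orphaned_pipeline_ids` is a hypothetical helper; in a real test you would import the actual objects instead.

```python
from types import SimpleNamespace

# Stand-ins for the real objects; only the `id` attribute matters here.
METHODOLOGIES = [
    SimpleNamespace(id="your_method_id"),
    SimpleNamespace(id="other_method_id"),
]
PIPELINE_REGISTRY = {
    "your_method_id": lambda df, config=None: None,
}


def orphaned_pipeline_ids(registry, methodologies):
    """Return registry keys that have no knowledge base entry with the same id."""
    known_ids = {m.id for m in methodologies}
    return sorted(set(registry) - known_ids)


def test_every_pipeline_has_a_knowledge_base_entry():
    assert orphaned_pipeline_ids(PIPELINE_REGISTRY, METHODOLOGIES) == []
```

The check is deliberately one-directional: a methodology without a pipeline is allowed, since the pipeline step is optional.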

Step 4: Verify

# Run tests
pytest tests/ -v

# Check lint
ruff check .

# Verify the methodology appears
aquascope list-methods

Examples of Good Methodologies to Add

Numerical methods — Finite element analysis for groundwater flow, PDE-based contaminant transport
Forecasting — Prophet time-series, Transformer-based water quality prediction
Process models — QUAL2K, WASP, MIKE, HEC-RAS
Machine learning — GNN for sensor networks, autoencoders for anomaly detection
Field methods — Isotope tracing, sediment analysis, bioassays