CLI Guide

Primary Commands

  • app.cli.prepare_real_incidents: builds experiment-ready incidents from raw article JSON.
  • app.cli.run_experiments: executes all model-condition-request combinations and writes run artifacts.
  • app.cli.generate_report_assets: computes figures and Markdown/JSON analysis tables from saved runs.
  • app.cli.generate_llm_dashboard_summary: generates an offline LLM narrative summary for dashboard cards.
  • app.cli.probe_model: runs a single-prompt health check against an Ollama model.
  • app.cli.list_models: lists local Ollama models.
  • app.cli.benchmark_models: runs a quick latency comparison across models.
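
Each command is a Python module, so it can be invoked in the same `uv run python -m ...` form used for run_experiments later in this guide. The sketch below only builds those command lines. The stage order is one plausible pipeline, not a documented one, and `--help` is assumed to be available, which is typical for argparse-style CLIs but not confirmed here.

```python
import shlex

# Module names come from the list above. The order (prepare -> run -> report
# -> summary) is an assumption about how the stages are usually chained.
PIPELINE_MODULES = [
    "app.cli.prepare_real_incidents",
    "app.cli.run_experiments",
    "app.cli.generate_report_assets",
    "app.cli.generate_llm_dashboard_summary",
]


def module_command(module, *args):
    """Build the argv for running one CLI module through uv."""
    return ["uv", "run", "python", "-m", module, *args]


def help_commands(modules=PIPELINE_MODULES):
    """Return one shell-quoted help invocation per module.

    --help is an assumed flag; check each command before relying on it.
    """
    return [shlex.join(module_command(m, "--help")) for m in modules]


if __name__ == "__main__":
    for line in help_commands():
        print(line)
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without Ollama or the project installed.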

Operational Tips

  • Keep one output folder per run for reproducibility.
  • Use fixed seeds so that experiment slices stay comparable across runs.
  • Regenerate the report assets and the LLM summary after adding new runs.
  • Treat parse_failure and fallback records as first-class diagnostics: count and review them instead of silently filtering them out.
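
Two of the tips above can be sketched in a few lines of Python: giving each run its own output folder, and counting parse_failure and fallback records. This is a minimal illustration, not the project's own tooling. The folder naming scheme and the `status` field name are hypothetical; adapt them to the actual run artifact schema.

```python
import json
import tempfile
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def run_dir_name(label: str, now: datetime) -> str:
    """Build a sortable folder name such as 20240102T030405Z_baseline."""
    return f"{now:%Y%m%dT%H%M%SZ}_{label}"


def new_run_dir(base: Path, label: str, now: Optional[datetime] = None) -> Path:
    """Create a fresh output folder for a single run."""
    now = now or datetime.now(timezone.utc)
    run_dir = base / run_dir_name(label, now)
    # exist_ok=False makes accidental reuse of a folder fail loudly.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir


def diagnostic_counts(jsonl_lines, field="status"):
    """Count records marked parse_failure or fallback.

    The `status` field name is hypothetical; the real records may store
    this information under a different key.
    """
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        value = json.loads(line).get(field)
        if value in ("parse_failure", "fallback"):
            counts[value] += 1
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(new_run_dir(Path(tmp), "baseline").name)
    sample = ['{"status": "ok"}', '{"status": "parse_failure"}']
    print(dict(diagnostic_counts(sample)))
```

The resulting folder path can be passed as `--output-dir` to run_experiments, so each invocation writes to a location that is never overwritten.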

Runtime Optimization Flags (run_experiments)

The experiment runner accepts three optional low-level runtime flags for Ollama:

  • --enable-flash-attention
  • --enable-kv-cache
  • --kv-cache-type <value>

Example:

uv run python -m app.cli.run_experiments \
    --input data/real_incidents_all.jsonl \
    --models-manifest configs/models.example.yaml \
    --output-dir outputs \
    --enable-flash-attention \
    --enable-kv-cache \
    --kv-cache-type q8_0

All three flags are unset by default, so existing workflows and the reproducibility of historical outputs are unaffected.
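
The opt-in behaviour can be made explicit when launching runs from a script. The helper below is a hypothetical wrapper, not part of the project: it builds the run_experiments command line from the flags documented above and appends an optimization flag only when it is requested. Whether `--kv-cache-type` requires `--enable-kv-cache` is not stated in this guide, so the sketch does not enforce any relationship between them.

```python
def build_run_experiments_argv(
    input_path,
    models_manifest,
    output_dir,
    flash_attention=False,
    kv_cache=False,
    kv_cache_type=None,
):
    """Build the argv for app.cli.run_experiments (hypothetical helper).

    With the default arguments no optimization flag is emitted, which
    mirrors the CLI's documented unset defaults.
    """
    argv = [
        "uv", "run", "python", "-m", "app.cli.run_experiments",
        "--input", input_path,
        "--models-manifest", models_manifest,
        "--output-dir", output_dir,
    ]
    if flash_attention:
        argv.append("--enable-flash-attention")
    if kv_cache:
        argv.append("--enable-kv-cache")
    if kv_cache_type is not None:
        argv += ["--kv-cache-type", kv_cache_type]
    return argv


if __name__ == "__main__":
    # Baseline invocation: no optimization flags.
    print(build_run_experiments_argv(
        "data/real_incidents_all.jsonl",
        "configs/models.example.yaml",
        "outputs",
    ))
```

The returned list can be passed to `subprocess.run(argv, check=True)`. Recording the list alongside the run artifacts documents which runtime options a given output folder was produced with.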