avocet/app
pyr0ball 9fdaeeb3d6 feat: multi-bench dashboard, API path migration, benchmark reliability fixes
- dashboard: eval card now shows last run + score for all bench types
  (classifier, LLM, style, plans) via new _get_recent_bench_runs()
- dashboard: skip cforch LLM-bench list summaries when scanning for
  classifier best_macro_f1 (fixes _find_latest_classifier_bench)
- cforch: stale _BENCH_RUNNING flag now auto-resets if process exited;
  idle timeout (120s via select) kills hung benchmark if node crashes
- api: add /api/finetune/{run,cancel} backward-compat shims while
  ClassifierTab fine-tune section is migrated to TrainJobsView
- ClassifierTab: migrate all /api/benchmark/* paths to /api/cforch/*;
  fix null-safety on results.models access; load fine-tuned models from
  /api/train/results instead of /api/finetune/status
- CompareTab: extend model picker to include vllm + cf-text alongside
  ollama, grouped by service; pre-select all LLM_SERVICES on load
- LlmEvalTab: null-safety on quality_by_task_type lookups
- models: AVOCET_MODELS_DIR env var overrides default models/ path
2026-05-11 09:05:12 -07:00
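
The cforch reliability items in the commit message above (a stale _BENCH_RUNNING flag that resets once the process has exited, and a 120s idle timeout via select) could be sketched roughly as follows. This is a minimal illustration and not the code in cforch.py: the names bench_is_running and run_with_idle_timeout are hypothetical, and the sketch assumes a POSIX system, since select() on pipes is not available on Windows.

```python
import os
import select
import subprocess
import sys

IDLE_TIMEOUT_S = 120.0  # the value quoted in the commit message


def bench_is_running(proc):
    """Hypothetical helper: report 'running' only while the child is alive,
    so a flag left over from an exited process clears itself."""
    return proc is not None and proc.poll() is None


def run_with_idle_timeout(cmd, idle_timeout=IDLE_TIMEOUT_S):
    """Hypothetical helper: run cmd and kill it if it is silent too long.

    Returns (output_text, timed_out). Reads the raw pipe with os.read so
    that select() and Python's buffered readers do not disagree about
    whether data is available.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    fd = proc.stdout.fileno()
    chunks = []
    timed_out = False
    while True:
        # Wait until the child writes something or the idle window elapses.
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            proc.kill()  # no output for idle_timeout seconds: assume it hung
            timed_out = True
            break
        data = os.read(fd, 4096)
        if not data:  # EOF: the child closed its stdout
            break
        chunks.append(data)
    proc.wait()
    proc.stdout.close()
    return b"".join(chunks).decode("utf-8", errors="replace"), timed_out


if __name__ == "__main__":
    text, hung = run_with_idle_timeout([sys.executable, "-c", "print('ok')"], 10.0)
    print(text.strip(), hung)
```

The timer restarts on every chunk of output, so a long benchmark that keeps logging is left alone while a silent one is killed after the idle window.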
data              fix: align train job/results API envelope, config_json key, progress SSE, dashboard model_key  2026-05-02 21:22:18 -07:00
eval              feat: add embed-bench rate and export endpoints  2026-05-11 08:07:17 -07:00
train             fix: align train job/results API envelope, config_json key, progress SSE, dashboard model_key  2026-05-02 21:22:18 -07:00
api.py            feat: multi-bench dashboard, API path migration, benchmark reliability fixes  2026-05-11 09:05:12 -07:00
cforch.py         feat: multi-bench dashboard, API path migration, benchmark reliability fixes  2026-05-11 09:05:12 -07:00
cloud_session.py  refactor: import detect_byok from cf-core, remove local copy  2026-04-25 16:45:47 -07:00
dashboard.py      feat: multi-bench dashboard, API path migration, benchmark reliability fixes  2026-05-11 09:05:12 -07:00
imap_fetch.py     feat: extract fetch routes and IMAP helpers into app/data/fetch.py  2026-05-01 21:57:31 -07:00
imitate.py        feat: move imitate API into app/data/imitate.py  2026-05-01 22:12:19 -07:00
models.py         feat: multi-bench dashboard, API path migration, benchmark reliability fixes  2026-05-11 09:05:12 -07:00
nodes.py          fix: move json import to module scope in nodes.py  2026-05-05 21:01:32 -07:00
plans_bench.py    fix: restore real plans_bench.py (was accidentally stubbed)  2026-05-01 22:25:22 -07:00
sft.py            feat: move SFT corrections API into app/data/corrections.py  2026-05-01 22:02:22 -07:00
style.py          refactor(bench): extract benchmark tabs — classifier, compare, llm-eval, style, voice  2026-04-24 14:56:17 -07:00
utils.py          fix: restore ensure_ascii=False in utils jsonl helpers; remove dead _last_action from api.py  2026-05-01 20:59:44 -07:00
voice.py          refactor(bench): extract benchmark tabs — classifier, compare, llm-eval, style, voice  2026-04-24 14:56:17 -07:00