avocet/app
pyr0ball bce932461a feat: plans benchmark harness — model scoring for CF planning prompts
Adds benchmark_plans.py script, plans_bench API router, PlansBenchTab Vue
component, and registers /api/plans-bench in api.py. Also extends models
registry (cf-text catalog integration), cforch client, LlmEvalTab, and
ModelsView with cf-orch fleet support. Wires Planning mode into BenchmarkView.
2026-05-02 23:36:04 -07:00
data              fix: align train job/results API envelope, config_json key, progress SSE, dashboard model_key  2026-05-02 21:22:18 -07:00
eval              feat: build app/eval/cforch.py aggregating eval benchmark routers  2026-05-01 22:23:06 -07:00
train             fix: align train job/results API envelope, config_json key, progress SSE, dashboard model_key  2026-05-02 21:22:18 -07:00
api.py            feat: plans benchmark harness — model scoring for CF planning prompts  2026-05-02 23:36:04 -07:00
cforch.py         feat: plans benchmark harness — model scoring for CF planning prompts  2026-05-02 23:36:04 -07:00
cloud_session.py  refactor: import detect_byok from cf-core, remove local copy  2026-04-25 16:45:47 -07:00
dashboard.py      fix: align train job/results API envelope, config_json key, progress SSE, dashboard model_key  2026-05-02 21:22:18 -07:00
imap_fetch.py     feat: extract fetch routes and IMAP helpers into app/data/fetch.py  2026-05-01 21:57:31 -07:00
imitate.py        feat: move imitate API into app/data/imitate.py  2026-05-01 22:12:19 -07:00
models.py         feat: plans benchmark harness — model scoring for CF planning prompts  2026-05-02 23:36:04 -07:00
plans_bench.py    fix: restore real plans_bench.py (was accidentally stubbed)  2026-05-01 22:25:22 -07:00
sft.py            feat: move SFT corrections API into app/data/corrections.py  2026-05-01 22:02:22 -07:00
style.py          refactor(bench): extract benchmark tabs — classifier, compare, llm-eval, style, voice  2026-04-24 14:56:17 -07:00
utils.py          fix: restore ensure_ascii=False in utils jsonl helpers; remove dead _last_action from api.py  2026-05-01 20:59:44 -07:00
voice.py          refactor(bench): extract benchmark tabs — classifier, compare, llm-eval, style, voice  2026-04-24 14:56:17 -07:00