Commit graph

203 commits

Author SHA1 Message Date
391ebb3cd1 feat(recipe-scan): labeling UI for Kiwi vision training pipeline (closes #65)
- POST /api/recipe-scan/import — bulk ingest from Kiwi scanner pipeline, idempotent by item id
- GET /api/recipe-scan/next — oldest-first pending item for review
- POST /api/recipe-scan/items/{id}/approve|edit|reject — label actions
- GET /api/recipe-scan/stats — counts by status and modality
- GET /api/recipe-scan/export — JSONL training pairs (messages chat format, Option B: correction prompt + extracted draft → corrected ground truth)
- GET /api/recipe-scan/image — path-traversal-safe image serving from /Library/Assets/kiwi/
- SQLite at data/recipe_scan.db with WAL mode; separate from corpus.db lifecycle
- set_db_path() testability seam; 18 tests, all passing
- RecipeScanView.vue: two-column review UI (image left, JSON diff right), keyboard shortcuts A/E/R, toast feedback, stats header, export download
- Route /data/recipe-scan and sidebar nav entry added
2026-05-17 12:22:15 -07:00
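
The path-traversal-safe image serving mentioned in the commit above could be approached along these lines. This is a minimal sketch, not the code from the commit: the helper name `resolve_image_path` and the `ASSET_ROOT` constant are hypothetical, and only the root directory itself is taken from the commit message.

```python
from pathlib import Path

# Root directory named in the commit message; the constant name is an assumption.
ASSET_ROOT = Path("/Library/Assets/kiwi")


def resolve_image_path(relative: str, root: Path = ASSET_ROOT) -> Path:
    """Return the resolved path for `relative`, or raise ValueError if it escapes `root`."""
    root_resolved = root.resolve()
    # Joining with an absolute path discards the root, and ".." segments can
    # climb out of it, so resolve first and check containment afterwards.
    candidate = (root_resolved / relative).resolve()
    if not candidate.is_relative_to(root_resolved):  # Python 3.9+
        raise ValueError(f"path escapes asset root: {relative!r}")
    return candidate
```

An endpoint built on this would typically map the `ValueError` to a 400 or 404 response rather than letting it propagate.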
9bb88b168f feat(corpus): pipeline log ingest from shared dir (closes #67)
Pull-side companion to kiwi#141. Ingests structured JSONL pipeline logs
from /Library/Assets/logs/pipeline/ into the log corpus for Turnstone
logreading model training.

- app/data/log_corpus.py: add ingested_pipeline_files tracking table,
  _pipeline_ingest_dir() config helper, _ingest_one_file() parser, and
  POST /api/corpus/pipeline-ingest endpoint
- source_host = "pipeline_scrape"; source_id from logger field; extra
  dict stored as matched_patterns; batch_type = "pipeline_log"
- Idempotent by filename: skips files already in ingested_pipeline_files
- config/label_tool.yaml.example: add corpus section with pipeline_ingest_dir
  and push sources comment block
- tests/test_log_corpus.py: 8 new tests covering ingest, idempotency,
  non-JSONL filtering, malformed line resilience, incremental runs
2026-05-17 11:28:33 -07:00
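
The filename-based idempotency and malformed-line resilience described in the commit above can be illustrated roughly as follows. This is a sketch under stated assumptions: only the `ingested_pipeline_files` table name and the `logger` field come from the commit message, while the function name `ingest_dir`, the `entries` table, and the `message` field are hypothetical.

```python
import json
import sqlite3
from pathlib import Path


def ingest_dir(conn: sqlite3.Connection, log_dir: Path) -> int:
    """Ingest new *.jsonl files from log_dir and return the number of rows added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ingested_pipeline_files (filename TEXT PRIMARY KEY)"
    )
    # Hypothetical destination table; the real corpus schema is not shown in the log.
    conn.execute("CREATE TABLE IF NOT EXISTS entries (source_id TEXT, message TEXT)")
    added = 0
    for path in sorted(log_dir.glob("*.jsonl")):  # non-JSONL files are ignored
        seen = conn.execute(
            "SELECT 1 FROM ingested_pipeline_files WHERE filename = ?", (path.name,)
        ).fetchone()
        if seen:
            continue  # idempotent by filename: already ingested
        for line in path.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of failing the file
            if not isinstance(record, dict):
                continue
            conn.execute(
                "INSERT INTO entries VALUES (?, ?)",
                (str(record.get("logger", "")), str(record.get("message", ""))),
            )
            added += 1
        conn.execute("INSERT INTO ingested_pipeline_files VALUES (?)", (path.name,))
    conn.commit()
    return added
```

Running the function twice over the same directory adds nothing the second time, and a file dropped in later is picked up on the next run.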
13ca082a43 chore(models): refresh model registries with current cluster catalog
Replace stale llama/mistral/phi model refs with models active on the
cluster: deepseek-r1 (1.5b, 7b-4bit, 0528-qwen3-8b-gguf), granite-4.1-8b,
qwen2.5 (3b, 7b), capybarahermes-2.5-mistral-7b, darwin-9b-opus. Update
benchmark_plans.py doc examples to match.
2026-05-17 11:24:03 -07:00
d416ef8aa4 feat(imitate): task-model assignment routing via cf-orch
Add _resolve_task_model() helper that looks up a product.task assignment
from the coordinator and resolves its service_type from the model registry.
Add task_ids param to run_imitate() (comma-separated "product/task" strings)
so the imitate harness can dispatch to models chosen by the assignment layer
rather than requiring explicit model IDs.
2026-05-17 11:23:55 -07:00
79b9ccbd3d feat(fleet): profile editor, assignments tab, node management polish
Backend:
- app/nodes.py: fix coordinator response envelope (.get("nodes"/"services"))
- app/nodes.py: add PUT /nodes/{id}/profile (atomic YAML write + reload)
- app/nodes.py: add POST /nodes/{id}/profile/generate (coordinator-seeded skeleton)
- tests/test_nodes.py: fix mock envelopes; add deploy model + profile tests

Frontend:
- NodeManagementView: tab bar switching nodes / assignments panels
- AssignmentsTab: full product.task → model routing UI (add/edit/delete)
- ProfileEditorPanel: full YAML profile editor with GPU + service sections
- CatalogEntryFormModal: add/edit model catalog entries per service
- ServiceFormModal: add/edit service config blocks
- NodeCard, GpuRow, ServiceBadge, OllamaModelPanel, HfNodeModelPanel: polish pass
- ModelsView: model download additions
- nodes.ts: extend types for full profile editing (ServiceManaged, CatalogEntryFull)
2026-05-17 11:23:47 -07:00
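
The "atomic YAML write" behind the profile endpoint in the commit above is commonly done by writing to a temporary file and renaming it over the target. The sketch below shows that general pattern only; the helper name `atomic_write_text` is hypothetical, and the commit's actual implementation is not shown in this log.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(target: Path, text: str) -> None:
    """Write text to target so readers see either the old or the new file, never a partial one."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=target.parent, prefix=target.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, target)  # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A profile endpoint would serialize the YAML to a string first, pass it to a helper like this, and only then trigger the reload.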
e93afec271 fix(tests): resolve 5 pre-existing test failures on main (closes #56)
- app/models.py: add set_cf_text_models_dir() testability seam
- tests/test_models.py: redirect _CF_TEXT_MODELS_DIR in reset_models_globals
  fixture so list_installed() count tests are not polluted by real NFS models
- app/cforch.py: fix get_results() return type annotation list → dict
- tests/test_cforch.py: give _BENCH_RUNNING=True test a mock proc with
  poll()=None so the stale-flag check correctly returns 409; patch
  _select.select in streaming tests (select requires fileno(), iter() doesn't)
- tests/test_finetune.py: mark GPU integration test @pytest.mark.gpu
- pytest.ini: register gpu and slow markers
2026-05-17 11:21:58 -07:00
cac91dd8a2 docs: bump version badge to match latest Forgejo release 2026-05-17 11:19:13 -07:00
2b990a603a feat: log corpus receiver — accept Turnstone push batches and label for logreading fine-tune
Adds corpus.db (corpus_sources, corpus_batches, corpus_entries), a FastAPI router
at /api/corpus with receive/label/skip/stats/export endpoints, and seeds consent
tokens for xanderland + orchard nodes from label_tool.yaml. PII flag excludes
entries from JSONL export. Closes avocet#61.
2026-05-11 17:07:54 -07:00
9fdaeeb3d6 feat: multi-bench dashboard, API path migration, benchmark reliability fixes
- dashboard: eval card now shows last run + score for all bench types
  (classifier, LLM, style, plans) via new _get_recent_bench_runs()
- dashboard: skip cforch LLM-bench list summaries when scanning for
  classifier best_macro_f1 (fixes _find_latest_classifier_bench)
- cforch: stale _BENCH_RUNNING flag now auto-resets if process exited;
  idle timeout (120s via select) kills hung benchmark if node crashes
- api: add /api/finetune/{run,cancel} backward-compat shims while
  ClassifierTab fine-tune section is migrated to TrainJobsView
- ClassifierTab: migrate all /api/benchmark/* paths to /api/cforch/*;
  fix null-safety on results.models access; load fine-tuned models from
  /api/train/results instead of /api/finetune/status
- CompareTab: extend model picker to include vllm + cf-text alongside
  ollama, grouped by service; pre-select all LLM_SERVICES on load
- LlmEvalTab: null-safety on quality_by_task_type lookups
- models: AVOCET_MODELS_DIR env var overrides default models/ path
2026-05-11 09:05:12 -07:00
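
The stale `_BENCH_RUNNING` auto-reset described in the commit above can be sketched as a check that trusts the flag only while the child process is still alive. The `BenchState` class and its attribute names are hypothetical; the only detail taken from the log is that a process whose `poll()` returns `None` counts as still running.

```python
import subprocess
from typing import Optional


class BenchState:
    """Tracks whether a benchmark subprocess is running (hypothetical helper)."""

    def __init__(self) -> None:
        self.running = False
        self.proc: Optional[subprocess.Popen] = None

    def is_running(self) -> bool:
        # poll() returns None while the process is alive and an exit code
        # once it has finished. A set flag with a dead or missing process
        # is stale, so clear it instead of rejecting new runs forever.
        if self.running and (self.proc is None or self.proc.poll() is not None):
            self.running = False
            self.proc = None
        return self.running
```

A start endpoint would call `is_running()` first and return a conflict response only when it is still `True` after the check.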
71bf88d09b feat: implement results table, rating buttons, export UI, and a11y polish 2026-05-11 08:16:52 -07:00
bc4ca1095c feat: add embed-compare route, sidebar nav entry, and full input UI 2026-05-11 08:14:30 -07:00
b6aed3dd1b chore: add pagepiper imitate entry and embed_bench section to config example 2026-05-11 08:11:30 -07:00
1ad7ba322a feat: add embed-bench rate and export endpoints 2026-05-11 08:07:17 -07:00
32e3b2a0dd feat: add embed-bench run endpoint with SSE streaming 2026-05-07 09:05:34 -07:00
12117ad0c6 fix: narrow exception types in get_models, fix patch targets in tests, add type annotation 2026-05-07 09:03:37 -07:00
5939c67b9f feat: add embed-bench models endpoint and register router in aggregator 2026-05-07 09:01:25 -07:00
5ea77da97d fix: add _cosine dimension guard, fix return type annotation, add zero-vector test 2026-05-07 08:59:24 -07:00
276bdadb92 feat: add embed_bench module scaffold and _cosine() helper 2026-05-07 08:37:18 -07:00
6f9aad126e docs(readme): landing page rewrite — three-stage pipeline explained, full CLI reference, data flow diagram, label table 2026-05-06 08:51:46 -07:00
258bbdc0af chore(deps): fix 10 Dependabot CVEs — vite 7.3.2, defu 6.1.7, yaml 2.8.4, picomatch 4.0.4, undici 7.25.0 2026-05-06 08:41:05 -07:00
32872d1ec6 fix: assigned-only state, remove dead HfNodeModelPanel prop, deduplicate yaml example 2026-05-05 22:11:02 -07:00
1521198cb1 fix: code quality fixes from review (SSE abort, aria-live, shared types, type safety)
- Add AbortController to SSE pull stream in OllamaModelPanel; abort on unmount
- Fix SSE loop: break on success/error events, call fetchModels() after the loop
- Add AbortController to fetchModels() and fetchProfile() one-shot fetches
- Add onUnmounted cleanup to both panel components
- Extract GpuEntry, ServiceInfo, NodeSummary to web/src/types/nodes.ts
- Remove duplicate interface definitions from NodeCard, GpuRow, NodeManagementView
- Fix aria-live regions: persistent container with v-if on inner span (avoids
  screen reader announcement miss on initial mount)
- Tighten STATE_LABELS/STATE_ICONS to Record<ServiceState, string> for exhaustiveness
- Add explicit (await r.json()) as NodeSummary[] cast in fetchNodes()
2026-05-05 21:35:13 -07:00
8dda040480 fix: move /nodes route immediately after /fleet per spec 2026-05-05 21:29:35 -07:00
bf675ed1f6 feat: add OllamaModelPanel and HfNodeModelPanel Vue components 2026-05-05 21:24:38 -07:00
0efd1aedbe feat: add NodeCard, GpuRow, ServiceBadge Vue components 2026-05-05 21:24:32 -07:00
4c225b94f5 feat: add /nodes route, AppSidebar nav item, and NodeManagementView 2026-05-05 21:24:27 -07:00
1cd9c5d455 fix: move json import to module scope in nodes.py 2026-05-05 21:01:32 -07:00
5702a7190b feat: add Ollama list/pull-SSE/delete endpoints 2026-05-05 20:41:29 -07:00
55b017ba3b fix: log coordinator reload failures in update_gpu_services
- Replace bare `except Exception: pass` with `except Exception as exc` and a
  logger.warning call that surfaces node_id and the exception for diagnostics.
- Move `import os as _os` from mid-file (between test functions) to the
  top-level import block to satisfy PEP 8 and linter expectations.
2026-05-05 20:36:08 -07:00
f952ec8971 feat: add profile endpoint and GPU service assignment with compatibility check 2026-05-05 20:33:41 -07:00
fd8cb622a1 feat: add GET /api/nodes-mgmt/nodes/{node_id}/profile endpoint 2026-05-05 20:31:22 -07:00
47cb9f661f fix: narrow exception handling in list_nodes, move mock imports to top
- Remove redundant httpx.ConnectError from nodes except clause (it's a
  subclass of HTTPError so the tuple catch was redundant)
- Narrow services except clause from bare Exception to httpx.HTTPError,
  add logger.warning with coordinator_url for debuggability
- Move `from unittest.mock import MagicMock, patch` from mid-file to
  the top-of-file import block with the other stdlib/third-party imports
2026-05-05 20:18:50 -07:00
c2de9e53da feat: implement GET /api/nodes-mgmt/nodes with coordinator proxy and profile merge 2026-05-05 20:16:06 -07:00
c039ea4698 fix: remove unused imports and em dash in nodes.py scaffold
- Drop unused StreamingResponse import from app/nodes.py (will be
  re-added in Task 2 when the SSE endpoint is implemented)
- Replace em dash with colon in _get_ollama_url HTTPException detail
- Remove unused os and unittest.mock imports from test_nodes.py
  (mock imports will return in Task 2 tests)
2026-05-05 19:59:32 -07:00
95afddb772 feat: add nodes.py scaffold with set_config_dir and router mount
- Create app/nodes.py with _CONFIG_DIR testability seam, _load_config,
  _profiles_dir, _profile_path, _load_profile, _get_ollama_url helpers,
  and stub list_nodes endpoint returning [] when no coordinator_url is set
- Mount nodes router at /api/nodes-mgmt in app/api.py
- Add profiles_dir comment to config/label_tool.yaml.example cforch section
- Create tests/test_nodes.py with autouse fixture and two passing tests
2026-05-05 19:35:28 -07:00
cbe8c0f03e feat(benchmark): wire EmbeddingKNNAdapter into MODEL_REGISTRY; add embed_model config
- Add embed_model: nomic-embed-text to config/label_tool.yaml (local, gitignored)
- Add # embed_model: commented example to config/label_tool.yaml.example
- Add pyyaml>=6.0 to requirements.txt (explicit dep for _resolve_urls yaml.safe_load)
- Add params assertion to test_embed_knn_nomic_registry_entry
2026-05-05 14:05:45 -07:00
5df33b0f41 feat(benchmark): wire EmbeddingKNNAdapter into MODEL_REGISTRY as embed-knn-nomic 2026-05-05 12:43:48 -07:00
41584de5df fix(benchmark): guard empty exemplars, warn on malformed JSON in build_exemplars_from_jsonl 2026-05-05 12:41:46 -07:00
1d4c07e4a0 feat(benchmark): add build_exemplars_from_jsonl() for k-NN seed 2026-05-05 11:43:12 -07:00
e823b5e76d fix(classifier): majority-vote key, partial-load guard, sparse label test 2026-05-05 11:39:24 -07:00
88bc6bed67 feat(classifier): implement EmbeddingKNNAdapter.classify() with k-NN vote 2026-05-05 08:04:54 -07:00
4a64a6686d fix(classifier): atomic embed assignment, logging on orch failure, guard double load 2026-05-05 07:53:15 -07:00
f2f150b4fb feat(classifier): implement EmbeddingKNNAdapter.load() and unload() 2026-05-05 07:12:53 -07:00
72449561cf feat(classifier): add EmbeddingKNNAdapter skeleton and constructor tests 2026-05-05 06:08:21 -07:00
c177fb1628 fix(classifier): quality fixes for DEFAULT_EXEMPLARS — remove forward __all__ entry, tighten tests, fix survey exemplar 2026-05-04 20:03:18 -07:00
3be5055e31 feat(classifier): add DEFAULT_EXEMPLARS for embedding k-NN fallback 2026-05-04 17:44:44 -07:00
78b64d007d feat(classifier): add _cosine() helper for embedding similarity 2026-05-04 17:41:45 -07:00
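
The classifier commits above build a k-NN vote on top of a cosine-similarity helper. The sketch below shows one way those pieces fit together; it is not the repository's code. The names `cosine` and `knn_label` are hypothetical, and both raising on a dimension mismatch and returning 0.0 for a zero vector are assumptions, since the log only says that a guard and a zero-vector test were added.

```python
import math
from collections import Counter
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity with a dimension guard and zero-vector handling."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def knn_label(
    query: Sequence[float],
    exemplars: list[tuple[Sequence[float], str]],
    k: int = 3,
) -> str:
    """Return the majority label among the k exemplars most similar to query."""
    if not exemplars:
        raise ValueError("no exemplars to vote with")
    ranked = sorted(exemplars, key=lambda e: cosine(query, e[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    # On a tie, most_common keeps first-seen order, so the label of the
    # nearest neighbor among the tied labels wins.
    return votes.most_common(1)[0][0]
```

In a real adapter the vectors would come from an embedding model rather than being written by hand, and `k` would be tuned against labeled data.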
bce932461a feat: plans benchmark harness — model scoring for CF planning prompts
Adds benchmark_plans.py script, plans_bench API router, PlansBenchTab Vue
component, and registers /api/plans-bench in api.py. Also extends models
registry (cf-text catalog integration), cforch client, LlmEvalTab, and
ModelsView with cf-orch fleet support. Wires Planning mode into BenchmarkView.
2026-05-02 23:36:04 -07:00
e11db5ccd9 fix: align train job/results API envelope, config_json key, progress SSE, dashboard model_key
- GET /api/train/jobs now returns {"jobs":[...]} instead of bare array
- GET /api/train/results now returns {"results":[...]} instead of bare array
- POST /api/train/jobs body key renamed config -> config_json to match Pydantic model
- SSE log handler now handles 'progress' event type (backend never emits 'log')
- Dashboard _get_active_jobs() adds model_key to SELECT and return dict
- corrections.py docstring updated: both /api/corrections and /api/sft prefixes noted
- test_train.py assertions updated to unwrap new envelope shapes
2026-05-02 21:22:18 -07:00
13d1a394d5 fix: add loading state, widen nullable types, add API response guard in TrainResultsView 2026-05-02 20:49:34 -07:00