Evidence Log

Changelog

Filed updates from the SlopCop desk: scan plumbing, private repo safeguards, and public case report improvements.

v0.22

Provider expansion, previous-report review, and safer scan outcomes

  • Public scans now support GitLab and Bitbucket repositories alongside GitHub, with provider-aware identity handling, public repository metadata, benchmark examples, and Most Wanted social previews.
  • New scans can review previous reports before generating fresh findings, giving synthesis and lane processing more context when repositories have already been inspected.
  • Partial and no-score scan outcomes are now handled more explicitly with review metrics, stable public failure payloads for Brokk executor transport errors, and clearer report canonicalization for incomplete evidence.
  • Scan intake and report flows now include stronger privacy and retention hints, repo-specific metadata in closed reports, updated default Gemini model identifiers, and new footer and Discord support affordances.
  • Hotfix: live case routing now classifies infrastructure-heavy repositories more reliably and keeps static Records and git-hotspot prioritization identities off the specialist board.

v0.21

Report navigation, specialist routing, and Night Shift upgrades

  • Closed reports gained a table of contents, improved Markdown rendering, repository-type summary cards, clearer score-band verdicts, and better report UI affordances for navigating larger case files.
  • Specialist routing and synthesis now account for repository type, lane-local evidence, live specialist progress, and broader infrastructure/GitOps classification so reports stay better scoped to the code under review.
  • Night Shift now includes powerups, music, sound effects, score submission, leaderboards, and admin leaderboard controls with privacy coverage for the new gameplay telemetry.
  • Production diagnostics and deployment readiness were hardened with buffered Brokk progress logs, safer upstream proxy failure handling, frontend-only health checks, and corrected frontend container dependency installs.

v0.20

Reader feedback, safer failures, and report contract hardening

  • Completed scans now support reader disagreement feedback end to end, including admin review tooling, private-scan redaction, and updated privacy and security disclosures for retained notes.
  • Admin access and public case handling were tightened with safer redirect return targets, dossier links back to scanned public GitHub repositories, and case UI copy that hides raw structured failure payloads while preserving plain-text errors.
  • Brokk scan recovery and benchmark report contracts were hardened so partial specialist lanes can recover from terminal-tool failures and downstream provenance and dashboard semantics stay consistent.

v0.19

Low-sniff digest delivery and stricter AI-slop calibration

  • Closed reports now support low-sniff digest output end to end, including client parsing, server persistence, scan flow handling, and email delivery.
  • Closed-report digest coverage was expanded so low-sniff cases are stored and surfaced more consistently across the case lifecycle.
  • Specialist and synthesis prompts now use tighter shared AI-slop calibration guidance with more conservative, evidence-first framing and fewer duplicated prompt rules.

v0.18

Closed-case sharing and specialist routing hardening

  • Specialist hotspot routing now relies on lane-local evidence with safer dominant-cluster handling for monorepos and tooling-heavy trees, so prompt guidance stays aligned with each lane's actual seed packet.
  • Closed case reports can now be copied or saved directly as Markdown case files, and published case sharing to X uses corrected messaging and link handling.
  • Closed-case specialist tabs now keep evidence-bearing fallback memos visible, hide only true placeholder rows, and recover invalid tab selections more cleanly.
  • Brokk analyzer launches now default to an I/O concurrency cap of 24, with a documented `SLOPCOP_BROKK_IO_MAX_CONCURRENCY` override for constrained hosts.
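
For operators on constrained hosts, the concurrency override described above might be resolved roughly as follows. This is a minimal sketch, not SlopCop's actual launcher code: the environment variable name and the default of 24 come from the entry above, while the function name `resolve_io_max_concurrency` and the fallback-on-invalid-input behavior are assumptions made for illustration.

```python
import os

# Default cap taken from the changelog entry above.
DEFAULT_IO_MAX_CONCURRENCY = 24


def resolve_io_max_concurrency(environ=None):
    """Return the I/O concurrency cap, honoring the documented override.

    Hypothetical helper: falling back to the default on blank, non-numeric,
    or non-positive values is an assumption, not documented behavior.
    """
    env = os.environ if environ is None else environ
    raw = env.get("SLOPCOP_BROKK_IO_MAX_CONCURRENCY", "").strip()
    if not raw:
        return DEFAULT_IO_MAX_CONCURRENCY
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_IO_MAX_CONCURRENCY
    # A cap below 1 would stall all I/O, so treat it as invalid.
    return value if value >= 1 else DEFAULT_IO_MAX_CONCURRENCY
```

Under these assumptions, a host with limited file descriptors could export `SLOPCOP_BROKK_IO_MAX_CONCURRENCY=8` before launch to lower the cap.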

v0.17.1

Share-to-socials hotfix

  • Published case files now share more cleanly to X with corrected message formatting and link handling.
  • Closed reports can now be copied or saved as Markdown case files directly from the case footer for easier off-platform sharing and archival.

v0.17

Specialist hotspot routing hardening

  • Specialist prompts now receive dominant hotspot routing hints only when their own scoped seed packet supports a clear cluster, so lane-local evidence can diverge safely.
  • Dominant-cluster grouping now handles monorepo-style package layouts more safely and avoids treating tooling-heavy implementation trees as wrapper noise by default.
  • Single-cluster seed responses now keep their routing hint instead of silently falling back to a flat seed list when all evidence already agrees on one subtree.
  • Regression coverage now exercises lane-local routing, monorepo clustering, tooling-heavy repos, and the single-cluster path so specialist hotspot guidance is harder to misroute.
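
The routing behavior in this release can be illustrated with a small sketch. Everything here is hypothetical: the function names, the 0.6 share threshold, and the list of monorepo root directories are assumptions chosen to show the idea of a lane-local dominant cluster, not the production routing logic.

```python
from collections import Counter
from typing import Optional, Sequence

# Assumed monorepo roots; real layouts may use other directory names.
MONOREPO_ROOTS = {"packages", "apps", "services"}


def cluster_key(path: str) -> str:
    """Map a seed path to a cluster: the package directory for
    monorepo-style layouts, otherwise the top-level directory."""
    parts = [p for p in path.split("/") if p]
    if len(parts) >= 3 and parts[0] in MONOREPO_ROOTS:
        return "/".join(parts[:2])
    if len(parts) >= 2:
        return parts[0]
    return ""  # root-level file, belongs to no cluster


def dominant_cluster(seed_paths: Sequence[str], min_share: float = 0.6) -> Optional[str]:
    """Return a routing hint only when this lane's own seed packet
    clearly supports one cluster; otherwise return None."""
    keys = [cluster_key(p) for p in seed_paths]
    if not keys:
        return None
    key, count = Counter(keys).most_common(1)[0]
    if key == "":
        return None
    # A single agreed-upon subtree has share 1.0, so it keeps its hint.
    return key if count / len(keys) >= min_share else None
```

Because the hint is computed per seed packet, two lanes reviewing the same repository can receive different hints, or none at all, without contradicting each other.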

v0.16

Lane durability and benchmark hardening

  • Specialist lane processing now persists raw outputs and normalization artifacts separately, preserves failed-lane evidence more reliably, and recovers retries from durable per-agent state.
  • Analyzer warmup retries now stop cleanly on cancellation, reacquire the restarted static-analysis client correctly, and keep live debug-log access available during active scans.
  • AI-slop judgment now keeps confidence distinct from maintainability risk with stricter prompt guidance, validation, parsing, and case dashboard rendering.
  • Benchmark runs reuse cached repository sources more safely with validated app-data caches, stronger workspace materialization, and clearer seed-helper commands for local operators.

v0.15

Report-only hardening and benchmark forensics

  • Synthesis now runs in Brokk report-only mode with stronger prompt guardrails, scoped non-finding handling, and clearer structured evidence requirements in final reports.
  • Timed-out Brokk jobs drain more safely, model fallback handling is sturdier, and scan reports preserve missing-file and fallback cases more cleanly in production.
  • Admins gained richer report dashboards, vertical model comparisons, earlier repository stats, and clearer scorecard handling for incomplete or unknown evidence.
  • Benchmark seeding is more reproducible and the sniff-test rubric now scores report quality, repository risk, and confidence with broader reliability coverage.
  • The deploy image and runtime inputs were tightened by pinning the Brokk Docker build commit, fixing backend release metadata, and upgrading npm dependencies for security.

v0.14

Benchmarking and operator telemetry

  • Specialist scans now use timeout budgets with drain handling so timed-out Brokk jobs are capped and cleaned up more predictably.
  • Admins gained model telemetry, executor memory/runtime metrics, richer diagnostics, and detailed scan report views for production triage.
  • Benchmark tooling can seed isolated model/reasoning matrices, and seeded queues now randomize repo, model, reasoning, and repetition distribution before draining.
  • Per-scan Brokk debug logs, release-version metadata, and non-blocking public scan sniff tests make production cases easier to audit after the fact.
  • Brokk-backed analysis now runs through the read-only executor sandbox, with launch argument fixes for the pinned deploy executor.
  • Partial specialist timeout recovery preserves usable lane evidence and improves incomplete scan visibility in the admin dashboard.
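
The timeout-budget-with-drain pattern mentioned in this release can be sketched with standard `asyncio` primitives. This is an illustrative outline under stated assumptions: the name `run_with_budget`, the status strings, and the two-phase budget/drain split are hypothetical and not taken from the SlopCop or Brokk code.

```python
import asyncio


async def run_with_budget(coro, budget_s: float, drain_s: float):
    """Run a job under a time budget, then allow a bounded drain window.

    Returns (status, result), where status is "completed", "drained"
    (cancelled and cleaned up in time), or "abandoned" (cleanup exceeded
    the drain window).
    """
    task = asyncio.ensure_future(coro)
    done, _ = await asyncio.wait({task}, timeout=budget_s)
    if done:
        # Re-raises the job's own exception, if it failed within budget.
        return "completed", task.result()
    # Budget exhausted: request cancellation, then cap the cleanup time.
    task.cancel()
    drained, _ = await asyncio.wait({task}, timeout=drain_s)
    return ("drained" if drained else "abandoned"), None
```

Capping the drain separately from the budget means a job that ignores cancellation cannot hold a scan open indefinitely, and the caller can still record which lanes ended cleanly.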