Executive Summary

The engagement lead completed a targeted code quality and maintainability audit of sampled architectural hotspots within the repository. The audit revealed a high maintainability risk characterized by systemic evolutionary technical debt across the inspected modules. This includes massive God objects, deeply nested control flow, exact structural duplication across module boundaries, and overzealous exception handling. However, the evidence strongly points to organic, human-driven development rather than machine-generated code, resulting in a low AI-slop confidence rating. The structural decay within these hotspots stems from intense domain complexity, rapid multi-language feature integration, and highly concurrent operations rather than low-judgment automated pattern generation.

Background

The repository contains the source code for the Brokk application, encompassing core orchestration logic, AST-based static analysis components, and CLI interfaces. The audit scope targeted the primary Java source directories across the app, brokk-core, and brokk-shared subprojects to assess structural integrity, testing discipline, abstraction utilization, and error-handling maturity within bounded samples.

Methodology

The auditor evaluated maintainability signals via static analysis tools targeting cognitive complexity, size and sprawl, structural duplication, error-handling smells, dead abstractions, test-signal degradation, and comment-intent density. Candidate findings were filtered through agent-led triage and validated via targeted source review.

The investigation was constrained by tool budgets (nominally 5 tool calls per specialist lane, although individual lanes report between 5 and 7) and static analyzer memory limits (candidate symbols capped at 150, reference evaluation capped at 50 file references). Consequently, the findings represent a bounded sample of architectural hotspots rather than a complete repository inventory.

The overall quality of the audited codebase sample is reflected in the scorecard below.

Findings

Size, Sprawl, and Cognitive Complexity

The auditor observed systemic size sprawl and control-flow density across multiple core architectural boundaries. Several distinct core classes function as massive aggregation points or monolithic dispatchers. While these files perform highly integrated domain tasks—such as AST parsing and CLI routing—their current size and cyclomatic footprint present a severe maintainability risk.

File hotspot distribution

brokk-shared/src/main/java/ai/brokk/analyzer/TreeSitterAnalyzer.java · LOC 5519 · 100% · Measured
app/src/main/java/ai/brokk/ContextManager.java · LOC 2983 · 100% · Measured
app/src/main/java/ai/brokk/acp/BrokkAcpAgent.java · LOC 2962 · 100% · Measured
app/src/main/java/ai/brokk/cli/BprCli.java · Cyclomatic 178 · 100% · Measured
app/src/main/java/ai/brokk/EditBlock.java · Cognitive 65 · 100% · Measured
app/src/main/java/ai/brokk/AnalyzerWrapper.java · Cognitive 62 · 100% · Measured

Structural Duplication

The engagement lead identified exact code duplication spanning module boundaries and sibling implementations within the audited scope. Rather than relying on common libraries or base class inheritance, heavy utility methods and boilerplate AST routines were copied verbatim across the system.

File list with notes

app/src/main/java/ai/brokk/tools/SearchTools.java

This utility class (over 2,400 LOC) is duplicated with 100% token similarity between the app and brokk-core modules instead of living in a shared library.

brokk-shared/src/main/java/ai/brokk/analyzer/JavaAnalyzer.java

Highly detailed AST analysis methods for smell detection are copy-pasted across at least eight separate language analyzer implementations instead of utilizing inheritance from TreeSitterAnalyzer.
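
The consolidation described above can be sketched as a template-method arrangement: the shared routine lives once in a base class, and each language analyzer supplies only the part that differs. This is a minimal illustration, not the repository's code. The class names, the findCatchNodeIndexes method, and the flat list of node-type strings are all hypothetical stand-ins, since the real TreeSitterAnalyzer signatures are not shown in this report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a shared base class (not the real TreeSitterAnalyzer API).
abstract class SmellAnalyzerSketch {
    // Language-specific hook: the only piece each subclass has to supply.
    protected abstract String catchNodeType();

    // Shared routine, written once instead of being copied into every language analyzer.
    // Returns the positions of catch-like nodes in a flattened list of node types.
    public List<Integer> findCatchNodeIndexes(List<String> nodeTypes) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < nodeTypes.size(); i++) {
            if (catchNodeType().equals(nodeTypes.get(i))) {
                hits.add(i);
            }
        }
        return hits;
    }
}

// Illustrative subclasses; the node-type strings are assumptions chosen for the example.
class JavaSmellAnalyzerSketch extends SmellAnalyzerSketch {
    @Override
    protected String catchNodeType() {
        return "catch_clause";
    }
}

class PythonSmellAnalyzerSketch extends SmellAnalyzerSketch {
    @Override
    protected String catchNodeType() {
        return "except_clause";
    }
}
```

With this shape, a fix to the shared loop lands in one place, and adding a ninth language would mean overriding a single hook rather than copying the routine again.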

Failure Handling, Test Signal, and Comment Intent

The audit confirmed brittle failure handling patterns, misleading test assertions, and passive warning commentary across independent core files. Exception handling is frequently misused for standard control flow, and overly broad exception catches mask critical JVM failures within the inspected sample.

File list with notes

app/src/main/java/ai/brokk/AnalyzerWrapper.java

Overzealous catching of Throwable masks severe system errors and prevents graceful crashing under conditions like OutOfMemoryError.

app/src/main/java/ai/brokk/EditBlock.java

Exceptions are thrown and then swallowed by empty catch blocks to verify basic text replacement, so exceptions serve as ordinary flow control.

app/src/test/java/ai/brokk/ContextCompressionTest.java

A tautological assertion directly compares an input object to itself, entirely bypassing the behavioral validation of the compression routine.

app/src/main/java/ai/brokk/agents/ArchitectAgent.java

Frustration-driven inline comments flag tight coupling and bug workarounds, masking deep systemic issues behind emotional venting rather than actionable tracking tickets.
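
To illustrate the Throwable finding, the sketch below shows a recovery wrapper that catches Exception only, so that JVM-level Error conditions reach the caller instead of being absorbed. RecoverySketch and loadOrFallback are hypothetical names introduced for this example; they do not reflect the actual AnalyzerWrapper code, which the report does not reproduce.

```java
import java.util.function.Supplier;

// Hypothetical helper, not taken from the audited repository.
class RecoverySketch {
    // Recovers from ordinary failures by returning the fallback value.
    // Because only Exception is caught, an Error (for example OutOfMemoryError)
    // is not intercepted here and propagates to the caller.
    static <T> T loadOrFallback(Supplier<T> loader, T fallback) {
        try {
            return loader.get();
        } catch (Exception e) {
            return fallback;
        }
    }
}
```

A call such as RecoverySketch.loadOrFallback(() -> riskyLoad(), defaultValue), where riskyLoad is a placeholder, still degrades gracefully on a RuntimeException while leaving fatal JVM conditions visible.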

Validated Non-Findings

The auditor verified that the sampled core abstractions remain highly utilized and well-integrated. A dead-code audit of selected core orchestration classes and decoupling layers confirmed that the inspected interfaces are actively wired across multiple production views and test harnesses. No high-severity dead code residue or redundant one-call abstractions were identified within the inspected boundary.

Actively Wired Core Interfaces

app/src/main/java/ai/brokk/ContextManager.java

app/src/main/java/ai/brokk/AbstractService.java

app/src/main/java/ai/brokk/SessionManager.java

Recommendations

The following concrete steps are recommended to remediate the identified technical debt and structural risk within the cited hotspots:

Extract Edit Pre-Flight Logic: Decompose app/src/main/java/ai/brokk/EditBlock.java by moving the block pre-resolution and file validation loop into a package-private helper, reducing nested cognitive complexity.
Simplify Analyzer Recovery Chains: Refactor app/src/main/java/ai/brokk/AnalyzerWrapper.java to separate language delegate loading from targeted state repair, eliminating deep exception-handling loops.
Consolidate Search Tools: Migrate the identically duplicated SearchTools.java implementations from the app and brokk-core modules into a unified shared module library.
Promote AST Boilerplate: Extract the copied findExceptionHandlingSmells and findTestAssertionSmells methods from individual language analyzers up into the TreeSitterAnalyzer base class.
Refactor Throwable Catches: Replace instances of catching Throwable in app/src/main/java/ai/brokk/AnalyzerWrapper.java and app/src/main/java/ai/brokk/Llm.java with explicit catches for Exception to avoid masking JVM-level Error states.
Eliminate Control-Flow Exceptions: Refactor app/src/main/java/ai/brokk/EditBlock.java to use boolean return states for text presence verification rather than throwing and swallowing exceptions in empty catch blocks.
Fix Tautological Tests: Correct app/src/test/java/ai/brokk/ContextCompressionTest.java to assert the output of the compression invocation against an expected value, rather than comparing the object to itself.
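
As one possible shape for the "Eliminate Control-Flow Exceptions" step, the sketch below reports a missing search string through its return value instead of a thrown exception. It uses Optional rather than the bare boolean named in the recommendation, which is a substitution made for this example so that the result and the presence flag travel together. ReplaceSketch and replaceFirstExact are hypothetical names, and the real EditBlock matching logic is assumed to be more involved than this.

```java
import java.util.Optional;

// Hypothetical helper, not the actual EditBlock implementation.
class ReplaceSketch {
    // Replaces the first exact occurrence of 'search' in 'content'.
    // Returns Optional.empty() when 'search' is empty or absent,
    // so a miss is an ordinary return value rather than an exception.
    static Optional<String> replaceFirstExact(String content, String search, String replacement) {
        int at = content.indexOf(search);
        if (search.isEmpty() || at < 0) {
            return Optional.empty();
        }
        String updated = content.substring(0, at) + replacement + content.substring(at + search.length());
        return Optional.of(updated);
    }
}
```

Callers can then branch with isPresent() or orElse(...) and no empty catch block is needed for the not-found case.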

Specialist Evidence Synthesis

The complete collection of parsed outputs and verified findings from the specialist agent lanes is detailed below.

Specialist lane summary

Cognitive Complexity Specialist

code-quality-cognitive-complexity

clean

EditBlock.apply (cognitive complexity 65, cyclomatic complexity 35), AnalyzerWrapper.loadOrCreateAnalyzer (cognitive complexity 62, cyclomatic complexity 47), and Llm.doSingleStreamingCallInternal (cognitive complexity 47, cyclomatic complexity 33) form the core high-density execution centers. No low-judgment AI-generated slop was detected; findings are high-cohesion, organic human technical debt stemming from system coordination, multi-threaded callbacks, and AST mismatch repair logic.

Deeply nested search/replace block resolution and application with high cognitive complexity (65) and cyclomatic complexity (35).
Highly nested multi-language analyzer coordination and state recovery with high cognitive complexity (62) and cyclomatic complexity (47).
Highly branched streaming state machine and concurrency synchronizer with high cognitive complexity (47) and cyclomatic complexity (33).
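
One common way to lower the nesting that drives such scores is to pull a validation loop into a small helper that uses early exits instead of nested conditionals. The sketch below assumes a simplified Block record and a set of known file names. Both are hypothetical and far simpler than whatever EditBlock.apply actually handles.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical pre-flight helper, not code from the audited repository.
class PreflightSketch {
    // Simplified stand-in for a search/replace block.
    record Block(String file, String search) {}

    // One flat loop: each check either records a problem and moves on, or falls through.
    static List<String> validate(List<Block> blocks, Set<String> knownFiles) {
        List<String> problems = new ArrayList<>();
        for (Block b : blocks) {
            if (b.file() == null || b.file().isBlank()) {
                problems.add("missing file name");
                continue;
            }
            if (!knownFiles.contains(b.file())) {
                problems.add("unknown file: " + b.file());
                continue;
            }
            if (b.search().isEmpty()) {
                problems.add("empty search text: " + b.file());
            }
        }
        return problems;
    }
}
```

The caller can stop when the returned list is non-empty, which keeps the application logic free of validation branches.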

Limits: The budget of 5 tool calls prevented scanning of further files but was sufficient to profile the three hotspots listed above.

Size & Sprawl Specialist

code-quality-size-sprawl

clean

Five size hotspots have been identified and measured, three god classes and two oversized methods: TreeSitterAnalyzer (5,519 LOC, 371 functions, 302 members), ContextManager (2,983 LOC, 289 functions), BrokkAcpAgent (2,962 LOC, 183 functions), BprCli.call() (712 LOC, CC 178), and BrokkCoreMcpServer.toolSpecifications (665 LOC, CC 62). All five show significant size overgrowth, but their shapes represent dense domain integration and organic feature accumulation (e.g. AST parsing, multi-pass editing, complex CLI routing, websocket-auth flows) rather than low-judgment AI generation.

The TreeSitterAnalyzer class spans 5,519 lines of code, holding 371 functions and 302 direct members, acting as a massive plumbing and representation god object for code analysis.
ContextManager is an oversized god class with 2,983 lines, 289 functions, and 205 direct members, consolidating state, LLM parameters, prompt crafting, and listener setups.
The BrokkAcpAgent class spans 2,962 lines, 183 functions, and 191 direct members, violating single-responsibility principles by mixing auth flow, session setup, and global preference applications.

Limits: The step budget capped the search expansion after 7 tool calls, focusing on the primary hotspot cluster files in app/src and the top git churn files. Code structures were verified directly via workspace editable fragments instead of additional read calls.

Structural Duplication Specialist

code-quality-structural-duplication

clean

Two major structural duplication centers are confirmed: (1) The massive SearchTools.java (over 2,400 LOC) is 100% duplicated between app and brokk-core modules instead of using a common library model. (2) Highly detailed AST analysis methods (findExceptionHandlingSmells, findTestAssertionSmells, buildCloneAstSignature) are 100% duplicated across at least eight different language analyzers (such as JavaAnalyzer, PythonAnalyzer, CppAnalyzer, RustAnalyzer, etc.) rather than being inherited from the base TreeSitterAnalyzer or default interface methods.

The massive SearchTools class (over 2400 LOC) is 100% duplicated across app/src/main/java/ai/brokk/tools/SearchTools.java and brokk-core/src/main/java/ai/brokk/tools/SearchTools.java, copying dozens of highly complex utility methods.
Boilerplate AST smell detection and similarity refinement methods are copy-pasted across at least eight language analyzer implementations (Java, Python, C++, Rust, Go, C#, PHP, Scala) instead of being consolidated in their common superclass, TreeSitterAnalyzer.

Limits: The budget limit of 5 tool calls prevented wider analysis across the remaining subprojects and modules.

Error Handling Specialist

code-quality-error-handling

clean

Overzealous catching of Throwable is a common defensive pattern across core services, masking critical non-Exception JVM crashes. Standard flow control in EditBlock.java relies on throwing and swallowing matching exceptions under empty blocks. Background systems (like SessionManager.java's ledger parsing) implement silent log-and-continue loops, which isolate failures but can obscure system and metadata degradation.

Overzealous catching of Throwable instead of specific exceptions masks system errors and prevents graceful crashing under serious JVM conditions (e.g. OutOfMemoryError).
Using exceptions for control flow with empty or comment-only catch blocks in EditBlock.java leads to complex, hard-to-maintain code logic.
Log-and-continue behavior on I/O operations masks silent failures, which can obscure critical bugs or data integrity issues.
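
A log-and-continue loop can keep its fault isolation while still surfacing degradation if it returns the failures alongside the successes. The sketch below is a generic illustration using integer parsing. LedgerSketch, Result, and parseAll are hypothetical names, and the actual SessionManager ledger format is not described in this report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example, not the SessionManager implementation.
class LedgerSketch {
    // Carries both the parsed values and the inputs that could not be parsed.
    record Result(List<Integer> values, List<String> failures) {}

    // Continues past bad lines, but records each one so the caller can
    // log a summary, raise an alert, or fail when the failure count is too high.
    static Result parseAll(List<String> lines) {
        List<Integer> values = new ArrayList<>();
        List<String> failures = new ArrayList<>();
        for (String line : lines) {
            try {
                values.add(Integer.parseInt(line.trim()));
            } catch (NumberFormatException e) {
                failures.add(line);
            }
        }
        return new Result(values, failures);
    }
}
```

The narrow NumberFormatException catch also avoids the broad-catch problem noted above, since unrelated failures are not absorbed.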

Limits: Analyzed the top 10 core service files in app/src directory using the exception analyzer tool, then used target method sources for deep-dive. Budget capped further general search at 5 tool calls.

Dead Code & Abstraction Specialist

code-quality-dead-code

clean

Conducted program-wide static analysis of core orchestrator classes and API execution routers using reportDeadCodeAndUnusedAbstractionSmells. Checked decoupling layers including IAnalyzerWrapper and BlitzForge.Listener. No high-severity dead code declarations or redundant one-call abstractions were confirmed within the inspected scopes. All audited interfaces and listeners are actively and heavily wired across production view panels, core utilities, and test suites, reflecting healthy human evolutionary design rather than AI-generated slop.

Limits: Evaluated 150 candidate symbols across core and executor files; capped symbol reference evaluation at 50 file references due to static analyzer memory constraints.

Test Signal Specialist

code-quality-test-signal

clean

Audited test files, focusing on core orchestrator test classes. Programmatically and manually reviewed assertion patterns to identify false-confidence loops. Confirmed a high-confidence tautology in app/src/test/java/ai/brokk/ContextCompressionTest.java where alreadyCompressed is asserted equal to itself instead of verifying actual compression outputs. Broader test patterns demonstrate solid human-written validation structures with robust assertions, rather than low-judgment automated testing pattern repetition.

Tautological assertion in ContextCompressionTest asserts that an object is equal to itself, creating false confidence with no behavioral validation.
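
The difference between the tautology and a behavioral check can be shown with a small self-contained sketch. The compress method below is a hypothetical stand-in that collapses runs of spaces. It is not the repository's compression routine, and the plain check helper replaces a test framework only to keep the example dependency-free.

```java
// Hypothetical example, not the actual ContextCompressionTest.
class CompressionCheckSketch {
    // Stand-in routine under test: collapses runs of two or more spaces into one.
    static String compress(String input) {
        return input.replaceAll(" {2,}", " ");
    }

    // Minimal assertion helper so the sketch needs no test library.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static void runChecks() {
        String alreadyCompressed = "a b c";
        // Tautology, always true whatever compress does:
        //   check(alreadyCompressed.equals(alreadyCompressed), "...");
        // Behavioral checks: the routine is invoked and its output is compared
        // with an independently stated expected value.
        check(compress(alreadyCompressed).equals("a b c"), "compressed input should be unchanged");
        check(compress("a   b  c").equals("a b c"), "runs of spaces should collapse");
    }
}
```

The first behavioral check keeps the intent suggested by the variable name (already-compressed input stays unchanged) while actually exercising the routine.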

Limits: Budget limit of 5-6 tool calls constrained broader verification across all 565 test files.

Comment Intent Specialist

code-quality-comment-intent

clean

Analyzed comment intent across primary Java components. Discovered inline commentary detailing technical frustration, loose workaround logic, and deep system-level cyclic dependencies. Confirmed frustration-driven comments in ArchitectAgent.java, passive warnings masking legacy coupling in Llm.java, and localized bug overrides. Overall comment density patterns show organic human commentary with informal technical trade-offs and warnings, without any machine-generated slop signatures.

Frustration-driven inline comments (e.g., 'FIXME this should not be fucking necessary') hide real architectural and exception handling issues under emotional venting.
Inline comments documenting hard-to-refactor areas (e.g., 'this should be final but disentangling from ContextManager is difficult') serve as unsafe passive warning signs rather than driving active deprecation or decoupling.
Comments describing workarounds for bugged behavior in other classes indicate a pattern of masking root causes instead of fixing upstream bugs.

Limits: Focused on the main Java orchestration and agent codebase directories.

Conclusion

The inspected sample exhibits high maintainability risk rooted in intense cognitive complexity, massive class sprawl, and localized structural duplication. However, within the bounded sample, the nature of these findings points strongly to evolutionary human development rather than AI-slop generation. The code coordinates complex concurrency, CLI routing, and multi-language AST parsing, all tasks that organically accrue defensive checks and heavy coupling over time. The engagement lead recommends prioritizing the extraction of shared utilities and base-class inheritance to lower the duplication burden within the identified modules first, followed by targeted decomposition of the highest-complexity orchestration loops.
