Why LLMs Struggle to Build Software – A Technical Perspective

1. The Essence of Engineering: The Continuous Mental Model Loop

Skilled software developers operate within a tight mental feedback loop:

  1. Envision Requirements – They internalize the project’s goals and constraints.
  2. Write Code – They translate that understanding into functioning implementations.
  3. Interpret Behavior – They test or analyze the code to see how it actually works.
  4. Refine Understanding or Implementation – They reconcile the gap between intention and result, adjusting either the code or their reading of the requirements accordingly.

What distinguishes expert engineers is not raw coding ability but their proficiency in sustaining two mental models in parallel: one reflecting intended behavior, another capturing actual outcomes, and reconciling the two in a disciplined loop.
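
To make the loop concrete, here is a deliberately tiny, runnable Python sketch. The domain (a median function) and every name in it are illustrative only; the point is the shape of the loop, not a prescribed workflow.

    # Step 2: write code (a first attempt, with a plausible bug).
    def median(xs):
        xs = sorted(xs)
        return xs[len(xs) // 2]  # wrong for even-length inputs

    # Step 1: requirements, captured as a checkable claim rather than a hunch.
    def matches_intent(xs, result):
        s = sorted(xs)
        n = len(s)
        expected = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
        return result == expected

    # Step 3: interpret behavior by probing, not by assuming correctness.
    for case in ([1, 3, 2], [1, 2, 3, 4]):
        print(case, "->", median(case), "matches intent:", matches_intent(case, median(case)))

    # Step 4 happens in the engineer's head: the even-length case fails, so
    # either the code is wrong (here it is) or the requirement was misread.
    # The loop repeats until the intended and actual models agree.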

2. Where LLMs Falter: Context & Internal Modelling

Large Language Models (LLMs) excel at generating code and following instructions. They can integrate logging, produce documentation, and even propose tests or fixes. However, they fall short in sustaining coherent mental models, creating a fundamental mismatch with what engineering truly demands.

LLMs tend to assume their generated code is correct, even in the face of failing tests. They’re prone to:

  • Guessing whether to adjust the code or the tests—often without clear rationale.
  • Restarting from scratch rather than iteratively refining their work—losing all previous context and learning.

In contrast, an engineer reflects on failures, analyzes discrepancies, consults documentation or colleagues, and evolves their understanding, never blindly discarding the accumulated context.
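
As a sketch of the difference, consider a repair loop that treats test failures as accumulating evidence instead of a reason to start over. Here, ask_model and run_tests are hypothetical stand-ins, not real APIs; any completion call and test runner could fill those roles.

    def repair_loop(task, ask_model, run_tests, max_attempts=5):
        """Iteratively refine generated code, keeping all prior evidence."""
        history = [f"Task: {task}"]           # the accumulated context
        code = ask_model("\n".join(history))
        for _ in range(max_attempts):
            failures = run_tests(code)        # ground truth, not self-belief
            if not failures:
                return code
            # Keep the failing attempt AND its failures in context, so the
            # next attempt is a refinement rather than a blind restart.
            history.append(f"Attempt:\n{code}\nFailures:\n{failures}")
            history.append("Revise the attempt above; do not start from scratch.")
            code = ask_model("\n".join(history))
        raise RuntimeError("still failing; a human should inspect the gap")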

3. The Contextual Shortcomings of Today’s Models

Several systematic limitations prevent LLMs from closing this gap:

  • Context Omission: They struggle to detect missing pieces in the problem context.
  • Recency Bias: Information introduced earlier in a session is often forgotten or deprioritized.
  • Hallucination: They may fabricate details or misunderstand the existing code logic.

These issues are not merely superficial. Without robust mental coherence, an LLM cannot meaningfully distinguish between design assumptions, implementation details, and emergent behaviors.

4. Can Future Models Overcome These Limitations?

Perhaps. But the answer doesn’t lie solely in expanding the context window. Humans manage mental load by temporarily stashing context, zooming in on the current task, and then resurfacing the broader perspective. LLMs lack such an adaptive memory stack: their context either fills up with irrelevant detail or drops critical information altogether.
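
The analogy can be made concrete with a toy stack, assuming nothing beyond the standard library; this illustrates the missing capability and is not a claim about any real model architecture.

    class ContextStack:
        """Toy model of how a human stashes and resurfaces context."""

        def __init__(self):
            self._stack = []

        def zoom_in(self, broader_context):
            # Stash the surrounding perspective before focusing on a subtask.
            self._stack.append(broader_context)

        def zoom_out(self):
            # Resurface the stashed perspective once the subtask is done.
            return self._stack.pop()

    stack = ContextStack()
    stack.zoom_in("feature: billing export; constraint: no schema changes")
    # ... work on the narrow subtask with only a small working set ...
    print("resume with:", stack.zoom_out())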

To bridge this gap, future model architectures must incorporate richer, structured memory, sophisticated reasoning pathways, and flexible attention mechanisms—modeling not just code generation but context-aware reflection.

5. What Engineers Should Do (Today)

LLMs are not replacements; they are powerful augmentation tools. Use them to:

  • Draft boilerplate code and documentation.
  • Generate initial implementations or sketches.
  • Synthesise requirements into clearer specifications.

But you are still responsible for verifying that generated code matches the intended behavior. Ensure tests are meaningful, challenge the assumptions the model made, and maintain the mental models necessary for true software correctness.
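
As an illustration of what a meaningful check can look like, the following test encodes properties derived from the requirement rather than echoing whatever the code returns. slugify is a hypothetical generated function, used only for the example.

    import re

    def slugify(title):  # imagine this arrived from a model
        return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

    def test_slugify_properties():
        for title in ["Hello, World!", "  spaced  out  ", "mixed 已经 text"]:
            slug = slugify(title)
            # Properties come from the requirement, not the implementation:
            assert slug == "" or re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", slug), slug
            assert slugify(slug) == slug  # applying it twice changes nothing
        # A deliberate question to challenge: should non-ASCII be dropped?
        # The model may have guessed silently; the requirement owner decides.

    test_slugify_properties()

Property-style assertions like these surface hidden assumptions early, which is exactly the reconciliation step the loop in section 1 depends on.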

6. Final Thought: Collaboration, Not Automation

The future of development lies in synergy—humans steering context, logic, and intent; LLMs supplying rapid coding fluency where clarity already exists. Until models can reliably reason, maintain context, and reflect—as we do—human engineers remain indispensable at the helm.