
We Have the Model, Why do We Need You?

May 17, 2026

I was thinking about how to properly communicate the value of:

- Software Engineering as a discipline, in the era of rapid LLM code generation
- Me as an individual, basically “why should you hire me specifically?”

In The Before Times, the answer was vaguely “I solve your business problem using code”.

This value was pretty clear to Business People, in that they would want a thing to do a thing and had two options:

- duct tape some SaaS functionality together using tools like fivetran, zapier, airtable, notion, sheets/excel, and Email (their favorite!)
- make a request that lands (usually) at the bottom of the Backlog behind all of the product features that Product People fought to have built first.

The duct taping rarely resulted in exactly what they wanted to accomplish, since you can only customize this class of SaaS so much. They did have the option for something bespoke to be built, but it’s at the whim of the Backlog miners having enough resources to reach that part of the mine in time. Obviously this sucks, because they don’t just want it, they want it now!

LLMs come to the scene, then Agent harnesses, and eventually we get The Inflection where they become net-useful, which leads to The Rift where some folks think Models are all we need now and Software Engineering is a dead field.

Now, pretty much anyone with the approval to burn some can spit out Something That Works in anywhere from a few minutes to a few days. If that’s the case, why do they still need software engineers?

Wide Context

Maybe this changes, maybe this is durable, but currently LLMs are kind of like race horses with blinders on. Like a horse, they move very fast towards their goal. This can be a boon for productivity, and for tasks that are Not Quite Scriptable (like refactoring beyond name changes and relocations) it’s a whole new unlock entirely. However, given the blinders, they are unaware of the totality of their surroundings.

The Agent’s awareness is limited to these categories:

- What’s provided to it by the Operator (prompting)
- What’s discoverable (tool usage) and what it knows it should discover (context engineering)

More often than not, things fit into neither of those categories until some failure condition is reached and the Operator makes updates. In the Agent-Operator copilot use-case, the Operator is an eagle keeping an eye on the action below, making sure that the horse is not missing a watering hole it desperately needs to keep going.

Holding a large amount of context at the same time is helpful to:

- Avoid pitfalls and mistakes already made before
- Match problems to solutions (often laterally across domains)
- Integrate disparate systems that are not directly connected

Aesthetic Coherence

You can tell an Agent “only use functions with no side effects” or “encapsulate all dependencies with object inheritance” and it’ll do that. Why will it do that? Well, because you told it to. Models are the modern Golem and will do what you say without much pushback (unless you tell it to push back of course, which may or may not be pushback technically).

If you do not tell it to follow any design pattern in particular, it will often just spit out an incoherent hodge-podge. This is largely because Models are All of the Things, and the threshold between concepts is not always categorical, but rather a gradient. Which things are appropriate for which scenarios is basically just a matter of Taste.

Taste is hard to nail down, but in the context of Software Engineering I would say it’s mostly about values and how you align them to the problem at hand. Sean Goedecke covers something similar in his post on the topic [1]. My take is that it’s a matter of Taste only when it’s a choice of subjectivity.

Here’s an example of two options that are both justifiably chosen depending on the scenario:

Tall Flat Procedures:
- What: these mix control flow with lots of imperative code, minimizing single-caller functions, instead having everything collected into one place.
- Why: your brain does not have to context switch in and out of functions to piece together the totality of what’s happening. No implementation details are hidden, avoiding incorrect assumptions of how data is being transformed.
- However: you have to “structure” with comment-delineated sections, and glue code that doesn’t add much value waters down your business logic.

Declarative Folding:
- What: this looks like a dense outer function that is mostly control flow, with imperative code nested into single-purpose named functions.
- Why: it “folds” or compresses a group of transformations into a verb-noun like fetchWebpage. Inside that function are all sorts of things (retrying, timeouts, TLS handling), but often the caller does not care about this as long as they get the data transformation of URL -> HTML they expect.
- However: it’s not always obvious what can go wrong unless you dive in anyway to read the transformations. In order to get the full picture, you have to assemble a graph in your mind of the operations, context switching in and out of each level. Variable mutations may be more verbose since they have to be defined in the input and output for each function.
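To make the contrast concrete, here is a rough sketch of the same small job written both ways. The task (totalling made-up "name, quantity, unit_price" lines) and every function name are assumptions invented for illustration, not code from any real project.

```python
def summarize_flat(lines):
    """Tall flat style: every step is visible in one place."""
    total = 0.0
    skipped = 0
    for line in lines:
        # --- parse ---
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3:
            skipped += 1
            continue
        # --- validate ---
        try:
            quantity = int(parts[1])
            unit_price = float(parts[2])
        except ValueError:
            skipped += 1
            continue
        if quantity <= 0:
            skipped += 1
            continue
        # --- accumulate ---
        total += quantity * unit_price
    return round(total, 2), skipped


def parse_line(line):
    """Return (quantity, unit_price), or None if the line is unusable."""
    parts = [part.strip() for part in line.split(",")]
    if len(parts) != 3:
        return None
    try:
        quantity, unit_price = int(parts[1]), float(parts[2])
    except ValueError:
        return None
    return (quantity, unit_price) if quantity > 0 else None


def summarize_folded(lines):
    """Folded style: the outer function is mostly control flow."""
    parsed = [parse_line(line) for line in lines]
    valid = [item for item in parsed if item is not None]
    total = sum(quantity * unit_price for quantity, unit_price in valid)
    return round(total, 2), len(parsed) - len(valid)
```

Both return the same (total, skipped) pair for the same input. The flat version shows all three skip conditions at a glance, while the folded version reads like a sentence but makes you open parse_line to learn what "unusable" means.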

So Taste is less “which one is better?” and more “when is which one better?”, and that’s a question answered by a person’s entire career of experience. Humans tend to apply these based partially on principles they form, or at least discrete heuristics. When you ask a Model to code for you, it will choose purely based on some probabilistic relevance to whatever code is within the context window. Unless you consistently steer it towards the direction you want, you can expect the Model to drift from the larger patterns laid out towards its geometric mean.

Some of the most successful large scale software projects like the Linux kernel and the Python language are/were guided by a central visionary. It’s not so much that the decisions made by those figures were the best objectively (accuracy), but that the consistency of the decisions (precision) allows the project to grow in scale while staying coherent.

When I was more junior I heard some senior engineers say something akin to “It’s more important for new work to be consistent with old work than to be good on its own”. I think there is some truth to that. Having a narrow surface area of harmonious design concepts and an understandable architecture is what Good actually looks like. In the era where we increasingly look at code by squinting, I think having focus on the bigger picture is probably the best place for human value to be realized.

Citations

[1] Sean Goedecke, “What is ‘good taste’ in software engineering?”, September 28, 2025.
Goedecke frames engineering taste as choosing the engineering values that fit the project, which is the nearby argument here about coherent design judgement.

Glossary

Model

I just get sick of saying L-L-M all the time.

Agent

LLM (decision) + Context (direction) + Tools (action). Easy-as.
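As a rough sketch of that equation, the loop below wires the three parts together. The "model" here is a scripted stand-in rather than a real LLM call, and the word_count tool, the "observation: " prefix, and all function names are assumptions made up for this example.

```python
from typing import Callable, Dict, List, Tuple

Decision = Tuple[str, str]  # (action name, argument)


def scripted_model(context: List[str]) -> Decision:
    """Hypothetical stand-in for an LLM call: picks the next action from the context."""
    observations = [line for line in context if line.startswith("observation: ")]
    if not observations:
        # Nothing observed yet, so ask for the (made-up) word_count tool.
        return ("word_count", context[0])
    # An observation exists, so report it as the final answer.
    return ("done", observations[-1][len("observation: "):])


def run_agent(
    model: Callable[[List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    prompt: str,
    max_steps: int = 5,
) -> str:
    context = [prompt]                      # Context: direction
    for _ in range(max_steps):
        action, argument = model(context)   # LLM: decision
        if action == "done":
            return argument
        tool = tools.get(action)            # Tools: action
        result = tool(argument) if tool else f"unknown tool {action}"
        context.append(f"observation: {result}")  # results feed back into context
    return "stopped: step limit reached"


tools = {"word_count": lambda text: str(len(text.split()))}
answer = run_agent(scripted_model, tools, "count the words in this prompt")
```

Everything the model decides is driven by what is in the context list, which is why anything that never lands there stays invisible to the Agent.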

The Before Times

Prior to the proliferation of LLMs, the software world had very different unit economics, bottlenecks, and operating procedures. There was always a backlog of requests that grew faster than you could chew through them. Humans wrote documentation only for other humans. Internal tooling was rarely bespoke and was delegated to awkwardly duct-taping together pieces of different SaaS vendors. Everything was done by hand, even the things we didn’t want to do by hand. If you wanted to automate some of that hand-doing, the automation had to be done by hand.

The Rift

Now that code is cheap to generate and anyone can do it, there exists a disagreement as to whether or not Software Engineering is dead. Those convinced of its death are the vested interests (LLM vendors, executives); those who are not are mostly the aggrieved party (namely software engineers). The proponents often misunderstand what Software Engineering is (it’s not really writing code), and tend to have the least experience building on or with LLMs at scale. The Rift is not just whether or not we are there yet, but whether we will ever be there (or should even want to be there).

The Inflection

It’s generally agreed upon by Software Engineers that sometime around November 2025, coinciding with Anthropic’s release of the Opus/Sonnet 4.5 models, LLMs crossed the Rubicons of “useful more often than not” and “net saves time doing work”. Depending on the task, the gains may be minimal (logic-dense and novel code iteration) or massive (perusing and searching through communication/documentation systems).

Software Engineering

The act of:

- theorizing a model of a business and technical domain
- abstracting that theory into automations
- updating those automations congruous with external feedback

All while maintaining or improving system coherence. Generally speaking, writing code is the lowest leverage task in the stack.

Email

(a.k.a. SMTP) Love it or Hate it, it’s still kickin. Yes it could have been X.400 times better but it’s ubiquitous and an open protocol. You can use it for all sorts of stuff, like being notified that your new LLM has broken out of its sandbox while you eat a sandwich in a park.

I think we would enjoy email more if we each had a private address we only used to communicate with other humans and not with services. It’s like having a secret P.O. Box that doesn’t get used adversarially to deliver spam ads.

Business People

Catch-all term for folks in a software company only interested in the What and completely unbothered by the How. “I don’t care how, just get it done by EOW” is the motto.

Sometimes this is positive because it provides big-picture focus where Software Engineers can get a little too fixated on small details. This version of the above phrase translates to “you have my support to do whatever you find is appropriate to meet our goals”. This maximizes agency within the bounds of role-appropriateness; we like this.

Sometimes this is negative because it serves to limit the delivery surface area to the lowest quality outcome, regardless of the complications it brings. This version translates to “I’m not interested in your input on why this will not end up well, do it anyway”. This minimizes agency and disregards the value of Software Engineering beyond merely “outputting code”.

Something That Works

What you have when you limit the success criteria to “I get what I want most of the time”. This is the cousin to “this works on my machine”. This is mostly an illusion, because if you are building a product, you certainly are not the only user. Even if you were the only user, what you want is likely to change over time.

The Gap between this state and a real deliverable is:

- Durability (is the current functionality reliable?)
- Extensibility (can I change this without redesigning it?)
- Comprehensibility (do you understand this, or is it a black box?)

Golem

Models give you what you ask for, not what you want. It’s important to remember that you do not have the same understanding and context as the Model.

Models do not have the same ability to abstract and infer as humans do. Learning how to communicate with Models effectively is thinking deeply about what you want, and clearly specifying it without making assumptions.

Saying “let’s get started” could mean let’s start planning the next actions, let’s await instructions on what to do, or let’s go ahead and just make up a direction out of thin air and implement it immediately, all things I have seen coding Agents do in response to that phrase.

Here’s a humorous example: even when you think you are being clear, there’s always room to misinterpret.

Wife says to her programmer husband, “Go to the store and buy a loaf of bread. If they have eggs, buy a dozen.” Husband returns with 12 loaves of bread.

All of the Things

Flagship Models are pre-trained on trillions of tokens and contain countless examples of every coding pattern you are likely to have learned. They contain libraries on libraries of prose in more styles than someone like me could ever enumerate. All of this is compressed (in DeepSeek V4 Pro’s case) into the space of 63 4K HDR movies. You and I, being human, have opinions that are, proportionately, based mostly on ignorance. I can’t say do X if I don’t know that X exists, so I’m stuck fighting for or against Y instead (shout out Y! u a real one). Since models have all of the patterns and all of the opinions in their latent space, they can be any one of those things with just a little nudge.

Gap

The difference between what you need and what you have. Gaps identify a need for Reflection. In the case of using LLMs, Gaps are a great chance for you to theorize what is going wrong (a core skill of Software Engineering), and encode your learning in a way that will steer the LLM. If you do this right, you will have improved Convergence and will be much better off the next time the same situation arises.

Convergence

The distance between what you know and what an Agent can reliably act on. Through your (or your users’) interactions with an Agent, you should aim to reduce the distance between “what I would have done” and “what it did”. When using Agents as a primary control plane (like code assistance) this is critical, because the LLM is a static model. It does not ever update or learn by interacting with you in the way that you learn from interacting with it. In order to continue to have the LLM “follow” you as you learn what good and bad outcomes look like, you have to provide that signal to the LLM in a durable and accessible way. Until something like continuous fine-tuning or self-adaptive world models exists, this is about the best you’ll get.
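One low-tech way to make that signal durable is a plain notes file that gets put in front of every new task. The sketch below assumes exactly that; the file layout, the function names, and the prompt wording are all invented for illustration and do not describe how any particular Agent harness works.

```python
from pathlib import Path


def record_lesson(notes_path: Path, lesson: str) -> None:
    """Append a correction to the notes file, skipping exact duplicates."""
    existing = notes_path.read_text().splitlines() if notes_path.exists() else []
    if lesson not in existing:
        with notes_path.open("a") as notes:
            notes.write(lesson + "\n")


def build_prompt(notes_path: Path, task: str) -> str:
    """Put every recorded lesson in front of the task so each new session sees it."""
    lessons = notes_path.read_text().strip() if notes_path.exists() else ""
    header = f"Lessons from earlier sessions:\n{lessons}\n\n" if lessons else ""
    return f"{header}Task: {task}"
```

The Model itself never changes here. What changes is the context it starts from, so each Gap you write down narrows the distance for every later session.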