You've probably had the experience by now. You type something into an AI tool (a brief, a concept direction, a request to explain a design principle to a client) and what comes back is genuinely impressive. Articulate. Structured. Almost exactly what you needed. You feel a little thrill. This thing is incredible.
Then, a day later, you ask it something simpler. A factual question. A straightforward request. And it gives you an answer that's completely, confidently wrong. Not vague wrong. Detailed, fluent, and wrong. And you sit there thinking: what is going on with this thing?
Here's the honest answer: nothing went wrong. That swing from brilliant to baffling isn't a bug or a bad day. It's the nature of the medium. And once you understand why it happens, your whole relationship with AI stops being a guessing game.
Every tool you know works the same way
Think about the software you use every day. Figma. Photoshop. A spreadsheet. They're all built on the same fundamental logic: you give an instruction, the software executes it. Duplicate a layer, the layer duplicates. Apply a mask, the mask applies. Same input, same output, every time.
This is called deterministic behaviour: the outcome is determined entirely by your instruction. The predictability is the whole point. These tools obey.
Generative AI does something completely different. It doesn't obey. It interprets. When you type a prompt, the model isn't executing a command. It's doing something closer to a very sophisticated guess: predicting what response would most naturally follow from what you've written, based on patterns absorbed from an enormous amount of human-generated text and data. Every word it produces is the result of asking, in effect, what comes next?
That one difference, execution versus prediction, is the thing worth holding onto. It explains everything great about these tools, and everything frustrating.
What "generative" actually means
Traditional AI, the kind that powers spam filters, recommendation engines, and fraud detection, is mostly about classification and prediction. It works within a defined range of possible answers.
Generative AI is different: it creates new content rather than choosing from existing options. It doesn't retrieve a stored answer. It builds a response, token by token, that didn't exist before you asked.
The most useful plain-language description is this: generative AI is extraordinarily sophisticated autocomplete. Think of the predictive text above your phone's keyboard — watching what you type and suggesting the next word. Now imagine that same principle, but trained on terabytes of human-produced text, predicting not just the next word but word after word until a whole paragraph exists.
That's the mechanism. It's not thinking. It's not understanding. It's completing with a level of sophistication that produces outputs that can genuinely feel like thinking. This is why it can draft a design brief, generate a UI concept, or explain visual hierarchy, not because it understands those things the way you do, but because it has absorbed enough human examples of each to produce something that follows the pattern.
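If it helps to see the autocomplete idea in miniature, here is a rough sketch. It is nowhere near a real model: the training text is invented, the function names are hypothetical, and it only ever looks at the single previous word. But the loop is the same in spirit: look at what came before, pick a likely next word, repeat.

```python
from collections import Counter, defaultdict

# Toy training text, invented purely for illustration.
TRAINING_TEXT = (
    "good design is honest . "
    "good design is simple . "
    "good design is honest ."
)


def train_follower_counts(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def autocomplete(counts, start_word, max_words=5):
    """Keep appending the most frequent follower of the last word."""
    output = [start_word]
    for _ in range(max_words):
        followers = counts.get(output[-1])
        if not followers:
            break  # the toy model has never seen this word
        next_word, _count = followers.most_common(1)[0]
        output.append(next_word)
        if next_word == ".":
            break  # stop at the end of a sentence
    return " ".join(output)


counts = train_follower_counts(TRAINING_TEXT)
print(autocomplete(counts, "good"))  # prints "good design is honest ."
```

Notice that the sketch has no idea what "design" or "honest" mean. It completes the sentence that way only because that pattern appeared most often in its tiny training text.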
Why the same tool feels brilliant one day and unreliable the next
Here's the part most people don't explain, and it's the key to everything. When the model predicts the next word, it doesn't just pick one answer. It assigns a probability to every possible option and then selects from that distribution. This variability is baked into how the system works. It's called being probabilistic, and it's not a flaw. It's the design.
This is why the same prompt can produce meaningfully different results on different runs. There's no inconsistency, no bug, no bad luck. Every output is one draw from a wide distribution of possible outputs. You're not seeing errors. You're seeing probability in action.
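A small sketch can make the contrast concrete. Everything here is an assumption made for illustration: the word list, the probabilities, and the function names are invented, and a real model chooses among many thousands of options rather than four. The first function behaves like your usual software. The second behaves like a single step of a generative model.

```python
import random

# Hypothetical next-word probabilities for one prompt. The numbers are invented.
NEXT_WORD_PROBABILITIES = {
    "bold": 0.5,
    "minimal": 0.3,
    "playful": 0.15,
    "brutalist": 0.05,
}


def duplicate_layer(layers):
    """Deterministic: the same input gives the same output, every time."""
    return layers + [layers[-1]]


def sample_next_word(probabilities, rng):
    """Probabilistic: draw one word, weighted by its probability."""
    words = list(probabilities)
    weights = list(probabilities.values())
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random()  # unseeded, so each run of the script can differ

# Three identical calls, three identical results.
print([duplicate_layer(["logo"]) for _ in range(3)])

# Ten identical calls, a mix of results, with "bold" usually the most common.
print([sample_next_word(NEXT_WORD_PROBABILITIES, rng) for _ in range(10)])
```

Run it a few times and the second line changes while the first never does. Nothing is broken in either case.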
Once you internalise that, the question changes. You stop asking "how do I make it consistent?" and start asking "how do I shape the range of likely outputs so that whatever it picks lands close to what I need?" That shift, from trying to control the output to learning to shape the space, is one of the most practical mental moves you can make as a designer working with these tools.
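One way to picture "shaping the space" is with a toy. This is not how a real model works internally: the examples, the style labels, and the function name below are all invented, and the sketch simply counts matching examples. It only illustrates the general idea that the more context a prompt carries, the narrower the range of likely responses becomes.

```python
from collections import Counter

# Hypothetical past examples: (words present in the request, style of the response).
EXAMPLES = [
    ({"headline"}, "playful"),
    ({"headline"}, "formal"),
    ({"headline"}, "technical"),
    ({"headline", "bank"}, "formal"),
    ({"headline", "bank"}, "formal"),
    ({"headline", "bank"}, "playful"),
    ({"headline", "bank", "retirees"}, "formal"),
    ({"headline", "bank", "retirees"}, "formal"),
]


def style_distribution(prompt_words, examples):
    """Share of each response style among examples that contain every prompt word."""
    matching = [style for context, style in examples if prompt_words <= context]
    counts = Counter(matching)
    total = sum(counts.values())
    return {style: count / total for style, count in counts.items()}


# A vague prompt leaves three styles in play.
print(style_distribution({"headline"}, EXAMPLES))

# Added context narrows the range to two styles, then to one.
print(style_distribution({"headline", "bank"}, EXAMPLES))
print(style_distribution({"headline", "bank", "retirees"}, EXAMPLES))
```

The output is still a draw each time. What changes with a more specific prompt is how far apart the possible draws can be.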
What it's predicting from, and why that matters
When the model produces a response, it's drawing on patterns absorbed during training: vast amounts of human-produced text, images, code, and data. The result is a compressed statistical understanding of how language works, what ideas relate to what other ideas, and what kinds of responses tend to follow what kinds of prompts. Not a filing cabinet of facts. A dense web of relationships between concepts and words.
The magic: when your prompt touches well-represented patterns (creative briefs, UX principles, product copy, design critiques), the model is working in rich territory. The outputs can be genuinely impressive.
The danger: the model has no ability to distinguish between what is true and what is merely statistically likely to appear in a response like this. If a confident-sounding statistic is the most probable next token given your prompt, the model produces it regardless of whether that statistic exists in the real world.
This is what researchers call hallucination: outputs that sound completely authoritative and are entirely made up. The professional practice that follows is simple: use AI freely to generate, explore, and structure, but verify any factual claim before it leaves your hands. Names, numbers, dates, citations, and technical specifications are the highest-risk categories. The more confident the output sounds, the more worth checking.
The collaborator with no idea what it's doing
Here's a question worth sitting with: are you working with a tool, or a collaborator? The honest answer is neither, exactly, but collaborator is the more dangerous misconception.
The AI tools you use every day are what researchers call narrow AI: systems that are extraordinarily capable within a specific domain, but with no awareness, goals, or understanding outside of it. There's no curiosity. No agenda. No knowledge that you exist. When your session ends, nothing persists unless the product has been built to store it.
Understanding this doesn't make these tools less useful. It makes them more legible. You're working with an extraordinarily capable pattern-completion engine that has no stake in the outcome and no ability to tell you when it's wrong. That's actually clarifying: the judgment, the taste, the knowledge of your client and their users, all of that remains yours. Entirely yours.
The shape of the work: fast to good, slow to great
AI gets to good very fast. A first draft of a brief, an initial concept direction, a summary of a research document: the first 80% of almost any task arrives quickly, and with enough quality to be genuinely useful. This is where the magic feeling comes from.
Then it gets hard. The last 20% (the specific, contextual, opinionated finishing that turns something good into something right for this project, this client, this moment) is slow, iterative, and human-led. "Good" is statistically central, and "great" is idiosyncratic. There's no pattern in the training data for your client's exact brand voice at this point in their company's story. That's yours to bring.
None of this works if you're looking for a single prompt to deliver a finished, polished result. What actually works is treating AI as a generator of high-quality raw material and yourself as the person who knows what to do with it.
Think about the last piece of design work you were proud of. The thing that felt right in a way you could feel but couldn't fully explain.
What made it good? How much of that was generative (making something exist) versus evaluative (knowing it was right once it did)?
AI is getting faster at the first part. The second part is entirely you.





