Law 8: The "Good Enough" Model Law - Don't chase state-of-the-art perfection; chase user-centric utility.


1. Introduction: The Lure of the Leaderboard

1.1 The Archetypal Challenge: The Kaggle Grandmasters

A startup, "TranslatePerfect," is founded by a team of Kaggle grandmasters and NLP PhDs. Their mission is to build the world's most accurate machine translation service for business documents. They download every publicly available parallel corpus and spend eighteen months in their lab, laser-focused on a single goal: beating the state-of-the-art (SOTA) BLEU score, a common academic benchmark for translation quality. They invent novel transformer architectures, fine-tune massive models, and after immense effort, they achieve their goal. Their model is, according to the benchmarks, the most accurate translation engine in the world.

They launch their product, targeting large multinational corporations. The response is lukewarm. Potential customers are impressed by the technical achievement, but they don't buy. When pressed for feedback, the customers reveal a series of practical, non-academic problems. The model is so large and computationally expensive that it's too slow for real-time use cases. Its API costs are 5x higher than those of the "less accurate" competitor. And while it excels at translating literary text like that in its training data, it struggles with the industry-specific jargon and acronyms found in customers' actual legal and technical documents. The team at TranslatePerfect had won the battle on the academic leaderboard but lost the war for the customer's wallet. They had built the "perfect" model, but not a useful product.

1.2 The Guiding Principle: Utility Trumps Accuracy

This all-too-common story, in which academic success turns into market failure, reveals a crucial, counter-intuitive law of applied AI: The "Good Enough" Model Law. It states that the commercial success of an AI product is determined not by its proximity to state-of-the-art perfection, but by its ability to deliver tangible, user-centric utility within the constraints of the real world. The goal is not to build the best possible model in a lab, but to deploy the simplest possible model that solves a real user's problem effectively and economically.

This law is a direct corollary to the Model-Market Fit Law (Law 3), but it focuses on the starting point of development. It acts as a powerful strategic filter, forcing founders to ask "What is the minimum level of performance required to be genuinely useful?" rather than "What is the maximum level of performance we can achieve?" It champions pragmatism over perfectionism, recognizing that in business, utility is the only benchmark that matters. A "good enough" model that ships today is almost always more valuable than a "perfect" model that ships next year.

1.3 Your Roadmap to Mastery

This chapter will provide a framework for developing a pragmatic, utility-focused approach to model building. By the end, you will be able to:

  • Understand: Articulate the principle of the "Minimum Viable Model" (MVM) and why the pursuit of SOTA performance is often a trap. You will grasp the concepts of the "utility curve" and the point of diminishing returns in model complexity.
  • Analyze: Use frameworks like the Utility-vs-Complexity Map to evaluate the trade-offs between model performance and real-world value, identifying the "good enough" sweet spot for any given business problem.
  • Apply: Develop a product development process that prioritizes shipping value quickly. You will learn to start with the simplest baseline, iterate based on user feedback, and resist the siren song of the academic leaderboard in favor of delivering practical, user-centric utility.

2. The Principle's Power: Multi-faceted Proof & Real-World Echoes

2.1 Answering the Opening: How "Good Enough" Resolves the Dilemma

Let's re-imagine TranslatePerfect's journey through the lens of the "Good Enough" Model Law. Instead of aiming for the SOTA BLEU score, their goal would have been to build a "Minimum Viable Model" that could solve a specific, high-value problem for a niche set of initial customers.

They might have targeted law firms specializing in international contracts. Their first step would not be to build a massive general model, but to interview lawyers to understand their true pain points. They would learn that the lawyers don't need a perfect, literary translation; they need a translation that is (a) extremely fast, (b) handles legal jargon correctly, and (c) allows for easy comparison with the original document.

Armed with this insight, the team would build a much smaller, more specialized model. They would fine-tune a fast, efficient open-source model on a small, high-quality dataset of legal contracts. Their resulting model might have a lower overall BLEU score than their original creation, but for the specific task of translating legal documents, it would be vastly superior. It would be faster, cheaper, and more accurate on the vocabulary that actually mattered to the user. They would wrap this "good enough" model in a full-stack application (Law 6) with a user interface designed for lawyers, perhaps showing the original and translated text side-by-side. This product, while less impressive on an academic leaderboard, would be incredibly useful and would find a ready market. They would have prioritized utility over abstract accuracy.

2.2 Cross-Domain Scan: Three Quick-Look Exemplars

The most successful applied AI products often start with surprisingly simple models that are "good enough" to be useful.

  1. Email (Google's Smart Reply): When Google first launched Smart Reply, the suggested responses ("Sounds good!", "I'm on it", "No, I can't") were not generated by a massive, SOTA language model; they came from a relatively simple system based on recurrent neural networks. It wasn't perfect, and it couldn't handle complex queries. But it was "good enough" to handle a large fraction of common email replies, saving users a few seconds hundreds of times a day. The utility was enormous, even if the model was far from perfect.
  2. Spelling & Grammar (Grammarly): The first versions of Grammarly were not powered by massive, GPT-3-class models. They were based on more traditional, rule-based, and statistical NLP techniques. They couldn't write beautiful prose, but they were "good enough" at catching common, embarrassing errors to provide immense value to millions of users. They solved the core, high-utility problem first and then incrementally improved their models over a decade, funded by the revenue from their "good enough" solution.
  3. Customer Support (Early Chatbots): The first generation of customer support chatbots was widely mocked for its simplicity. These were often simple decision-tree systems that could answer only a very narrow set of FAQ-style questions, and they failed constantly. But for the small percentage of queries they could handle (e.g., "What is my account balance?"), they were "good enough." They provided an instant answer to a simple question, deflecting a small but significant volume of calls from expensive human agents. Their utility lay in cost savings, not conversational brilliance.

2.3 Posing the Core Question: Why Is It So Potent?

In email, writing, and customer support, the initial winners were not those with the most advanced AI, but those who found a way to deliver real utility with the simplest possible technology. They proved that a "good enough" model in the hands of users is infinitely more valuable than a perfect model in the lab. This recurring pattern raises the crucial question: What are the deep economic and strategic forces that make the "good enough" principle such a powerful and winning strategy in applied AI?

3. Theoretical Foundations of the Core Principle

3.1 Deconstructing the Principle: Definition & Key Components

The "Good Enough" Model Law is a strategic principle that advocates for prioritizing the rapid deployment of a Minimum Viable Model (MVM) that delivers core user utility, rather than pursuing state-of-the-art performance in a lab environment.

It is built on a pragmatic understanding of three core concepts:

  1. Minimum Viable Model (MVM): This is the AI equivalent of the Minimum Viable Product. It is the model with the least complexity (and therefore, the lowest cost and fastest development time) that can still solve the core problem for an initial set of users to a degree that they find tangibly valuable. It is defined by its utility, not its accuracy benchmarks.
  2. The Utility Curve: This is a conceptual mapping of model performance (e.g., accuracy) to the actual, perceived value delivered to the end-user. This curve is almost never linear. It often follows an "S" shape. There is a "no-go" zone where performance is too low to be useful. Then there is a "high-value" zone where modest improvements in performance lead to large gains in utility. Finally, there is a "plateau of diminishing returns," where large, expensive improvements in performance yield almost no additional user value. The MVM lives at the beginning of the high-value zone.
  3. The Cost-Complexity Curve: This is the mapping of model performance to the resources (time, data, compute, talent) required to achieve it. This curve is often exponential. Going from 80% to 90% accuracy might take a month. Going from 90% to 95% might take six months. Going from 95% to 99% might be a multi-year, multi-million dollar research project.

The "good enough" law states that the optimal strategy is to find the "sweet spot" where the utility curve is high and the cost-complexity curve is still low.

3.2 The River of Thought: Evolution & Foundational Insights

This principle is a direct application of time-tested engineering and business wisdom to the specific economics of AI development.

  • Pareto Principle (The 80/20 Rule): The Pareto Principle states that for many events, roughly 80% of the effects come from 20% of the causes. In AI product development, this means that you can often deliver 80% of the potential user value with the first 20% of the development effort. The "good enough" model is the one that captures that first 80% of value. Chasing the final 20% of value often requires 80% of the total effort and resources, a strategically poor trade-off in a fast-moving market.
  • "Worse is Better" (Richard P. Gabriel): In a famous essay, software developer Richard Gabriel argued that software quality does not necessarily increase with the number of features, and that a simple, "worse" design can often be preferable if it is easier to use and deploy. He contrasted complex, "do everything right" systems with simpler systems that do one thing and are easy to adopt. The "good enough" model is a "worse is better" strategy. It prioritizes speed of implementation, simplicity, and early deployment over completeness and "correctness" in the academic sense.
  • The Kano Model: This model of product development and customer satisfaction classifies features into different categories. "Basic" features are expected and must be present. "Performance" features are those where "more is better." "Delight" features are unexpected and provide disproportionate value. A "good enough" model focuses on nailing the "basic" and "performance" features up to the point of diminishing utility. It avoids the trap of "gold-plating" the model by chasing performance improvements that the user will not notice or value.
  • Time Value of Money (and Data): A core principle in finance is that a dollar today is worth more than a dollar tomorrow. The same is true for data and learning in AI. A "good enough" model shipped today starts the data flywheel (Law 2) spinning today. It begins generating proprietary data, user feedback, and market learnings today. The compounded value of that learning over a year is often far greater than the value of a "perfect" model that only starts learning a year from now. The "good enough" approach maximizes the time value of learning.
  • Opportunity Cost: Every month your team spends in the lab trying to eke out an extra 0.5% of accuracy is a month they are not spending on other valuable activities: talking to customers, building a better user workflow, or working on a new feature. The opportunity cost of chasing SOTA perfection is enormous. The "good enough" law forces a rational allocation of resources, prioritizing activities that deliver the most user value for the least amount of effort.

4. Analytical Framework & Mechanisms

4.1 The Cognitive Lens: The Utility-vs-Complexity Map

A simple but powerful way to frame this strategic choice is the Utility-vs-Complexity Map. This is a 2-axis graph:

  • Y-Axis: User Utility: How much tangible value does the model provide to the end-user? This can be measured in time saved, revenue gained, cost reduced, or simple delight.
  • X-Axis: Model Complexity: What is the cost to build and operate the model? This is a proxy for development time, data requirements, and inference cost.

On this map, you can plot different model architectures or approaches:

  • Simple Baselines (e.g., logistic regression, simple heuristics) sit in the bottom-left (low complexity, low utility).
  • Massive, SOTA Models sit in the top-right (high complexity, high utility).
  • The "Good Enough" Model sits at the "sweet spot" in the top-left of the viable zone—the point that delivers the highest possible utility for the lowest possible complexity.

The goal of an AI product team should be to actively resist the gravitational pull toward the right side of the map (increasing complexity) and instead constantly seek to find and deploy the model that sits at the "good enough" elbow of the curve.
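
As a sketch of how a team might use this map, the snippet below scores a few hypothetical approaches (the names and numbers are invented) and picks the one with the best utility per unit of complexity among those that clear an assumed minimum-utility bar.

```python
# Hypothetical candidates on the Utility-vs-Complexity Map;
# utility and complexity scores are illustrative, not measured.
candidates = [
    {"name": "keyword heuristic",      "utility": 40, "complexity": 1},
    {"name": "logistic regression",    "utility": 62, "complexity": 4},
    {"name": "fine-tuned small model", "utility": 80, "complexity": 12},
    {"name": "SOTA foundation model",  "utility": 88, "complexity": 90},
]

UTILITY_BAR = 60  # assumed adoption threshold (the "viable zone")

# The "good enough" elbow: the best utility per unit of complexity
# among viable candidates -- the top-left of the map.
viable = [m for m in candidates if m["utility"] >= UTILITY_BAR]
best = max(viable, key=lambda m: m["utility"] / m["complexity"])
print(best["name"])  # -> logistic regression
```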

4.2 The Power Engine: Deep Dive into Mechanisms

Why is deliberately aiming for "good enough" so strategically powerful?

  • The Speed-to-Market Mechanism: Simpler models can be built and deployed exponentially faster than complex ones. This speed allows a company to enter the market, establish a beachhead, start capturing user data, and begin iterating while slower-moving, perfectionist competitors are still in the lab. In many emerging AI markets, the first mover who is "good enough" can build an insurmountable lead before the "perfect" solution ever ships.
  • The Resource Allocation Mechanism: Chasing the last few points of accuracy is incredibly expensive. It requires more data, more compute, and more rare, high-cost talent. By consciously choosing a "good enough" performance target, a company frees up its most valuable resources—its people and its capital—to be deployed on other parts of the business, such as building a better full-stack application (Law 6) or a more effective go-to-market strategy (Law 9).
  • The Iteration & Learning Mechanism: A simple model is easier to understand, debug, and iterate on. When a simple model makes a mistake, it is often easier to diagnose the cause and fix it. This creates a tighter, faster feedback loop, which directly increases Experimentation Velocity (Law 7). A complex, black-box model can be a mystery even to its creators, making iteration slow and painful. Simplicity enables speed of learning.

4.3 Visualizing the Idea: The "Good Enough" Bar

A powerful mental model is to visualize a "bar" of user utility for a given problem.

  • The Bar: This represents the threshold of performance at which your solution becomes genuinely useful and customers are willing to adopt it.
  • Your Job: Your only job is to clear that bar. You don't get extra points for clearing it by a mile. You just need to get over it.
  • The Approach: Start with the simplest possible approach (a low-effort "hop"). Does it clear the bar? If not, incrementally add complexity (a "running start," then a "pole vault") until you just clear the bar. Then stop.

This mindset shifts the goal from "How high can we possibly jump?" to "What is the absolute easiest way to clear this specific bar?" It's a fundamental re-framing from a maximization problem to a "satisficing" problem.
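
Expressed as code, satisficing is just an early-exit loop. The skeleton below assumes approaches are ordered by increasing effort, with evaluate() standing in for whatever offline or live test measures real user utility.

```python
def first_over_the_bar(approaches, evaluate, bar):
    """Ship the first approach that clears the utility bar, then stop.

    `approaches` must be ordered from simplest to most complex;
    `evaluate` is a placeholder for your own utility measurement.
    """
    for approach in approaches:
        if evaluate(approach) >= bar:
            return approach  # cleared the bar; no extra points for more
    return None              # nothing viable yet; rethink the problem

# Toy usage: effort-ordered approaches with assumed utility scores.
scores = {"heuristic": 35, "small model": 72, "SOTA model": 91}
print(first_over_the_bar(list(scores), scores.get, bar=60))  # -> small model
```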

5. Exemplar Studies: Depth & Breadth

5.1 Forensic Analysis: The Flagship Exemplar Study - The First iPhone Keyboard

  • Background & The Challenge: In 2007, the dominant smartphone paradigm involved physical keyboards (BlackBerry, Palm). Apple's decision to launch the iPhone with a purely software-based keyboard was a massive risk. The core problem was how to make typing on a small glass screen a usable, non-miserable experience.
  • "The Principle's" Application & Key Decisions: The team, led by Ken Kocienda, did not try to build a perfect, SOTA-level predictive text engine from day one. They started with a very simple, "good enough" model. The core innovation was a surprisingly simple autocorrect algorithm that dynamically changed the size of the invisible touch targets around each key based on the letters already typed. If you typed "t-h", the target for "e" would get bigger, and the target for "z" would get smaller.
  • Implementation Process & Specifics: This was not a complex language model; it was a clever application of simple probability. It didn't try to predict whole words, just the most likely next letter. It was computationally cheap, extremely fast, and "good enough" to transform the experience from impossible to merely adequate for most users. It was a Minimum Viable Model. Only in later years did Apple layer on more sophisticated prediction and swipe-based typing models. (A toy sketch of the target-resizing idea follows this list.)
  • Results & Impact: The software keyboard was a key enabler of the iPhone's revolutionary design and success. By shipping a simple, fast, and "good enough" solution, Apple was able to launch a paradigm-shifting product. A more "perfect" but slower or buggier keyboard AI would have sunk the entire enterprise.
  • Key Success Factors: Focus on the core utility: The goal was not perfect prediction, but reducing frustrating errors. Simplicity and Speed: The model was simple enough to run instantly on the limited mobile hardware of the day. Iterative Improvement: The keyboard has improved incrementally for over a decade, funded by the success of the initial "good enough" version.
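
Below is a toy sketch of the dynamic touch-target idea described above. It is not Apple's actual algorithm; the probability table and scaling constants are invented for illustration.

```python
# Assumed next-letter probabilities after a typed prefix (toy values).
NEXT_LETTER_PROB = {
    "th": {"e": 0.55, "a": 0.15, "i": 0.10, "z": 0.001},
}

BASE_RADIUS = 1.0  # nominal invisible hit radius of a key (arbitrary units)

def hit_radius(prefix: str, key: str) -> float:
    """Grow the invisible touch targets of likely next letters and
    shrink the targets of unlikely ones."""
    p = NEXT_LETTER_PROB.get(prefix, {}).get(key, 0.02)
    return BASE_RADIUS * (0.5 + 2.0 * p)

print(hit_radius("th", "e"))  # ~1.60: easy to hit after "th"
print(hit_radius("th", "z"))  # ~0.50: nearly impossible after "th"
```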

5.2 Multiple Perspectives: The Comparative Exemplar Matrix

  • Success: Otter.ai
    • Background: People need a way to transcribe meetings and interviews. Perfect, human-level transcription is extremely difficult.
    • AI Application & Fit: Otter.ai launched a transcription service that was not perfect. It made mistakes with names, jargon, and crosstalk. But it was "good enough." It provided a searchable, time-stamped text that was ~90% accurate, allowing users to quickly find key moments in a conversation.
    • Outcome & Learning: Huge adoption. Users don't need a perfect transcript; they need a searchable and skimmable one. Otter prioritized speed and cost-effectiveness over perfect accuracy, and delivered immense utility.
  • Warning: A "Perfect" Self-Driving Car
    • Background: A car company aims to solve Level 5 autonomous driving (works anywhere, anytime) before shipping anything.
    • AI Application & Fit: They spend a decade and billions of dollars in R&D, refusing to release a product until it is "perfect." They are chasing the last 1% of edge cases, which is exponentially harder than the first 99%. They are stuck in a research phase, with no real-world data flywheel.
    • Outcome & Learning: Competitors like Tesla, who shipped a "good enough" Level 2 driver-assist system years ago, have collected billions of miles of real-world data, allowing them to iterate and improve much faster. The "perfect" solution may never ship, or if it does, it will be years behind the "good enough" solution that learned from the real world.
  • Unconventional: A Simple Fraud Heuristic
    • Background: An early e-commerce site is plagued by fraud. They don't have an AI team.
    • AI Application & Fit: An engineer writes a simple, rule-based "model": if (shipping_address_country != billing_address_country) and (order_value > $500) then flag_for_review.
    • Outcome & Learning: This simple heuristic is far from perfect, but it's "good enough." It likely catches a significant chunk of the most obvious fraud, providing immediate value at virtually zero development or computational cost. This is the ultimate MVM.
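
For completeness, here is that fraud heuristic as a few lines of runnable Python. The field names are assumptions; the logic is exactly the rule in the matrix above.

```python
def flag_for_review(order: dict) -> bool:
    """Flag orders whose shipping and billing countries differ
    and whose value exceeds $500 -- the entire "model"."""
    return (
        order["shipping_address_country"] != order["billing_address_country"]
        and order["order_value"] > 500
    )

# Toy order with assumed field names.
print(flag_for_review({
    "shipping_address_country": "DE",
    "billing_address_country": "US",
    "order_value": 750,
}))  # -> True
```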

6. Practical Guidance & Future Outlook

6.1 The Practitioner's Toolkit: Checklists & Processes

The "Start with a Heuristic" Process: 1. Define the Job: Clearly state the user problem. (e.g., "Help me find relevant documents.") 2. Brainstorm the Dumbest Thing That Could Possibly Work: What is the absolute simplest heuristic or rule? (e.g., "Return documents that contain the exact search keywords.") 3. Ship It: Deploy this simple baseline. It is your MVM. 4. Measure Its Utility: Where does it succeed? Where does it fail spectacularly? Gather user feedback and real-world data. 5. Beat the Heuristic: Now, your data science team has a clear goal: build a machine learning model that delivers a better result than the simple, "dumb" heuristic. This provides a clear benchmark for what "good enough" means.

The "Complexity Budget": - Treat complexity like money. Give your team a "complexity budget" for a new feature. - For example: "We need a recommendation feature. You have two weeks of development time and it must run for less than $0.001 per call." - This forces the team to find the most valuable solution within those constraints, preventing them from defaulting to a massive, expensive SOTA model. It frames the problem in terms of business value, not academic benchmarks.

6.2 Roadblocks Ahead: Risks & Mitigation

  1. The "Researcher's Ego" Trap: Brilliant data scientists are often trained and incentivized in academia to push the boundaries of SOTA. It can be culturally difficult to convince them to ship a "simple" model.
    • Mitigation: Re-align incentives. Reward teams for shipping user value and moving business metrics, not for publishing papers or topping leaderboards. Celebrate the business impact of a simple model more than the technical elegance of a complex one.
  2. The "Technical Debt" Concern: A "good enough" approach can be misread as an excuse to write sloppy, un-maintainable code.
    • Mitigation: Differentiate between "simple" and "sloppy." A good enough model should be built on a clean, composable, and well-engineered foundation (Law 5). The simplicity is in the model's logic, not in the quality of the surrounding code.
  3. Defining "Good Enough" is Hard: The utility bar is not always obvious and can be subjective.
    • Mitigation: This is why a rapid experimentation velocity (Law 7) is so crucial. The best way to find the bar is to run live tests. Ship the simple version and measure user behavior and satisfaction. The data will tell you if you've cleared the bar or not.

The "Good Enough" principle will become even more important as AI progresses.

  • The Commoditization of SOTA: As massive foundation models (from Google, OpenAI, etc.) become easily accessible via API, "SOTA performance" on many general tasks will become a commodity. A startup's value will not be in trying to beat these models, but in fine-tuning smaller, cheaper, "good enough" versions on proprietary data for a specific, vertical task.
  • The Rise of "Small AI": The industry is beginning to shift its focus from ever-larger models to smaller, more efficient models that can run on-device or at the edge. These "Small Language Models" or "edge models" are the epitome of the "good enough" law—they trade peak performance for massive gains in speed, privacy, and cost-effectiveness.
  • AI as a "Good Enough" Creator: Generative AI is often not about creating a perfect, final product, but about creating a "good enough" first draft. An LLM that writes a "good enough" marketing email or a diffusion model that creates a "good enough" stock photo provides immense utility by saving the human user from the friction of starting with a blank page.

The future of applied AI is not a quest for god-like, perfect intelligence. It is a pragmatic search for useful, affordable, and reliable tools that solve real human problems. And for that, "good enough" is almost always good enough.

6.4 Echoes of the Mind: Chapter Summary & Deep Inquiry

Chapter Summary:

  • The "Good Enough" Model Law advises prioritizing user-centric utility over state-of-the-art technical perfection.
  • The goal is to find the Minimum Viable Model (MVM)—the simplest model that clears the bar of user utility.
  • Chasing the last few percentage points of accuracy often comes at an exponential cost in time and resources, with diminishing returns in user value.
  • A "good enough" model that ships today allows you to start the data flywheel and learning loop immediately, which is a massive strategic advantage.
  • Use frameworks like the Utility-vs-Complexity Map and the "Start with a Heuristic" process to maintain a pragmatic, value-driven approach to model development.

Discussion Questions:

  1. Think of an AI feature you use regularly (e.g., a recommendation algorithm, a voice assistant, a generative AI). Is it "perfect"? Where does it fail? Why is it still "good enough" for you to continue using it?
  2. The text argues against chasing academic benchmarks like BLEU scores. Are such benchmarks ever useful in a commercial context? If so, when and how should they be used?
  3. How does the "Good Enough" Model Law interact with the Human-in-the-Loop Law (Law 4)? How can a human expert help a "good enough" model deliver a "perfect" outcome for the end-user?
  4. Imagine you are a product manager and your data science team is adamant that they need another six months to build a "more accurate" model. How would you frame the argument to convince them to ship a simpler version sooner? What data would you use to make your case?
  5. Is there a danger that a market saturated with "good enough" products will lead to a world of mediocre, unreliable AI? What forces, if any, will push companies to move beyond "good enough" over time?