A school district has 5,000 students. They license an AI product at $20 per student per year — typical pricing for K–12 ed-tech. Total annual revenue to the vendor: $100,000.

If those students each interact with the AI for 100 minutes per month at modern API rates, the vendor’s inference cost alone — paid to whichever AI provider runs the underlying model — is somewhere between $50,000 and $150,000 per year. Depending on the specific usage pattern and provider, the vendor might be losing money on the contract before they account for their own salaries, hosting, support, sales, and the cost of acquiring the contract in the first place.
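The numbers above can be sanity-checked with a back-of-envelope model. Everything here is an illustrative assumption rather than a vendor quote: roughly 2,000 tokens consumed per interactive minute (prompt plus completion, including conversation history re-sent each turn), and a blended API rate somewhere between $4 and $12 per million tokens.

```python
# Back-of-envelope model of the district contract described above.
# All inputs are illustrative assumptions, not vendor quotes.

students = 5_000
minutes_per_student_per_month = 100
months = 12

# Assumption: each interactive minute consumes ~2,000 tokens in total
# (prompt + completion, including conversation history re-sent per turn).
tokens_per_minute = 2_000

# Assumption: blended API price range, dollars per million tokens.
price_low, price_high = 4.0, 12.0

annual_minutes = students * minutes_per_student_per_month * months
annual_tokens = annual_minutes * tokens_per_minute  # 12 billion tokens/year

cost_low = annual_tokens / 1_000_000 * price_low
cost_high = annual_tokens / 1_000_000 * price_high

revenue = students * 20  # $20 per student per year

print(f"annual tokens:   {annual_tokens:,}")
print(f"inference cost:  ${cost_low:,.0f} - ${cost_high:,.0f}")
print(f"annual revenue:  ${revenue:,}")
```

Under these assumptions the inference bill lands between roughly $48,000 and $144,000 against $100,000 of revenue, which is the squeeze the rest of this piece is about. Change the token or price assumptions and the band moves, but not by enough to change the conclusion.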

These are the economics most education AI startups are operating under right now, and these are the economics that will kill most of them.

The fundamental mismatch

Consumer AI products built on top of frontier APIs work financially because of two things: pricing power and selective usage. ChatGPT Plus at $20/month assumes most subscribers use it lightly. The heavy users are subsidized by the casual ones. The casual ones are willing to pay $20 because the value at peak moments justifies the monthly fee.

Neither of these dynamics applies to AI sold into K–12 schools.

Schools don’t have pricing power flexibility. Per-student annual licensing in K–12 typically ranges from $5 to $50 depending on the product category. Above $50, the budget conversation becomes politically difficult. Above $100, it requires special board approval. There is no path to charging consumer-AI prices ($240/year per seat for ChatGPT Plus) for a product sold per-student to a public school.

Schools also don’t have selective usage. If you license a product for 5,000 students, and the product is genuinely good, you want all 5,000 students to use it. The whole point of an institutional license is broad deployment. There’s no equivalent of the “casual subscriber subsidizes the power user” dynamic — every student is a power user, by design.

These two facts together mean that any AI product priced at K–12-typical rates and built on top of a frontier API has unit economics that get worse as adoption goes up. The contract that proves the product works is also the contract that proves the company can’t survive.
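The adoption problem is easy to see numerically. A minimal sketch, reusing the same illustrative assumptions as before ($20 per student in revenue, and roughly $19.20 per year of inference cost for each student who actually uses the product at 100 minutes per month):

```python
# Sketch: gross margin as adoption rises, under illustrative assumptions.
# cost_per_active_student is the assumed annual inference cost for one
# student using the product 100 min/month at a mid-range blended rate.

price_per_student = 20.0
cost_per_active_student = 19.20

students = 5_000
revenue = students * price_per_student  # the district pays for every seat

for adoption in (0.25, 0.50, 0.75, 1.00):
    cost = students * adoption * cost_per_active_student
    margin = (revenue - cost) / revenue
    print(f"adoption {adoption:>4.0%}: gross margin {margin:6.1%}")
```

Margin falls from roughly 76% at quarter adoption to roughly 4% at full adoption, before any salaries, support, or sales costs. Success at deployment is what destroys the margin.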

What this rules out

Three categories of business model become very difficult under these constraints.

Pure API wrappers. A startup that takes ChatGPT or Claude or Gemini and puts a school-friendly interface on top has no way to escape the API costs. They might price aggressively to win contracts, then hope to renegotiate once they reach scale. This rarely works. The AI provider’s pricing is set by the AI provider; the school’s budget is set by the school. The startup is squeezed in the middle until they fail or get acquired by the AI provider for the price of their customer list.

Per-student pricing with unlimited usage. Even if the product owns its inference stack, charging a fixed per-student fee for unlimited usage creates a perverse incentive: the more value the product delivers, the worse the unit economics get. This is solvable through usage caps, but caps create user-experience friction that schools dislike.

Free for schools, monetize elsewhere. This works for some categories but not for ones requiring sustained engineering investment. Khan Academy can sustain a free model because they have non-profit funding and a brand built over decades. A startup trying to follow this path with VC money will run out of runway before the monetization plan materializes.

What this favors

Three categories of business model survive better.

Vendors who own their inference infrastructure. When the inference cost is largely fixed (the GPU is paid for whether it runs hot or cold), additional usage costs little. The unit economics improve at scale instead of degrading. This is the structural argument for sovereign AI infrastructure that I’ve made elsewhere — but it’s worth noting that it’s also a structural argument for survival in this market.
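The fixed-versus-variable distinction can be sketched in a few lines. Both numbers below are assumptions for illustration: an owned inference stack costing around $120,000 per year all-in (GPUs, hosting, operations), against a blended per-token API rate of $8 per million tokens.

```python
# Sketch: owned inference (fixed annual cost) vs. per-token API billing
# as usage grows. Both cost figures are illustrative assumptions.

fixed_infra_cost = 120_000.0  # assumption: annual cost of owned GPUs + ops
api_price_per_m = 8.0         # assumption: blended $ per million tokens

for annual_tokens_b in (5, 15, 30, 60):  # billions of tokens per year
    # 1 billion tokens = 1,000 "millions" of tokens
    api_cost = annual_tokens_b * 1_000 * api_price_per_m
    cheaper = "owned" if fixed_infra_cost < api_cost else "api"
    print(f"{annual_tokens_b:>3}B tokens/yr: API ${api_cost:>9,.0f}  "
          f"owned ${fixed_infra_cost:>9,.0f}  -> {cheaper}")
```

Under these assumptions the lines cross around 15 billion tokens per year: below that, renting is cheaper; above it, every additional token widens the owned stack's advantage (up to the hardware's capacity, at which point the fixed cost steps up again).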

Vendors with high-value, low-volume usage patterns. AI products that are used intensively but selectively — say, a tool that teachers use for lesson planning two hours per week rather than students using a tutor an hour per day — have much better economics. The total inference burden is smaller, even at full adoption.

Vendors with multi-product portfolios that share infrastructure. A vendor who runs three products on the same inference stack amortizes the fixed costs across all three. A single-product company has to recover all the costs from one revenue line.

What this means for schools choosing vendors

If you’re a district administrator evaluating an AI product, the unit economics question is worth asking directly. Not because you care about the vendor’s profit margins, but because vendors with broken unit economics are a procurement risk. They will either raise prices substantially at renewal, get acquired and discontinued, or simply shut down, leaving you to migrate off the product or do without it.

Three diligence questions worth asking:

What are the underlying inference costs for typical usage at your district’s scale? If the vendor can’t or won’t answer this, they’re either operating on an unsustainable model or they don’t understand their own economics.

What infrastructure does the inference run on, and is it the vendor’s own or a third party’s? Vendors who control their own stack have more cost flexibility. Vendors who pay per-token to a third party have less.

What’s the renewal pricing trajectory likely to look like? A startup losing money on its first contracts is going to need to raise prices or change products to survive. Better to know now.

What this means for vendors

If you’re building AI for education and your unit economics don’t work, you have three honest paths.

You can change pricing. Move to a usage-based model, a lower-functionality free tier with a paid tier above, or a per-school flat fee instead of per-student. Each of these has trade-offs but each can produce sustainable economics for some product categories.

You can change architecture. Move from third-party APIs to owned inference. This is a significant engineering investment, but it’s how inference cost flips from a cost trap into a cost moat.

You can change what you sell. The most AI-intensive product categories (tutoring, conversational interfaces) have the worst unit economics. Lower-intensity categories (analytics, content generation, lesson planning) have better economics for the same revenue. Choosing where to play in the AI value chain is a strategic decision that determines whether you can survive.

What you can’t do, profitably, is wrap a frontier API in a school-friendly UI and sell it at K–12 prices. The math doesn’t work. It’s working today, for some companies, because they’re spending venture capital on the difference. The venture capital will eventually run out.

A note on what this isn’t

I want to be careful here. This isn’t an argument that AI for schools is impossible. It’s an argument that the business model is harder than it appears, and that most current entrants are going to fail because their economics are wrong.

The opportunity is real. Schools desperately want AI tools that work for their context. The willingness to pay is genuine. The market is large. The question is which vendors structure themselves to capture the opportunity in a way they can sustain.

The next few years are going to thin the field considerably. The startups that survive will share certain structural features: owned infrastructure, careful usage design, multi-product portfolios that share costs, or pricing models that align cost with revenue. The startups that don’t will be remembered the same way the wrapper graveyard of 2023 is remembered now.

Choosing your structural features now, before you’ve signed contracts that lock you into bad ones, is one of the most important decisions a founder in this space can make.

Next week: I want to write about something more specific — a piece about why the standards-alignment problem is much harder than it looks, and why most AI products that claim “TEKS aligned” or “Common Core aligned” are doing something quite a bit weaker than the claim implies.