AI is getting frighteningly good at cognition (language, knowledge work, code). It is still clumsy at embodiment (the physical world), but getting better fast. It remains inconsistent at the agency (in autonomously turning goals into outcomes). However, it is very underbuilt in the category that determines whether any of this is adoptable at scale: human fluency.

Human fluency is the umbrella term I want to put a flag in. Under that umbrella live a bunch of scattered labels people already use: emotion AI, affective computing, social intelligence, theory of mind, trust calibration, persuasion safety, personality alignment, conversation dynamics. The field is fragmented in name, fragmented in measurement, and fragmented in commercial focus, which is a problem.
The four dimensions that matter
1) Cognition. This is where the money and talent have rushed first. Not because it is the only thing that matters, but because it is the easiest to scale with data and compute, and the easiest to demo. Stanford’s 2025 AI Index reports $33.9B in private investment in generative AI in 2024. That is our proxy in the chart for “focus” on cognition.
2) Embodiment. Physical intelligence is progressing, but the economics and complexity are brutal. Robots have to deal with reality, and reality does not tolerate hallucinations. PitchBook reports that robotics startups raised $6.1B in VC funding in 2024. That is our proxy for embodied AI funding focus.
3) Human fluency. Human fluency is the ability to operate inside human social physics. It is not “being polite.” It is knowing how to land a message for a specific person given their context, intent, trust level, status sensitivity, conflict style, and emotional constraints. Here is the uncomfortable point: the category exists, but it is still treated like an edge feature. The money going into it is small relative to other buckets, and the measurement is still immature. For a proxy number, the chart uses the Emotion AI market baseline from MarketsandMarkets: $2.74B in 2024 (growing to $9.01B by 2030). That is not perfect, and I will say why in a minute. But directionally, it shows what anyone watching the ecosystem already feels: the “human layer” is still a side quest.
4) Agency. Agency is what turns capability into outcomes. Planning, persistence, tool use, error recovery, and completion without constant babysitting. The cleanest evidence that agency remains a bottleneck comes from METR’s work on long-horizon tasks. Their results show a steep drop-off: models had almost 100% success on tasks that take humans under 4 minutes, but under 10% success on tasks that take humans around 4 hours. For a proxy “focus” number, the chart uses Menlo Ventures’ estimate that the application layer captured $19B of generative AI spending in 2025, which is where most “agent” products live.
So yes, these numbers are only approximate. That is the point. It is not claiming precision. It is making a directional claim: We are funding cognition and agency heavily, funding embodiment meaningfully, and funding human fluency lightly relative to the role it plays in adoption and safety.
If you want the “stronger” version of this chart later, we can replace these proxies with a single consistent measure across buckets, like VC deal volume by category or hiring headcount by category. But the core pattern is unlikely to flip.
Why human fluency is the real adoption gate
Cognition, embodiment, and agency determine what an AI can do. Human fluency largely determines whether people will actually use it, especially in situations where stakes are social: leadership, customer support, healthcare, education, coaching, and any workflow that depends on trust and delegation.
It is not true that most AI failures are social. Many failures are straightforward: incorrect facts, brittle reasoning, and poor long-horizon reliability. But a growing share of real-world adoption friction comes from human-facing failure modes that are harder to benchmark: tone mismatch, misreading intent, overconfidence, and responses that are technically correct but socially counterproductive. These problems are not “soft.” They show up as churn, escalations, non-adoption, and refusal to delegate.
Human fluency is also one of the places where we are still arguing about measurement. A recent position paper on Theory of Mind evaluation in LLMs argues that many ToM benchmarks are “broken” because they do not test what we actually care about, namely whether models can adapt to new partners over time.
That matters because without functional human fluency, the system can be socially persuasive without being socially grounded. That is the worst combination: high language skill, weak interpersonal truth.
Why this deserves a research push now
If we keep treating human fluency as a decorative layer, we get predictable outcomes:
- More deployment failures caused by tone and trust.
- More user overreliance on systems that feel socially competent but are not stable.
- More “agent” ambition with weak interpersonal safety rails.
- Slower adoption in high-stakes domains where relationship dynamics matter.
Human fluency needs the same thing cognition got: serious measurement, serious benchmarks, serious funding, serious institutional interest.
At minimum, the field needs:
- A shared umbrella term (e.g., human fluency) so research and product work stop fragmenting into a dozen tiny subfields.
- Benchmarks that test longitudinal interaction, not single-turn social theater.
- Calibration metrics for social trust, not just “did the user like the response.”
- Clear failure-mode taxonomies for persuasion, manipulation, sycophancy, and dependence risk in human-facing systems.
- Interventions that can be layered onto existing LLMs, because waiting for “full AGI” to fix this is a cop-out.
Bottom line
We are building machines with an accelerating head, early-stage hands, an inconsistent spine, and an underbuilt heart. The funding numbers in the chart are imperfect, but they are honest enough to reveal the pattern: the category that governs trust and adoption is still the least prioritized. Human fluency deserves to be named, measured, funded, and treated as a first-class research field.





0 Comments