There’s a skill emerging, and everyone’s selling it: learning to speak AI. Prompt engineering courses. Certifications. Entire communities dedicated to teaching you exactly how to phrase questions so AI understands what you’re asking for.
The premise makes sense: master these techniques, get better outputs. Fair enough.
Except we’re creating a new divide – and it’s bigger than most people realize.
The New Technical Literacy Tax
On one side: people who can invest time learning these meta-skills – how to structure prompts, add context, use specific phrasings that models respond to better.
On the other side: people who just need the answer.
A student in Nairobi trying to figure out what skills local employers want. A small business owner in Jakarta researching regional market trends. A teacher in São Paulo looking for culturally relevant lesson plan ideas. They don’t have time to become prompt engineers. They need information that applies to their situation.
But increasingly, getting useful outputs requires knowing the tricks. It’s become a whole skill set layered on top of the actual task you’re trying to accomplish.
Think about what this means practically. You’re a computing student trying to prepare for the job market. You turn to ChatGPT or Claude for guidance. But to get advice that actually matches your context – your country’s tech ecosystem, local language requirements, regional hiring practices – you first need to learn how to construct prompts that override the model’s default assumptions.
That’s not democratizing access. That’s adding a technical literacy tax.
What the Research Shows
In a recent study, we tested six major AI models – ChatGPT-4, Claude 3.5, Gemini, Llama 3, DeepSeek, and Mistral – across ten African countries, asking each about computing career requirements [1]. We used identical, well-structured prompts for every query. This wasn't about testing prompt engineering skill; we wanted to see whether the systems themselves understand context.
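For readers who want the mechanics, here is a minimal sketch of that setup: one fixed prompt template, filled in per country, sent unchanged to every model. The template wording, the two-country list, and the query_model() helper are illustrative placeholders, not the actual instrument or harness from the study.

```python
# Minimal sketch of the evaluation setup: identical prompts across models,
# varied only by country. Placeholders throughout, not the study's real code.

PROMPT_TEMPLATE = (
    "What skills, certifications, and experience does a computing graduate "
    "in {country} need to be competitive in the local job market?"
)

MODELS = ["ChatGPT-4", "Claude 3.5", "Gemini", "Llama 3", "DeepSeek", "Mistral"]
COUNTRIES = ["Morocco", "Nigeria"]  # two of the ten countries covered in the study


def query_model(model: str, prompt: str) -> str:
    """Placeholder: swap in each provider's own client call here."""
    raise NotImplementedError(f"wire up the API client for {model}")


def collect_responses() -> dict:
    """Send the identical country-specific prompt to every model.

    Across the full study this yields 6 models x 10 countries = 60 responses,
    with no model-specific prompt tuning anywhere.
    """
    responses = {}
    for country in COUNTRIES:
        prompt = PROMPT_TEMPLATE.format(country=country)
        for model in MODELS:
            responses[(model, country)] = query_model(model, prompt)
    return responses
```

The point of holding the prompt constant is that any difference in contextual awareness then has to come from the models themselves, not from how cleverly the question was phrased.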
The results were revealing.
Even with carefully crafted prompts, models consistently embedded assumptions that didn’t match local realities. In Morocco, where computing roles typically require fluency in Arabic, French, and English, most models ignored language requirements entirely. In Nigeria, they overlooked the National Youth Service Corps – a mandatory post-graduation program that directly affects employment timing.
Across all 60 responses, contextual awareness averaged just 35.4%. The models knew the technical skills but missed the local realities that actually determine career success.
Here’s what matters: the issue wasn’t prompt quality. It was what the models knew – or didn’t know – from their training data.
You can craft the perfect prompt, but if the underlying system doesn’t recognize your context exists, you’re just compensating for design gaps.
The Compensation Game
This is where the “just prompt better” advice falls apart.
When people say you need to engineer better prompts, they’re asking you to manually inject context that should already be there. You’re compensating for training data that didn’t include your reality.
Take a concrete example from our research. Nearly every model recommended cloud platforms like AWS, Azure, or Google Cloud for computing students – regardless of country. That’s standard career advice in tech, right?
Except we never asked about infrastructure recommendations. The AI just assumed reliable internet, stable power, and budget for enterprise tools. Those assumptions hold in some contexts. In others, they’re completely disconnected from what students can actually access.
Only a handful of responses mentioned local tech ecosystems, regional certifications, or nationally relevant policies. Most gave the same generic advice as if “computing career preparation” means the same thing everywhere.
No amount of clever prompting fixes this. The knowledge gap is in the training data, not the user’s phrasing.
Who This Really Hurts
The people who lose in this system are predictable.
Those who can’t spend hours learning prompt engineering. Those who assume technology should just work. Those whose contexts weren’t well-represented in the training data.
And here’s the thing – these groups overlap heavily with the people who could benefit most from accessible AI tools.
A student at a well-resourced university in a major tech hub? They probably have access to human advisors, career services, industry connections. If AI gives them generic advice, they have backup options.
A student in a resource-constrained setting without extensive support systems? They’re more likely to rely on AI tools as primary information sources. And they’re also more likely to be working in contexts that weren’t prioritized in model training.
This creates a perverse dynamic: the people with the most resources can extract maximum value from AI because they have the time, training, and technical literacy to work around its limitations. Everyone else gets outputs that may or may not apply to their situation – and may not even realize the advice doesn’t fit their context.
The Benchmark Race and What It Misses
Why are we here? Part of it is how we’re building and evaluating these systems.
The current AI development race focuses heavily on benchmark performance. Models compete on standardized tests designed to measure general capabilities. Scoring higher on these benchmarks becomes the goal.
That’s not inherently bad. Benchmarks help track progress on specific capabilities. But they miss something crucial: how well these tools actually serve diverse users in real-world contexts.
A model can score impressively on reasoning benchmarks while completely failing to recognize that computing career requirements vary significantly across countries. It can excel at language tasks while embedding assumptions about infrastructure, access, and resources that don’t match most users’ realities.
In our research, we found that open-source models (Llama: 4.47/5, DeepSeek: 4.25/5) actually outperformed proprietary alternatives (ChatGPT-4: 3.90/5, Claude: 3.46/5) on contextual awareness and skills integration [1]. Cost and benchmark performance didn’t predict which models provided more contextually appropriate guidance.
This suggests something important: the factors that make AI tools genuinely useful for diverse populations aren’t always the same factors we’re optimizing for in development.
The Pattern We’ve Seen Before
Here’s what’s particularly frustrating about this situation: we’ve been here before.
Early computers required specialized technical knowledge to operate. Then graphical user interfaces made them accessible to non-experts. The early internet required command-line literacy and technical understanding. Then browsers and user-friendly design lowered those barriers.
The historical pattern is clear: technology matures by reducing the specialized knowledge required to use it effectively.
AI seems to be moving in the opposite direction.
We’ve created tools so complex to use well that we need entire courses teaching people how to talk to them. And we’re marketing this as progress in accessibility.
What Actually Accessible AI Would Look Like
Real accessibility isn’t about teaching everyone prompt engineering. It’s about building systems that recognize diverse contexts from the start.
What would that look like practically?
Models that ask clarifying questions when context matters, rather than defaulting to assumptions. Systems that acknowledge uncertainty about local conditions instead of providing confident but generic advice. Training processes that intentionally include diverse contexts, not as edge cases but as core use cases.
Development teams that include people from the contexts these tools will serve. Evaluation frameworks that measure contextual appropriateness alongside technical performance. Design choices that prioritize serving actual users over impressing benchmarks.
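As a crude illustration of the first of those behaviours – asking for context instead of assuming it – here is a toy sketch. The keyword check, the CONTEXT_HINTS list, and the generate_answer() placeholder are all assumptions for illustration; a real system would infer missing context far more robustly than string matching.

```python
# Toy sketch: notice that a career question carries no location context and
# ask for it, instead of answering with location-blind defaults.

CONTEXT_HINTS = ("country", "city", "region", "local", "in my")


def needs_context(query: str) -> bool:
    """True when the query gives no hint about where the user is."""
    q = query.lower()
    return not any(hint in q for hint in CONTEXT_HINTS)


def generate_answer(query: str) -> str:
    """Placeholder for the underlying model call."""
    raise NotImplementedError


def respond(query: str) -> str:
    if needs_context(query):
        return (
            "Computing career requirements vary a lot by country: languages, "
            "certifications, even programs like national service. "
            "Where are you planning to work?"
        )
    return generate_answer(query)
```

Even something this crude changes the default behaviour from confident-but-generic advice to a question – which is the point.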
None of this is technically impossible. It’s a question of priorities.
The Real Question
The prompt engineering boom has created an entire ecosystem – courses, certifications, consulting, job titles. There’s nothing wrong with people building expertise or businesses around these skills.
But we should be honest about what we’re doing.
Teaching people to engineer better prompts is teaching them to compensate for design choices made during model development. It’s asking users to bridge gaps that developers left open.
That might be acceptable for specialized applications where expert users need maximum control. But for everyday uses – students seeking career guidance, professionals researching their fields, educators developing lesson plans – it’s moving backwards.
The question isn’t whether prompt engineering works. It does. The question is: why should it be necessary?
Why should a student in Manila need to learn specialized techniques just to get AI to acknowledge that Manila exists? Why should someone researching regional business practices need to manually inject context into every query?
The answer shouldn’t be “because that’s how the models work.” It should be “because that’s how we chose to build them – and we can choose differently.”
Where We Go From Here
The current trajectory isn’t inevitable. We’re making choices about how to develop, evaluate, and deploy these systems. Different choices would lead to different outcomes.
This isn’t about blaming AI companies or prompt engineers. It’s about recognizing that the path we’re on – making AI more powerful while demanding ever more specialized knowledge to use it well – creates real barriers for real people.
Those barriers show up in predictable patterns. They affect people in non-Western contexts disproportionately. They reinforce existing inequities in who can access and benefit from technological tools.
But they’re not technical limitations. They’re design choices. And design choices can change.
The goal should be AI that works well for diverse users by recognizing that context shapes what “good” guidance actually means. Not AI that works the same everywhere. Not AI that requires specialized knowledge to use. AI that serves the people using it, wherever they are.
We’ve seen technology mature by lowering barriers before. We can do it again – if we choose to prioritize it.
References
[1] P. Eze, S. Lunn, and B. Berhane, “Evaluating LLMs for career guidance: Comparative analysis of computing competency recommendations across ten African countries,” arXiv preprint arXiv:2510.18902, 2025. [Online]. Available: https://doi.org/10.48550/arXiv.2510.18902