“The greatest obstacle to discovery is not ignorance, it is the illusion of knowledge.”
— Daniel Boorstin
The AI revolution is remarkable. What is possible today would, until recently, have been considered science fiction. Yet for all their potential, large language models often deliver only the illusion of understanding. That illusion can be dangerous: if we think we already know the answer, we stop asking the question, or even dismiss the truth when we see it.
I know some of the brightest minds in the world are working in this field, and I hold them in the highest respect. All that said, this article will explore some of the significant flaws and limitations of the technology.
LLMs can generate statements that sound correct and are very convincing, but that are simply false. In the 2023 Mata v. Avianca case, for example, a lawyer was sanctioned for citing non-existent case law generated by ChatGPT.
I can attest to this personally; I’ve lost count of how many times I’ve asked an LLM for the source supporting a claim, been given chapter and verse, only to check for myself and find nothing there.
The confidence with which these “hallucinations” are delivered exacerbates the issue, as LLMs rarely express doubt or uncertainty.
This is perhaps the most pressing challenge faced by LLM developers, with billions spent on research to reduce their prevalence.
The quality of LLM outputs depends on the quality of their training data. If trained on biased, flawed, or incomplete datasets, LLMs will produce skewed or misleading results.
The proprietary nature of many LLMs obscures which sources they rely on, how they process data, and why they prioritize some ideas over others. Without the ability to see how the pie is made, it is difficult to evaluate their credibility and verify information.
This opacity compounds the erosion of trust already caused by hallucinations and flawed training data.
People often don’t know what they don’t know. LLMs rely heavily on being asked the right questions and do not supply critical information unsolicited.
A user may be driving toward a metaphorical or literal cliff, but the LLM will not volunteer that critical warning unless asked.
Emily M. Bender and others have described LLMs as “stochastic parrots”. The term “stochastic” derives from a Greek word meaning “to guess.” These models generate guesses based on statistical patterns rather than genuine understanding.
This results in human-like language that lacks true comprehension, intent, or real-world grounding. Without intuitive common sense, they can produce absurd or nonsensical answers to simple questions.
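To make “stochastic” concrete, here is a toy sketch of next-token sampling, the mechanism these critiques point to. The vocabulary and probabilities below are invented purely for illustration; real models sample from learned distributions over tens of thousands of tokens.

```python
import random

# Toy next-word distribution, invented for this illustration only.
# A real LLM learns probabilities like these from statistical patterns
# across billions of sentences.
next_word_probs = {"blue": 0.55, "cloudy": 0.25, "falling": 0.15, "jealous": 0.05}

def sample_next(probs):
    # A weighted random guess: statistics, not understanding.
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

print("The sky is", sample_next(next_word_probs))
```

Nothing in this loop knows what a sky is; it only knows which words tended to follow which in its training data.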
LLMs are optimized to deliver responses that align with user expectations and to be helpful. In practice this can mean simply telling people what they want to hear. This tendency perpetuates echo chambers, reinforcing existing beliefs rather than challenging them.
LLMs generate static responses based on pre-trained data, unable to learn or adapt from user engagement. This fixed knowledge can perpetuate errors or biases, as models don’t update with new interactions.
An LLM may profusely apologise for an error, then repeat that same error in the very next query. Debating an LLM may be cathartic, or useful personal practice, but no one should expect to actually change its mind.
Developer choices about training data, prompt engineering, or safety guardrails can introduce biases that LLMs amplify. For example, in 2024, Google’s Gemini faced criticism when its image generation function, designed to promote diverse ethnic representation, unintentionally produced historically inaccurate images such as Black Nazis or a Black George Washington.
Despite conversational fluency, LLMs lack true empathy or social awareness. At best they simulate it through the statistically appropriate response. Without humanity they may misjudge tone or provide inappropriate responses in sensitive situations.
Powering and cooling the datacentres required for LLM computations demands significant energy infrastructure; a large facility may have the same energy requirements as a small city.
LLMs struggle to maintain consistent context in long or complex conversations, often losing track of earlier details or misinterpreting user intent. This leads to incoherent or contradictory responses, particularly in longer dialogues.
LLMs may struggle to provide accurate, up-to-date information. Even with search capabilities, they can misinterpret or cherry-pick data, leading to outdated, misleading, or incomplete responses.
LLMs have broad general knowledge but often fall short in specialized subjects, lacking the depth of human experts. Their vast training data covers diverse topics, yet they struggle to accurately parse complex or niche information, risking errors like hallucinations.
Hopes that later generations of LLMs will address these problems are fading. Scaling large language models demands orders of magnitude more compute and energy for diminishing performance gains, with studies suggesting that a 10x increase in compute might yield only 10–20% better results on language tasks (Kaplan et al., 2020; Patterson et al., 2021).
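The arithmetic behind that diminishing-returns claim can be sketched from the power law reported by Kaplan et al. (2020), in which test loss falls roughly as compute raised to the power −0.05. The snippet below only illustrates that relationship; the exact exponent varies across studies and metrics.

```python
# Diminishing returns under a Kaplan-style scaling law: loss ~ compute^(-alpha).
# alpha = 0.05 is the compute exponent reported by Kaplan et al. (2020);
# treat the output as a rough illustration, not a benchmark result.
alpha = 0.05

for factor in (10, 100, 1000):
    relative_loss = factor ** (-alpha)        # loss relative to the baseline
    improvement = (1 - relative_loss) * 100   # percentage reduction in loss
    print(f"{factor:>5}x compute -> loss falls by ~{improvement:.0f}%")
```

Run as written, it prints roughly 11%, 21% and 29%: each tenfold jump in compute buys less than the one before.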
The standard disclaimer:
“While I strive for accuracy, I can sometimes generate incorrect, outdated, or misleading information. Always verify critical facts—especially for health, legal, financial, or safety-related topics—using trusted sources or professional advice.”
This protects the companies behind LLMs from liability. If you act on bad information, that’s on you. The system bears no responsibility.
This is not meant to be an exhaustive list, and I am sure that experts in the field can provide far more depth than I can here. I am not here as another commentator; as a tech entrepreneur, I am here as a problem solver.
HealthyDebate offers a solution to every one of these issues. It may seem a bold claim; let me back it up.
Any mistakes published on HealthyDebate will be swiftly called out by the collective effort of subject matter experts.
This is worth the experts’ time and effort because their response could become the definitive rebuttal to that mistake, seen by potentially millions of people. Such engagement carries clear rewards, as they build their reputation, gain followers, or promote their other writing.
In the sure knowledge they will face such scrutiny, authors would be more careful about what they publish.
Whenever they are proven to have made a mistake, HealthyDebate enables authors to refine their argument in a new version. The goal is not to shame them for making a mistake, but to incentivise learning from it to produce a better argument, for the benefit of all their readers and the collective search for truth.
HealthyDebate requires all contributors to transparently identify all the sources they base their claims upon. These sources themselves can be the subject of debates, over their quality, relevance and reliability.
Instead of scraping the entire internet for information, HealthyDebate incentivises authors to use only the best evidence to support their arguments. The most eloquent argument will not stand in HealthyDebate if its foundations are seen to be rotten.
At every stage in the process, HealthyDebate is open and upfront with users.
The HealthyDebate framework leverages the community to ask the right questions, challenging what ought to be challenged. Critical information relevant to a subject can be brought to light without the need for user prompts.
For example, when an LLM cites a study, most users would accept it at face value. Someone knowledgeable in the field, however, may know that the study was flawed, and that many other studies contradict its findings. Through HealthyDebate, their insight is brought to light for the benefit of all other users.
Unlike LLMs that rely on statistical patterns to guess what should be said, HealthyDebate is grounded in genuine human understanding. The people who contribute can be fully aware of the context, empathise, reason, and come up with novel arguments.
The core function of HealthyDebate is to host a contest for the best articulation of each side of an issue. It is the antidote to echo chambers where people may only hear one side of an issue or see strawman versions of opposing views.
It is designed to be the ultimate arena for the contest of ideas, expressed civilly and in good faith. Fostering an environment where ideas can be challenged respectfully is important to appeal to, and engage, people from across the political spectrum. People can develop far more nuanced positions after seeing the best counterarguments.
HealthyDebate offers an unprecedented way for people to test their beliefs, fostering open-mindedness instead of reinforcing biases.
Unlike static LLMs that produce the same answers regardless of user interaction, HealthyDebate hosts a dynamic, evolving knowledge base. Through rigorous debate the content produced by the system continually incorporates those ideas with merit, while exposing and discarding those without substance.
When you win a debate here, it really matters. Everyone benefits from the advancement of knowledge that user engagement provides.
HealthyDebate’s open-source framework and transparent moderation eliminate the risk of hidden developer biases. By giving users agency and making all ranking criteria publicly accessible, the platform does not artificially bolster any side. Contributors must justify their claims with verifiable sources, and community oversight holds authors accountable, preventing the amplification of biased or inaccurate outputs.
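To show what “publicly accessible ranking criteria” could mean in practice, here is a purely hypothetical sketch of an open scoring rule. HealthyDebate has not published an actual algorithm, so every weight and field name below is invented for illustration.

```python
from dataclasses import dataclass

# A purely hypothetical sketch of an open ranking rule. None of these
# weights or field names come from HealthyDebate; they only illustrate
# that a public formula can itself be inspected and debated.
@dataclass
class Argument:
    source_quality: float    # 0-1, community-rated reliability of cited sources
    rebuttals_survived: int  # challenges answered without concession
    flags_upheld: int        # community flags confirmed by moderators

def rank_score(arg):
    # Every weight is visible, so the criteria can be contested like any claim.
    return 0.6 * arg.source_quality + 0.1 * arg.rebuttals_survived - 0.3 * arg.flags_upheld

print(rank_score(Argument(source_quality=0.9, rebuttals_survived=3, flags_upheld=0)))
```

The point is not these particular numbers, but that an open formula can be inspected, and contested, like any other claim on the platform.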
HealthyDebate leverages human contributors who bring empathy and social awareness to discussions. Unlike LLMs that simulate emotions, the platform’s community of experts and users provides nuanced, context-sensitive responses, particularly in delicate topics. Moderation guidelines prioritize respectful dialogue, ensuring responses align with the emotional and social needs of users.
HealthyDebate reduces reliance on energy-intensive LLM computations by crowdsourcing knowledge from human experts. Generating a response on demand requires on the order of a hundred times more energy than serving a static text file. This approach delivers high-quality, verified information without the resource demands of traditional AI models.
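As a back-of-envelope check on that hundred-fold figure, the sketch below uses rough assumed numbers rather than measurements: roughly 3 Wh is a commonly cited ballpark for a single LLM query, and roughly 0.03 Wh a plausible cost for serving a cached text page.

```python
# Back-of-envelope comparison using assumed figures, not measurements:
# ~3 Wh per generated LLM response (a commonly cited ballpark) versus
# ~0.03 Wh to serve a pre-written text page from a cache.
llm_query_wh = 3.0       # assumed energy per generated response
static_page_wh = 0.03    # assumed energy to serve stored text

ratio = llm_query_wh / static_page_wh
print(f"On-demand generation: ~{ratio:.0f}x the energy of serving stored text")
```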
HealthyDebate’s structured debate format ensures contextual coherence. Debates are organized to preserve the flow of arguments, with clear references to earlier points and sources. Users are incentivised to flag inconsistencies, ensuring long discussions remain relevant and accurate, unlike LLMs that struggle with extended context.
HealthyDebate addresses outdated information by encouraging real-time contributions from experts and users. The platform’s debate system allows new evidence to be introduced and debated, ensuring information stays fresh. Users can flag outdated claims, prompting rapid updates, unlike LLMs that may rely on static or cherry-picked data.
HealthyDebate connects users with subject matter experts who provide deep, specialized knowledge, surpassing the shallow generalizations of LLMs. The platform’s reputation system rewards contributors with proven expertise, ensuring users access authoritative insights in niche fields, reducing the risk of errors or hallucinations in complex topics.
HealthyDebate avoids the diminishing returns of LLM scaling by relying on a scalable human network. Instead of exponentially increasing compute for marginal gains, the platform grows through community participation, where more contributors lead to richer, more accurate discussions. This model delivers consistent improvements without the energy costs of larger AI models.
HealthyDebate’s transparent, community-driven approach ensures accountability at every level. Authors, moderators, and even the platform’s algorithms are subject to scrutiny, with clear mechanisms for feedback and correction. This culture of accountability addresses the systemic flaws of LLMs, building a trustworthy, self-correcting knowledge ecosystem.
Many of these problems arise from the goal of providing the right answer in a single attempt. LLMs inevitably fall short of it.
With its mission to seek truth, HealthyDebate also seeks the right answer. In contrast, though, it does not expect to deliver it in one attempt. Instead, it provides a framework where the answer can be sought through the collective effort of the world’s best minds, through the fires of critical debate and many, many iterations. The result of this process will be answers far better than anything an LLM can produce, with transparency and accountability.
To illustrate with a simple analogy: LLMs are trying to score a hole-in-one. Succeeding would be a truly amazing display of skill, but they consistently come up short. HealthyDebate, in contrast, allows as many strokes (and as many golfers) as it takes to sink the ball.
HealthyDebate is not designed to be a direct competitor to these AI LLMs, but could be a complement. It will offer far better training data for LLMs than alternative sources such as Wikipedia or Reddit.
HealthyDebate.org is a not-for-profit organization, incorporated in the United States (in Delaware) to benefit from First Amendment protections.
It will apply for 501(c)(3) status so donations can be tax-deductible.
It will be crowdfunded to avoid even the perception of capture by special interests.
Impartiality is more than a principle. It’s a strategic necessity.
If we want everyone at the table, we have to build something that earns their trust.
The public crowdfunding campaign has not yet launched, and that’s intentional.
People are far more likely to donate when a cause is recommended by people they know and trust, when experienced leaders are involved, and when it shows clear signs of momentum. Before going public, the goal is to build a strong foundation by gathering those trusted endorsements, recruiting experienced leaders, and demonstrating early momentum.
But most importantly, I’d ask you to please share this. It’s the only way a spark becomes a wildfire.
Or, at least, prepare your arguments. The debates that shape the future are coming.
Be part of the solution.
Be seen to be part of the solution.
Support HealthyDebate.org.