

You’re literally quoting marketing materials to me. For what it’s worth, I’ve already done more than enough research to understand where the technology is at; I dove deep into learning about machine learning in 2020, when AlphaFold 2 was taking the structural biology world by storm — I wanted to understand how it had done what it had, which started a long journey of accidentally becoming a machine learning expert (at least, compared to other biochemists and laypeople).
That knowledge informs the view in my original comment. I am (or at least, was) incredibly excited about the possibilities, and I do find much of this extremely cool. However, what has dulled my hype is how AI is being indiscriminately shoved into every orifice of society when the technology simply isn’t mature enough for that yet. Will there be some fields that experience blazing productivity gains? Certainly. But I fear any gains will be more than negated through losses in sectors where AI should not be deployed, or it should be applied more wisely.
Fundamentally, when considering its wider effect on society, I simply can’t trust the technology — because in the vast majority of cases where it’s being pushed, there’s a thoroughly distrustful corporation behind it. What’s more, there’s increasing evidence that this just simply isn’t scalable. When you look at the actual money behind it, it becomes clear that the reason why it’s being pushed as a magical universal multi tool is because the companies making these models can’t make them profitable, but if they can drum up enough investor hype, they can keep kicking that can down the road. And you’re doing their work for them — you’re literally quoting advertising materials for me; I hope you’re at least getting paid for it.
I remain convinced that the models that are most prominent today are not going to be what causes mass automation on the scale you’re suggesting. They will, no doubt, continue to improve — there’s so many angles of attack on that front: Mixture of Experts (MoE) and model distillation to reduce model size (this is what made DeepSeek so effective); Retrieval Augmented Generation (RAG) to reduce hallucinations and allow for fine-tuning of output based on a small scale based on a supplementary knowledgebase; reducing the harmful effects of training on synthetic data so you can do more of it before model collapse happens — there’s countless ways that they can incrementally improve things, but it’s just not enough to overcome the hard limits on these kinds of models.
My biggest concern, as a scientist, is that what additional progress there could be in this field is being hampered by the excessive evangelising of AI by investors and other monied interests. For example, if a company wanted to make a bot for low-risk customer service or internal knowledgebase used RAG, this would require the model to have access to high quality documentation to draw from — and speaking as someone who has contributed a few times to open-source software documentation, let me tell you that that documentation is, on average, pretty poor quality (and open source is typically better than closed source for this, which doesn’t bode well). Devaluing of human expertise and labour is just shooting ourselves in the foot because what is there to train on if most of the human writers are sacked.
Plus there’s the typical old notion around automation leading to loss of low skilled jobs, but the creation of high skilled roles to fix and maintain the “robots”. This isn’t even what’s happening, in my experience. Even people in highly skilled, not-currently-possible-to-automate jobs are being pushed towards AI pipelines that are systematically deskilling them; we have skilled computer scientists and data scientists who are unable to understand what goes wrong when one of these systems fucks up, because all the biggest models are just closed boxes, and “troubleshooting” means acting like an entry level IT technician and just trying variations of turning it off and on again. It’s not reasonable to expect these systems to be perfect — after all, humans aren’t perfect. However, if we are relying on systems that tend to make errors that are harder for human oversight to catch, as well as reducing the number of people trying to catch them, then that’s a recipe for trouble.
Now, I suspect here is where you might say “why bother having humans try to catch the errors when we have multimodal agentic models that are able to do it all”. My answer to that is that it’s a massive security hole. Humans aren’t great at vetting AI output, but we are tremendously good at breaking it. I feel like I read a paper for some ingeniously novel hack of AI every week (using “hack” as a general term for all prompt injection, jailbreak etc. stuff). I return to my earlier point: the technology is not mature enough for such widespread, indiscriminate rollout.
Finally, we have the problem of legal liability. There’s that old IBM slide that’s repeatedly done the rounds the last few years that says “A computer can never be held accountable, therefore a computer must never make a management decision.”. Often the reason why we need humans to keep an eye on systems is that legal systems demand at least the semblance of accountability, and we don’t have legal frameworks for figuring out what the hell to do when AI or other machine learning systems mess up. It was recently in the news about police officers going to ticket an automated taxi (a Waymo, I think) when it broke traffic laws, and not knowing what to do when they found it was driverless. Sure, parking fines can be sent to the company, that doesn’t seem too hard to write regulations for, but with human drivers, if you incur a large number of small violations, it’s typical to end up with a larger punishment, such as one’s driver’s licence being suspended. What would even be the equivalent level of higher punishment for driverless vehicles? It seems that no-one knows, and concerns like these are causing regulators to reconsider the rollout of them. Sure, new laws can be passed, but our legislators are often tech illiterate, so I don’t expect them to easily be able to solve what prominent legal and technology scholars are still grappling with. That process will take time, and the more that we see high profile cases like suicides following chatbot conversations, the cautious legislators will be. Public distrust of AI is growing, in large part because they feel like it’s being forced on them, and that will just harm the technology in the long run.
I genuinely am excited still about the nuts and bolts of how all this stuff works. It’s my genuine enthusiasm that I feel situates me well to criticise the technology, because I’m coming from an earnest place of wanting to see humans make cool stuff that improves lives — that’s why I became a scientist, after all. This, however, does not feel like progress. Technology doesn’t exist in a vacuum and if we don’t reckon with the real harms and risks of a new tool, we risk shutting ourselves off to the positive outcomes too.











There are no bad dogs, only bad dog owners. And whilst I’m sympathetic to owners of dogs with eldritch powers, I will absolutely hold them responsible if they own a dog that’s unsuited for their lifestyle and capability. If they weren’t up to the task, they should have gone for an easier to handle breed, like a border collie, or a husky.