It’s Not AI, and It Wasn’t Built to Be Accurate

I'd already been getting frustrated with the discourse around LLMs, but watching the latest Humane demo and fallout put me over the edge.

In my last post on this topic, I said, “GPT has no concept of accuracy or correctness, so it's wild to see... well-known companies placing this volatile service in front of their customers with relatively few precautionary measures.” I thought I was being a bit punchy, but time has proven this take to be tame.

Humane has just launched their long-awaited flagship product, the Humane Ai Pin. Despite the dry video, I was taken with seeing a sci-fi-like device brought to life. I assumed that Humane had somehow overcome the accuracy problem, as they surely wouldn't launch an entire product predicated on the assumption that an LLM left to its own devices would return accurate information.

I was wrong. Humane released a prerecorded and edited launch video featuring not one but two wholly inaccurate responses from the “AI” pin.

Humane is receiving a lot of flak about this right now, but they're far from the only ones with this issue. My research today led me to google “What are the technical differences between GPT-3 and GPT-4?”, and the big bold pull quote at the top of Google's results confidently explained how much more powerful GPT-3 is than GPT-4.

LLMs are breaking the internet.

ChatGPT (and LLMs in general) weren't built for this. LLMs were built with the singular goal of emulating human language; sounding human isn't the same as being accurate, although some level of the latter is required to accomplish the former. Any incidental accuracy is a byproduct of the fact that they're trained to sound authentic.
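
To make that concrete, here's a deliberately toy sketch in Python. It is nothing like how a real model works internally, and the tiny “training data” and prompt are invented for illustration, but it captures the shape of the objective: pick the continuation that best matches the training text. Notice that nothing in it ever checks whether the answer is true.

```python
# A made-up "language model": a table of how often each next word followed a
# given context in some imagined training text. Real LLMs are vastly more
# sophisticated, but the training objective has the same shape: predict a
# plausible next token. Nothing below checks whether the output is correct.
next_word_counts = {
    "the capital of australia is": {"sydney": 8, "canberra": 2},
}

def next_word_distribution(context):
    counts = next_word_counts[context]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

def generate(context):
    dist = next_word_distribution(context)
    # The only goal: pick the continuation that best matches the training
    # data. "Sounds right" wins, even when it's factually wrong.
    return max(dist, key=dist.get)

print(generate("the capital of australia is"))  # prints "sydney"
```

In this toy, “sydney” wins simply because it appeared more often in the imagined training text, which is exactly the kind of confident, plausible, wrong answer that ends up in a launch video or a search pull quote.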

After all this LLM-bashing, I do have to say that this technology can be useful. To make the most of it, however, we need to understand its limitations and build accordingly.