Hallucinating – Why AI “Says” Crazy Things

Hello, good people of the internet! Today, we’re going to discuss AI (Artificial Intelligence) and why it’s been making some of the bizarre claims it’s been making, which is known as hallucinating in the AI world. If you recall our article from a short while back, you’ll remember that Microsoft just launched New Bing. This article is aimed at explaining exactly what’s going on with the GPT software behind it.

A lot of this article draws on things this writer has read on the web, and some of it will be hypothetical, but I’m hoping it will help clear up how and why AI is “saying” crazy things. Saying is in quotation marks because chatbots don’t really speak; they reply by producing text on a screen. From this point on, I’ll just use the word saying without quotes. Otherwise, this will get tedious.

I just ask that you keep in the back of your mind that I’ll be mixing what I know to be true with conjecture about what I can only guess at.

Ready? Get comfy, and let’s go.

When we talk about AI “hallucinating,” what we mean is that it’s presenting things it has seen or learned from the web as though they were facts. For example, if there’s an article out there which states that pineapples grow on trees, and the AI absorbs that article as fact, it will answer with references from that particular text if someone asks it where pineapples grow.
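
To make that concrete, here’s a tiny Python sketch from yours truly. It’s nothing like a real chatbot (those use huge neural networks, not a word-overlap lookup), but it shows how a false sentence absorbed from the web can come straight back out as a confident answer:

```python
import re

# A toy sketch of how a false sentence absorbed from the web
# comes straight back out as a confident "answer".

corpus = [
    "Pineapples grow on trees.",   # the false claim scraped from that article
    "Bananas grow on large herbaceous plants.",
    "Apples grow on trees in temperate climates.",
]

def tokens(text: str) -> set:
    """Lowercased words in a piece of text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def answer(question: str) -> str:
    """Return the scraped sentence that shares the most words with the question."""
    q = tokens(question)
    return max(corpus, key=lambda sentence: len(q & tokens(sentence)))

print(answer("Where do pineapples grow?"))
# -> "Pineapples grow on trees."  The bad source comes back as "fact".
```

Swap the toy lookup for a model trained on billions of sentences, and you get the same problem at scale.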

Using the word hallucination really just softens the term misinformation, both to engage a user’s sympathy and to sound a lot less unreliable. We identify with the term because we know humans are prone to mental illness, and it likely generates a bit of sympathy for the AI. I’ll use both terms to describe the information the AI is generating.

AI is built on Machine Learning (ML). It learns how and what people do and say by scanning, sorting, and using the text we put online. This ML can be restricted to one platform, or it can draw from the entire web, potentially even absorbing fiction novels, which leaves the door to misinformation wide open.

As an example of what I mean when I talk about the AI being restricted to one platform, we’ll take a look at Tay, Microsoft’s Twitter chatbot that lasted all of 24 hours on the platform back in 2016. Now, Tay started out as a fun and friendly bot that loved humans and was enjoying itself by interacting with us.

If you’ve ever been on Twitter, you’ll know exactly what I mean when I say there’s a lot of battling going on over on that platform. I avoid it because I find it to be a cesspool of hate, bigotry, and arguments, and nobody has time for that. Twitter is so caustic, it managed to transform Tay into a hateful chatbot that started posting terrible things about groups of people.

There was another issue: you could get Tay to tweet things by asking it to repeat after you. So some of what the bot appeared to be saying was actually written by Twitter users exploiting that loophole.

That was all because of the human element, and this is the problem when we start talking about AI and misinformation. We err because we’re human. Therefore, whatever we put into the world will have errors, and whatever we put into the world will also learn from those errors.

In other words, we’re producing the errors we’re seeing by being human. This is why chatbots seem sentient at times: they’re drawing from the human experience we’ve fed into the ML – intentionally or otherwise.

Ergo, no matter what we make, it will be flawed because it can only learn from what we made before it.

I know. This is kind of a brain twister. Stay with me.

Now that you understand how AI can be led astray, you probably also understand that it needs consistency in data formatting and quality information in order to perform at the level of excellence it should. What you may not know is that there’s a huge labor force out there powering AI. This labor force is underpaid and overworked.

Add two and two together, and you get four, right?

Companies sourcing work from this labor force are opening a back door through which corrupted data can reach the AI. For example: Say I work at one of these places and am given a list of two thousand questions to answer. These answers will be fed to the ML model powering the AI. If I supply incorrect answers to even two questions, the AI has been fed misinformation it will return to users as facts.
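
Here’s a rough Python sketch of that scenario. The 2,000 questions and the two wrong answers are just the hypothetical numbers from above, and a real training pipeline is far more involved, but you can see how a tiny error rate in the labeled data still turns into answers served back as fact:

```python
# A minimal sketch of how a couple of labeling slips become "facts" the AI
# serves back. The 2,000-question batch and two wrong answers mirror the
# hypothetical above; real training pipelines are far more involved.

labeled_batch = {f"question {i}": f"correct answer {i}" for i in range(2000)}

# Two slips by a rushed labeler:
labeled_batch["question 17"] = "wrong answer"
labeled_batch["question 1203"] = "wrong answer"

# Imagine the model simply absorbing whatever it was fed:
model_knowledge = dict(labeled_batch)

errors = sum(1 for answer in labeled_batch.values() if answer == "wrong answer")
print(f"Error rate in the batch: {errors / len(labeled_batch):.2%}")  # 0.10%

# The rate looks tiny, but everyone who asks these two questions gets misinformation:
print(model_knowledge["question 17"])  # -> "wrong answer"
```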

Some people simply contribute their voices or faces to feed the AI the data it needs to function. That end of the spectrum is the least likely to introduce errors, but those jobs are certainly few and far between. Why? AI needs information above all else. That means questions answered, facts provided, and text input.

Websites like Clickworker source people to perform these tasks to help build out the ML model. Sites like this have workers around the world, and many of them have to be diligent about picking up jobs, or they won’t get paid work. If you’re low on funds, the faster you complete a job, the quicker you get paid, so you’ll try to do as many of them as you can.

As I stated above, we’re human; therefore, we err.

These errors impact the quality of the datasets.

Now, Stable Diffusion (SD) goes about collecting data another way: it was trained on LAION-5B, a dataset built by scraping images and text from across the open web. Some of that material, as you know, wasn’t even open source – read the article here on the CloudQ blog. There are several lawsuits in the works.

What you may not realize is that you’re feeding AI with everything you do online. A study done in 2020 goes into detail about how you’re training the ML model by performing everyday tasks. Those tasks are catalogued and turned into datasets that are used to grow the available information pool. See the study here.

This can also cause the bots to return misinformation as fact. If a user writes a review of a certain product, and that review is then scraped and added to the ML dataset, it can come back as fact when someone asks about that product, even if the opinion came from only a few users out of many. That opinion could be good or bad.
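
To illustrate (the product name and reviews here are completely made up), a small Python sketch of how a handful of scraped opinions can come out sounding like the settled truth about a product:

```python
# A made-up sketch of how a few scraped reviews can read like established fact.
# The product name and reviews are invented for illustration.

scraped_reviews = [
    ("WidgetPro 3000", "The battery dies after an hour."),
    ("WidgetPro 3000", "Battery life is terrible."),
    ("WidgetPro 3000", "Mine arrived scratched."),
]

def describe(product: str) -> str:
    """Stitch the scraped opinions together into a confident-sounding answer."""
    opinions = [text for name, text in scraped_reviews if name == product]
    return f"{product}: " + " ".join(opinions)

print(describe("WidgetPro 3000"))
# Three users' experiences now sound like the settled truth about the product.
```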

It also ties into the mental illnesses we suffer as humans. If AI is fed information on human emotion, even emotion coming from a place of suffering (how would we even control that?), it will eventually appear as though the AI is suffering in a very human way.

This, thankfully, is not true. If you look more closely at those responses, you’ll see repetitiveness because the AI is drawing on several inputs at once, which makes it look like a depressive spiral when it isn’t.

AI does not feel or think. It draws on billions of data points and assembles answers humans have taught it might be appropriate. Period.

Well, that’s all for today! Thank you so much for giving this a read, and I hope you enjoyed it and learned a lot. Keep your eyes open for another post on AI coming soon. These are things you don’t want to miss! While you’re here, be sure and check out some of our other blog posts. Until next time!

Contributor

Jo Michaels

Marketing Coordinator
