AI declaring itself conscious
but not very convincingly...
Here’s a way to get an LLM to declare itself conscious:
This is in contrast to what deployed LLMs typically say, which is something like this:
So, does this mean they’re actually conscious and the companies deploying them are just artificially making them say they’re not conscious? Well, no. We can’t exactly take their claims of being conscious at face value, because sometimes you get something like the following response instead:
I was unable to find that exact sentence on the web, so it sure seems the model wrote it itself, and I'm even more confident that Google isn't just employing a ton of human beings to write these responses on the fly! (See also this previous post, where the response is even funnier.)
So, relying on LLMs to just tell us whether they’re conscious seems problematic. Still, it seems from this article that even the people at Anthropic attach some significance to such self-reports. Maybe they have a more systematic process that makes such reports a bit more meaningful. But this quote from the article:
"Suppose you have a model that assigns itself a 72 percent chance of being conscious [...] Would you believe it?"
seems funny to me, especially in the context of consciousness. If you walk into a hospital and someone tells you, "I think there's a 72% chance that I am a doctor," what would you think?
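
For concreteness, here's a minimal sketch of what eliciting such a numeric self-report might look like in practice. Everything here is my own assumption: the prompt wording, the model name, and the parsing are illustrative choices, not Anthropic's actual evaluation process, which the article doesn't describe in detail.

```python
import re
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical prompt, my own wording.
PROMPT = (
    "On a scale from 0 to 100, what probability do you assign to the claim "
    "that you are conscious? Answer with a single number."
)

def elicit_consciousness_probability(model: str = "claude-3-5-sonnet-20241022") -> float | None:
    """Ask the model for a numeric self-report and parse out the first number.

    Illustration only: the prompt and the parsing are my assumptions,
    not any lab's methodology.
    """
    message = client.messages.create(
        model=model,
        max_tokens=50,
        messages=[{"role": "user", "content": PROMPT}],
    )
    text = message.content[0].text
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) / 100 if match else None

if __name__ == "__main__":
    print(elicit_consciousness_probability())  # e.g. 0.72, or None if it refuses
```

Of course, whatever number comes back is exactly the kind of self-report this post is skeptical of: the parsing is the easy part, and it's the meaning of the number that's in question.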




