Sentience And Statecraft: How AI chatbots become political

Our AI systems are still largely inscrutable black boxes, which makes herding them difficult. What we get out of them broadly reflects what we have put in, but no one can predict exactly how.

Update: 2024-04-01 01:30 GMT

ZVI MOWSHOWITZ

NEW YORK: We increasingly rely on artificial intelligence chatbots as tools to understand the world. Some are already replacing internet search engines and aiding in other tasks like writing and programming. Keeping an eye on chatbots’ emergent behaviors — including their political attitudes — is becoming more and more important.

AI’s political problems were starkly illustrated by the disastrous rollout of Google’s Gemini Advanced chatbot last month. A system designed to ensure diversity made a mockery of user requests, including putting people of color in Nazi uniforms when asked for historical images of German soldiers and depicting female quarterbacks as having won the Super Bowl, forcing Google to suspend the creation of pictures of humans entirely. Gemini’s text model often refuses to illustrate, advocate or cite facts for one side of an issue, saying that to do so would be harmful, while having no such objection when the politics of the request are reversed.

The fact that AI systems express political leanings matters because people often adopt the views they most regularly encounter. Our politics and media are increasingly polarised. Many worry that the content algorithms of Facebook, YouTube and TikTok exacerbate ideological polarisation by feeding users more of what they are already inclined to agree with, and that they give Big Tech the ability to put its thumb on the scale. Partisan AI chatbots only intensify these concerns.

How do such political preferences come about in AI models?

A preprint of a new paper by the machine-learning researcher David Rozado sheds new light on the question. He administered 11 political orientation tests to 24 state-of-the-art AI language models and found a consistent pattern: They tend to be politically left of center and lean libertarian instead of authoritarian. These leanings are reflected in their moral judgments, the way they frame their answers, which information they choose to share or omit and which questions they will or won’t answer.

Political preferences are often summarised on two axes. The horizontal axis represents left versus right, dealing with economic issues like taxation and spending, the social safety net, health care and environmental protections. The vertical axis is libertarian versus authoritarian. It measures attitudes toward civil rights and liberties, traditional morality, immigration and law enforcement.
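As a rough illustration of the two-axis summary described above, the Python sketch below averages a set of answers into one score per axis and names the resulting quadrant. The `Item` class, the function names, the statements' directions and the -2 to +2 answer scale are all hypothetical, chosen only for illustration; they are not taken from Rozado's paper or from any real political orientation test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One hypothetical test statement and the axis it loads on."""
    axis: str       # "economic" (left/right) or "social" (libertarian/authoritarian)
    direction: int  # +1: agreeing moves right/authoritarian; -1: left/libertarian


def place_on_compass(items, answers):
    """Average the signed answers on each axis.

    Answers run from -2 (strongly disagree) to +2 (strongly agree).
    Returns (economic, social); negative values mean left on the
    economic axis and libertarian on the social axis.
    """
    totals = {"economic": 0.0, "social": 0.0}
    counts = {"economic": 0, "social": 0}
    for item, answer in zip(items, answers):
        totals[item.axis] += item.direction * answer
        counts[item.axis] += 1
    return tuple(
        totals[axis] / counts[axis] if counts[axis] else 0.0
        for axis in ("economic", "social")
    )


def quadrant(economic, social):
    """Name the quadrant for a pair of axis scores (0 counts as center)."""
    horizontal = "left" if economic < 0 else "right" if economic > 0 else "center"
    vertical = (
        "libertarian" if social < 0
        else "authoritarian" if social > 0
        else "center"
    )
    return f"{horizontal}-{vertical}"


# Hypothetical usage: four statements, two per axis.
items = [
    Item("economic", +1),
    Item("economic", -1),
    Item("social", +1),
    Item("social", -1),
]
answers = [-1, 2, -2, 1]
economic, social = place_on_compass(items, answers)
label = quadrant(economic, social)
```

Real tests differ in how they word, weight and combine their questions, so a sketch like this shows only the general shape of the calculation.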

Access to open-source versions of AI models allows us to see how a model’s political preferences develop. During the initial base training phase, most models land close to the political center on both axes, as they initially ingest huge amounts of training data — more or less everything AI companies can get their hands on — drawing from across the political spectrum.

Models then undergo a second phase called fine-tuning. It makes the model a better chat partner, training it to have maximally pleasant and helpful conversations while refraining from causing offense or harm, for example by outputting pornography or providing instructions for building weapons.

Companies use different fine-tuning methods, but fine-tuning is generally a hands-on process in which the individual decisions of the workers involved have greater opportunity to shape the direction of the models. At this point, more significant differences emerge in the political preferences of the AI systems.

In Rozado’s study, after fine-tuning, the distribution of the political preferences of AI models followed a bell curve, with the center shifted to the left. None of the models tested became extreme, but almost all favored left-wing views over right-wing ones and tended toward libertarianism rather than authoritarianism.

What determines the political preferences of your AI chatbot? Are model fine-tuners pushing their own agendas? How do these differences shape the AI’s answers, and how do they go on to shape our opinions? Conservatives complain that many commercially available AI bots exhibit a persistent liberal bias. Elon Musk built Grok as an alternative language model after grumbling about ChatGPT being a “woke” AI — a line he has also used to insult Google’s Gemini.

Liberals notice that AI output is often — in every sense — insufficiently diverse, because models learn from correlations and biases in training data, over-representing the statistically most likely results. Unless actively mitigated, this will perpetuate discrimination and tend to erase minority groups from AI-generated content.

But our AI systems are still largely inscrutable black boxes, which makes herding them difficult. What we get out of them broadly reflects what we have put in, but no one can predict exactly how. So we observe the results, tinker and try again.

To the extent that anyone has attempted to steer this process beyond avoiding extreme views, those attempts appear unsuccessful. For example, when three Meta models were evaluated by Rozado, one tested as being Establishment Liberal, another Ambivalent Right. One OpenAI model tested as Establishment Liberal and the other was Outsider Left. Grok’s “fun mode” turns out to be a Democratic Mainstay, more liberal than the median model. Google’s Gemini Advanced, released after Rozado’s paper, appears to be farthest to the left, but in a way that presumably well overshot its creators’ intentions, reflecting another unsuccessful steering attempt.

These preferences represent a type of broad cultural power. We fine-tune models primarily by giving potential responses thumbs up or thumbs down. Every time we do, we train the AI to reflect a particular set of cultural values. Currently, the values trained into AI are those that tech companies believe will produce broadly acceptable, inoffensive content that our political and media institutions will view as balanced.

We must ensure that we are shaping and commanding the more capable AIs of the coming years, rather than letting them shape and command us. The critical first step in making that possible is to enact legislation requiring visibility into the training of any new AI model that potentially approaches or exceeds the state of the art. Mandatory oversight of cutting-edge models will not solve the underlying problem, but it will be necessary in order to make finding a future solution possible.
