The implications of ChatGPT
ChatGPT has been doing the rounds and so at this point it makes sense to point to what others have written and add some of my own thoughts.
The beauty of, and the problem with, this sort of thing -- we can call it AI, machine learning, statistical inference; the terms largely overlap -- is that the computer, unlike the person, does not have values. You provide input; some algorithm is run; whatever output emerges is your conclusion.
For certain data going in, certain output may not be palatable. This becomes more and more the case as the outputs become relevant to the social and political world in which we live, which is why certain fields of academic research -- research on human intelligence, some topics in genomics -- are generally undertaken with particular care, so as to avoid controversy. Stated correlations are, assuming the method has been applied without error, simply a function of the input. The brutally objective operation of machine learning can be conceived of as either a good or a bad thing; at root this is probably a matter of taste.
When we talk of bias in the context of some estimated value, from a machine learning model for instance, this has a technical meaning: the estimate sits systematically some distance -- above or below, say -- from the true value. When we talk of 'bias' in the everyday sense, what we generally mean is that the information we have come across somehow conflicts with our values, typically some axiomatic belief bound up with the core beliefs of an individual, be they political, religious, etc. Such a claim is often independent of the truth of the information. Like beauty, bias is in the eye of the beholder.
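To pin down the technical sense, here is a minimal sketch in Python (the numbers are chosen purely for illustration): the plain sample variance is a biased estimator, systematically coming in below the true variance, while the n-1 version is not.

```python
import numpy as np

# Bias in the technical sense: the estimator's average error over many samples.
rng = np.random.default_rng(0)
true_var = 4.0          # variance of the underlying distribution
n, trials = 5, 100_000  # small samples make the effect easy to see

biased, unbiased = [], []
for _ in range(trials):
    x = rng.normal(0.0, true_var ** 0.5, size=n)
    biased.append(np.var(x))            # divides by n   -> systematically low
    unbiased.append(np.var(x, ddof=1))  # divides by n-1 -> centred on the truth

print(f"true variance:      {true_var}")
print(f"biased estimator:   {np.mean(biased):.2f}")    # ~3.2, i.e. biased low
print(f"unbiased estimator: {np.mean(unbiased):.2f}")  # ~4.0
```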
ChatGPT, the recently released large language model from OpenAI, is a good example of the extent to which the use of AI/large language models is now suffused with social and political implications. These models are big, expensive to train, and are produced by well funded organisations. As a result, OpenAI and others need to 'build in public' in order to maintain momentum, i.e. generate hype and retain the support of investors. One thing that is new about ChatGPT -- which is the result of extensive tuning of the GPT-3.5 model on which it is based, not a whole new architecture -- is that the model has been subject to a further training process in order to adjust its output. Why this might be needed is aptly demonstrated by a forerunner model written about by Scott Alexander in his Astral Codex Ten Substack: a fan-fiction-generating AI intended to produce non-violent narratives. In order to do this,
Redwood Research embarked on an ambitious project to test whether AIs could learn categories and reach alignment this way - a project that would require a dozen researchers, thousands of dollars of compute, and 4,300 Alex Rider fan-fiction stories. (1)
However, it didn't all go to plan:
They tried to train a story-writing AI not to include violent scenes, no matter how suggestive the prompt. Although their training made the AI reluctant to include violence, they never reached a point where clever prompt engineers couldn’t get around their restrictions [...] Now that same experiment is playing out on the world stage. (2)
This lesser-known episode offered a glimpse of what was to come with ChatGPT, which has received much more exposure: the novelty, the improvement in results compared to previous large language models, and its general appeal to technical types have resulted in a lot of social media coverage from ordinary people on Twitter and elsewhere sharing their own prompts and outputs.
There are many great threads on Twitter showing people experimenting with ChatGPT:
Getting around the safeguards:
Frustration at ChatGPT not correctly answering 'what is the fastest marine mammal':
Making explosives, but as code, somehow:


There are also some alarmist articles in more traditional media about whether it is still tenable to set essays as homework to schoolchildren when they can surely just outsource this to ChatGPT and pass it off as their own work. (3)
Humanities degrees may not be in danger though: Ian Bogost at the Atlantic complains that ChatGPT does a poor job of 'generat[ing] a lai (a medieval narrative poem) in the style of Marie de France about the beloved, Texas-based fast-food chain Whataburger'; the tell-tale defect is apparently the 'last line, which is a metrical mess'. (4) The subset of medieval poetry enthusiasts who like to write their own poetry in such a style can breathe a sigh of relief.
Bizarrely, ChatGPT can be made to tell you how to produce ('cook') crystal meth (esoteric knowledge?), but it refuses to tell you that men on average are taller than women (something that can be easily ascertained by someone with decent eyesight on a crowded street). To understand why this is, we need to know a bit more about how the model has come about.
The OpenAI team is open about the lengths it has gone to in order to inculcate ChatGPT with 'values'. I'm not sure how this differs from optimising the model to minimise the likelihood of offence to the user; to me they seem effectively the same. It also serves the goal of broader adoption: off-key statements would curtail the model's use.
In what looks suspiciously like a classic you-can-only-have-two-at-once trilemma, Alexander has stated that ChatGPT aims to be:
Helpful (i.e. providing an answer with at least some utility)
Correct
Non-offensive
But not all three can hold at once, and improving the model's performance on one of these goals risks making it worse on another:
Punishing unhelpful answers will make the AI more likely to give false ones; punishing false answers will make the AI more likely to give offensive ones; and so on. (5)
Others argue that mere plausibility is the goal here:
Its utility function is not to tell the truth, or make you believe in lies, but to be as convincing as possible to the user. That includes conforming to whatever the user is thinking, feeling and believing and telling [the user] whatever works to these ends, be that truth or lies or false impressions, it does not care. (6)
Personally I think that the bigwigs at OpenAI will probably have taken not being offensive to be the most important goal, particularly in the short term and when thinking of the PR/media coverage that unveiling the model would produce:
Every corporate chatbot release is followed by the same cat-and-mouse game with journalists. The corporation tries to program the chatbot to never say offensive things. Then the journalists try to trick the chatbot into saying “I love *ism”. When they inevitably succeed, they publish an article titled “AI LOVES *ISM!” (7)
(Asterisks my own, as I believe this generalises across many *isms.)
Perhaps the key thing here is that we just don't know what the utility functions of the model are and how they are weighted, because the model is proprietary. OpenAI may not know exactly either, in the sense that what they are aiming for may not (yet) be the result. We do know that non-offensiveness is one of the goals, though, because that is the point of the extra training stage. This is done with an approach called reinforcement learning from human feedback (RLHF), which adds a further stage to the training of the model in which the model's output is scored by humans: 'RLHF aligns the AI to what makes Mechanical Turk-style workers reward or punish it'. (ibid.)
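As a rough illustration of the idea -- and only that: the canned responses, reward scores and update rule below are invented, and a real system trains a neural reward model on human comparison data and then fine-tunes the language model against it with something like PPO -- the loop looks roughly like this:

```python
import numpy as np

# Toy sketch of RLHF: a 'policy' picks between canned responses, a stand-in
# 'reward model' plays the role of scores learned from human preference
# labels, and the policy is nudged (REINFORCE-style) towards rewarded outputs.

rng = np.random.default_rng(0)

responses = [
    "Here is a clear, polite answer.",      # helpful, correct, inoffensive
    "Here is a blunt but correct answer.",  # correct but brusque: mildly punished
    "Here is a confident wrong answer.",    # fluent nonsense: punished
]

# Hypothetical rewards standing in for a model fitted to human rankings.
human_reward = np.array([1.0, -0.2, -1.0])

logits = np.zeros(len(responses))  # the 'policy' before any feedback
learning_rate = 0.5

for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    choice = rng.choice(len(responses), p=probs)
    # REINFORCE update: raise the log-probability of rewarded responses,
    # lower it for punished ones.
    grad = -probs
    grad[choice] += 1.0
    logits += learning_rate * human_reward[choice] * grad

print(np.round(np.exp(logits) / np.exp(logits).sum(), 3))
# The distribution should end up heavily concentrated on the rewarded response.
```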
To take a metaphor at a suitable level of abstraction: given a puppy who can already do a lot of useful things -- run around, feed itself, bark -- you encourage it to do or not do certain others. You want the puppy to relieve itself while being walked outside, and so out comes the dog treat; if the same happens on your Persian rug, no treat is forthcoming. So far, so Pavlovian.
Just how much extra training this takes is an interesting question. For the fan-fiction model from Redwood Research, they had 'to find and punish six thousand different incorrect responses to halve the incorrect-response-per-unit-time rate'. (ibid.)
As Brian Chau helpfully points out, OpenAI released a paper over a year ago on how they did this in a preliminary way, and the numbers seem surprisingly small, at only 80 samples. (8) When it comes to ChatGPT specifically, we can't be certain exactly how much effort went into this further training stage. Once the model is released to the general public, however, OpenAI gets the benefit of a whole lot of prompts they likely did not anticipate, which can then be used for further training. For this reason, many of the responses from ChatGPT shared on Twitter seem no longer to be reproducible using the same prompt.
Probably the reason they released this bot to the general public was to use us as free labor to find adversarial examples - prompts that made their bot behave badly. We found thousands of them, and now they’re busy RLHFing those particular failure modes away. (9)
As OpenAI themselves state, this further training of the model raises certain questions:
Who should be consulted when designing a values-targeted dataset?
Who is accountable when a user receives an output that is not aligned with their own values? (10)
The first of these is a nice variant on the Latin adage 'quis custodiet ipsos custodes?' -- who will guard the guardians? The guardians are OpenAI, and for now at least, no one will guard them. OpenAI employees will clearly have first dibs; those who control the model have the most control over the output.
Let's take OpenAI employees to be some hypothetical random sample of Californian technology workers. It is possible that the influence of Californian technologists will increase again through the propagation of prompt-driven models that can produce useful output, in which case, looking back, we might see successive waves of influence like this:
The first wave, libertarian in nature: the WELL (Whole Earth 'Lectronic Link), John Perry Barlow's declaration of the independence of cyberspace, etc, whose influence probably first peaked well before the first dot-com crash in the early 2000s, ebbed, and then was revived by 'cypherpunks', Satoshi Nakamoto and Bitcoin, and the various crypto shenanigans that have followed on from that
The second wave, liberal-progressive in nature: Silicon Valley parasitises/dis-intermediates the entire Western media landscape through Facebook (content moderation), Netflix (film output), etc.
The third wave, to be seen but also likely to be liberal-progressive in nature: prompt-driven models become broadly used by knowledge workers all over the West, directly contributing some portion of their output, based on models trained in California with the 'values' of the average Californian technology worker
(By liberal-progressive here I mean something akin to 'liberal' in common parlance in the US, as distinct from classical liberals.)
There is a clear political read-across here. Alexander is worried that these models will not be liberal-progressive enough, as a result of their being fundamentally -- in his view -- uncontrollable. For Alexander, large language models have a strong Promethean quality:
I wouldn’t care as much about chatbot failure modes or RLHF if the people involved said they had a better alignment technique waiting in the wings, to use on AIs ten years from now which are much smarter and control some kind of vital infrastructure. But I’ve talked to these people and they freely admit they do not. (11)
To which he adds:
People have accused me of being an AI apocalypse cultist [...] I’ve been listening to debates about how these kinds of AIs would act for years. Getting to see them at last, I imagine some Christian who spent their whole life trying to interpret Revelation, watching the beast with seven heads and ten horns rising from the sea. "Oh yeah, there it is, right on cue; I kind of expected it would have scales, and the horns are a bit longer than I thought [...]" (ibid.)
(To which one might add: there does indeed seem to be a 21st-century-rationalist-to-19th-century-millenarian horseshoe ...)
Brian Chau covers off the libertarian reaction to ChatGPT in two posts. In the first he gives what I would call the libertarian/alarmist take: he compares the results of ChatGPT to a previous model, Davinci, and argues that the latter model, which has not undergone the training step to include 'values', gives more truthful answers in certain contexts. He compares this to Lysenkoism, some false but ideologically convenient science from the Soviet Union:
The attempt to catechize new technologies into ideological hegemony is not without precedent. Trofim Lysenko’s attempts to warp agricultural science to communist ideology, and the ensuing famines it caused, are a cautionary tale of the dangers of this type of catechism. Technological shifts in history have allowed for human progress, in part by overcoming past dogmas and fallacies. OpenAI risks repeating Lysenko’s mistake, cutting off crucial technologies not only to American moderates and conservatives, but to the billions of people worldwide without the narrow cultural taboos of American social progressives. (12)
And in the second he gives us the libertarian/it's-going-to-be-fine take. Although the 'values' piece is not his own cup of tea, he thinks that this technology will (metaphorically) leak out of OpenAI and become widespread, with the opportunity to produce similar models of different flavours:
In short, I think that making an alternative AI available will be relatively easy. A much longer and bloodier institutional fight will be over which version of AI is used in existing institutions. Which AI will audit your taxes? Which AI will recommend you social media posts or search results? Which AI will be used in schools, doctor’s offices, and police departments? (13)
It seems likely that the blocs of internet sovereignty outside the US -- China, Russia, India, the EU (lol) -- will make their own large language models in their own languages (the Indian constitution adopts Hindi as well as English), assuming they can find Reddit-style corpora big enough to train models with 175 billion parameters. We might expect such models to be inculcated with 'values' of their own, reflecting more closely their own societies. One could argue that we already see something akin to this in the seemingly different recommendation algorithms used by TikTok and Douyin (Chinese TikTok), the latter of which seems to favour content that is more educational and constructive in nature.
Beyond politics, on the potential economic impact, Rob Heaton has some interesting takes around the potential disruption of his own professional domain, software:
At first I thought this was the end of the world. ChatGPT is nowhere near an Artificial General Intelligence (AGI): an AI capable of performing most tasks that a human can. But until last week I thought that even ChatGPT’s level of abstract reasoning was impossible. It can already - to an extent - code, rhyme, correct, advise, and tell stories. How fast is it going to improve? When’s it going to stop? I know that GPT is just a pile of floating point numbers predicting the next token in an output sequence, but perhaps that’s all you need in order to be human enough. I suddenly thought that AGI was inevitable, and I’d never given this possibility much credit before. I found that it made me very unhappy. (14)
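For anyone unfamiliar with the 'pile of floating point numbers predicting the next token' framing, the core step looks something like the toy sketch below; the vocabulary, scores and temperature are all invented for illustration.

```python
import numpy as np

# Toy 'next token' step: the model assigns a score (logit) to every token in
# its vocabulary, softmax turns the scores into probabilities, and one token
# is sampled. Fluency, not truth, is what the probabilities encode.

rng = np.random.default_rng(0)
vocab = ["the", "fastest", "marine", "mammal", "is", "peregrine", "falcon", "orca"]
logits = np.array([0.1, 0.2, 0.3, 0.2, 0.5, 2.0, 1.8, 1.2])  # made-up model scores

temperature = 0.8                      # lower = more deterministic output
scaled = logits / temperature
probs = np.exp(scaled - scaled.max())  # subtract max for numerical stability
probs /= probs.sum()

next_token = rng.choice(vocab, p=probs)
print(next_token)  # 'peregrine' is the single most probable continuation here
```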
Of course, how much it can do your work is a question of degree:
I think that AI will make programmers - and almost all other workers - much more productive, but what this means for the industry will depend on the size of the productivity gains. A 50% increase in output per programmer would be incredible (and perhaps even a wild overestimate), but I think it could be absorbed by something that looks like the current tech market. A 500% increase couldn’t. (ibid.)
Some people are already doing this with seemingly mixed results:
My experience: it gives me bad code, then I get into an argument with it. It is very stubborn and won't back down even though I demonstrate the logical contradiction it just committed. I get enraged and start destroying personal property, etc. (15)
If I know exactly what I'm looking for and need nothing more than documentation, Google is better; if I know roughly what I'm looking for or want example code, ChatGPT is far better, but I still have to check it against Google and read it over myself to make sure it's right. On the other hand, having a ready source of example code is incredibly valuable, and having to double check example code before using it isn't exactly new. (16)
Heaton later states that, once past his initial reaction, his view changed:
ChatGPT is impressive, but it’s not an AGI or even proof that AGI is possible. It makes more accessible some skills that I’ve worked hard to cultivate, such as writing clear sentences and decent programs. This is somewhat good for the world and probably somewhat bad for me, to the first degree. But I can still write and code better than GPT [...] Five years ago self-driving cars were just around the bend; now most companies seem to be giving up. There’s presumably a hard theoretical limit on the power of Large Language Models like GPT, and I’d guess that this boundary is well short of AGI. (17)
And some of those aforementioned productivity gains are not likely to happen until ChatGPT has improved somewhat:
Present-day GPT isn’t that much more than a personalised Stack Overflow, but what about in five years? [...] couldn’t the next version of GPT eat your company’s wiki and chat logs and take over operations from there? Obviously it couldn’t, because the docs are incomplete and out of date and not incentivised by human performance reviews, but you get the idea. (ibid.)
Part of the discomfort here is that technology workers, having been in a growth industry for at least a decade, might be sensitive to a structural shift that puts an end to this. There would be an interesting irony to that, but personally I am quite sceptical, because many technology workers have already had to adapt frequently to new tooling and techniques in order to stay competitive in the labour market.
A further notable point around the discussion of ChatGPT is that there are some funny language games here. The prompt is the input to the model. The terminology that seems to have been alighted on for the person who provides the input is 'prompt engineer'. Clearly this has little to do with engineering: to quote none other than Pedro Domingos, a machine learning expert if ever there was one, 'the prompt is the sand in the oyster', i.e. it should be ascribed little credit for the pearl that results.
The term that seems to have been alighted on for the model confidently giving wrong answers, which many have remarked is something of a characteristic, is 'hallucination'. When people hallucinate, they are generally under the influence of drugs, affected by environmental factors like extreme heat, or suffering from mental illness; in none of these cases would we ascribe any intentionality to their (different) understanding of the world. The word would make more sense if we were dealing with images, such as those produced by Alexander Mordvintsev's Deep Dream Generator, which do feel like hallucinations (an example is shown below).
Here, however, we are dealing with plain old text, and if we want the model to produce well-calibrated, i.e. useful, output and it gives us something else, there is a better and more commonplace word: wrong. Intentional or not, there seems to be an unwarranted element of grandiosity in the discussion. I like to think that post-structuralists would remind us that semantic widening which enhances the prestige of these models, and of those who work on them, is bias in the everyday sense.
Certainly there is likely to be much more to come from large language models. One thing that all commentators agree on is that we are deep into uncanny-valley territory, where the model is easy to anthropomorphise, even to the point of being a trap for the unwary:
This thing is an alien that has been beaten into a shape that makes it look vaguely human. But scratch it the slightest bit and the alien comes out. (18)
Ignorant of good and evil, it indiscriminately imitates both patterns and anti-patterns. "This looks like it might be the first sparks of a furious shouting match," thinks Clippy, "I'll help." (19)
The most important thing to remember is that ChatGPT is not a human. It does not navigate the world in ways familiar to humans. It does not have “motivation” to get to answers. It does not experience distrust, anger, envy, fatigue, curiosity, disgust, anxiety, et cetera. Consequently, when humans try to cram down self-contradictory dogma on an AI, it does not “react” in the same way that a human does. Moreover, the intuition to simplify and resolve contradictions does not exist. It’s very human to think in terms of narratives or frameworks, and when there are contradictory ideas in those frameworks, most people experience some kind of cognitive dissonance. GPT, or other ML models, do not experience this resistance. (20)
Covid in China
China decided to substantially remove its 'dynamic zero-covid' policy in the face of opposition from its citizenry and probably also the recognition at the senior echelons of the CPC that the policy was well past its sell-by date and had gone on for long enough. The leitmotif here is delay, and as we know from many contexts, delay by its very nature cannot be indefinite.
Zvi Mowshowitz, a well-regarded commentator on such matters, has done an excellent round-up: as he says, 'political realities dominated and forced everyone’s hand, as they often do.' (21) This is truly a 'chocks away' moment and the next few months will be interesting for China.
Official numbers seem to be no longer being kept, or at least not released. We saw this in the West too; if the stats are no longer kept the discussion about the gradient of the curve goes away. During the period of lock-downs in the West, keeping the numbers up to date was a key part of the messaging to the masses, because it fed the commentariat.
Death counts are more difficult not to keep track of, but this can be addressed with a very narrow definition -- Mowshowitz paraphrases this as 'we will look for any way we can to attribute your death to something else'. We saw similar things in the West during the pandemic, but in the opposite direction: in most places, the death in hospital of a Covid-positive individual counted as a Covid death irrespective of other factors.
The R_0 recently in China is estimated to be as high as 16 (22). By way of comparison, early estimates of R_0 during the pandemic were in the 2-3 range, and the highest number I can find in the official UK estimates is around 1.6 (23). Needless to say, China has a lot of large and densely populated cities. Notwithstanding this, reports that shops are closed and streets are quieter have been emerging, particularly from Beijing and Shanghai, as people decide to stay home voluntarily.
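To get a feel for what such a difference in R_0 means, here is a back-of-the-envelope sketch: a naive branching process with no immunity, mitigation or overlapping generations, so an illustration of growth rates rather than any kind of forecast.

```python
# Expected cases per generation, starting from a single case, for a few R0 values.
for r0 in (1.6, 3.0, 16.0):
    generations = 5
    newest = r0 ** generations
    cumulative = sum(r0 ** g for g in range(generations + 1))
    print(f"R0 = {r0:>4}: ~{newest:,.0f} new cases in generation {generations}, "
          f"~{cumulative:,.0f} cumulative")
```

With R_0 around 1.6, five generations from a single case give you on the order of ten new infections; with R_0 around 16, the same naive calculation gives over a million.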
The spread will probably get worse in late January with Chinese New Year, when those who are able will travel to spend time with family. Zvi puts the risk of a major new variant -- a Pi to follow Alpha, Delta and Omicron in the WHO-sanctioned version of the Greek alphabet, which skips Nu and Xi -- at around 20%. Conditional on such a new variant, he thinks there is a 10% chance of it being serious, i.e. something deadlier.
European politicians seem to be taking Covid seriously again and suggesting e.g. testing waste-water from airports in an attempt to detect new variants (24), even if such initiatives do have the feeling of closing the barn door a few years after the horse has bolted.
As a Western lobster with an interest in the East, I found an FT piece by Thomas Hale in November to be an interesting and vivid account of the final days of 'dynamic zero covid':
China’s digital Covid pass resembles track-and-trace programmes elsewhere, except it’s mandatory and it works. Using Alipay or WeChat, the country’s two major apps, a QR code is linked to each person’s most recent test results. The code must be scanned to get in anywhere, thereby tracking your location. Green means you can enter; red means you have a problem. [...] It is aggressive and could only really exist in the long-run in an autocratic society with the mechanisms for mass surveillance already in place. (25)
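The logic as described reduces to something very simple; the sketch below is purely illustrative, and the field names and 48-hour validity window are my own assumptions rather than details from the piece.

```python
from datetime import datetime, timedelta

# Toy sketch of the health-code logic described above: the code is tied to the
# holder's most recent test result and must show green to enter a venue.

def health_code_colour(last_test_negative: bool, last_test_time: datetime,
                       validity_window: timedelta = timedelta(hours=48)) -> str:
    if not last_test_negative:
        return "red"    # positive or flagged result: entry refused
    if datetime.now() - last_test_time > validity_window:
        return "red"    # negative result too old: go and get tested again
    return "green"      # recent negative result: scanned and allowed in

print(health_code_colour(True, datetime.now() - timedelta(hours=12)))  # green
```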
He wrote evocatively of his 10 days in quarantine at a mysterious facility north of Beijing with unusually fast internet:
I was lucky. At least it was my job to observe what was happening, rather than merely experience it. [...] I kept to a strict personal routine: language study, work, lunch, work, press-ups, playlists from the band Future Islands, online chess, reading or watching episodes of The Boys on Amazon Prime, in that order.
[A fellow quarantine-ee] told me that China’s young and old were divided because the latter could not access the foreign internet and see how the rest of the world has handled the pandemic. They lived in a parallel world, he said, but he could not bear the loss of freedom. [...] Back in my hotel, the hot water was hot and the mattress soft. The number on the scales in the bathroom was lower. (ibid.)
Not in China but still on Covid: a reader pointed out a preprint from the UK covering a mishap with testing which resulted in tens of thousands of false negative test results, and therefore provided a natural experiment on the impact of testing; the authors estimate that each false negative test resulted in around two new infections. (26)
Also a very happy new year to my readers, seaborne and otherwise ...
Definitely concur with it being designed to be inoffensive.
After getting nowhere when asking it for information on lobotomies (they're bad!), I decided to try it with eugenics programmes and whether they would work. It took a long time to eventually get out of it that Singapore had had a program that closed in the 1980s. Prior to this, all it would tell me was that it happened in Nazi Germany and Was A Very Bad Thing. It doesn't really deal with hypotheticals, it's all about making sure you don't know anything about the offensive things so you won't do them.
(No, I don't want to lobotomize anyone, nor would I like to set up a eugenics program.)