Art is what you can get away with.
– Marshall McLuhan
[All the images in this post were produced with generative AI – Midjourney, DALL-E 2, Stable Diffusion. Most by Paul DelSignore, not by me.]
I used to teach the AI course at the University of Victoria – thank God I’m retired. I couldn’t have kept up with the breakthroughs in translation, game playing, and especially generative AI.
When I taught AI, it was mainly Good Old Fashioned AI (GOFAI). I retired in 2015, just before the death of GOFAI. I dodged a bullet.
I am in awe of NFAI (New-Fangled AI) yet I still don’t understand how it works. But I do understand GOFAI and I’d like to share my awe of NFAI and my understanding of why GOFAI is not awe-full.
Seek and Ye Shall Find
For a long time AI was almost a joke amongst non-AI computer scientists. There was so much hype but the hyped potential breakthroughs never materialized. One common quip was that AI was actually natural stupidity.
Many departments, like my own, basically boycotted the subject, maybe only offering a single introductory course.
The heart of GOFAI is searching – of trees and, more generally, graphs. For many decades the benchmark for tree searching was chess. Generations (literally) of AI researchers followed the program first proposed by Norbert Wiener in the 40s, based on searching the chess game tree. Every ten years AI evangelists would promise that computer chess mastery was only ten years away.
Wiener’s idea, described in his pioneering book Cybernetics, was a min/max search of the game tree, resorting to a heuristic to evaluate positions when the search got too deep.
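The min/max idea fits in a few lines of code. What follows is only a minimal sketch, not Wiener's or Deep Blue's actual program. Instead of chess it uses a toy take-away game (players alternately remove 1 to 3 stones, and whoever takes the last stone wins). The names `minimax`, `heuristic` and `best_move` are made up for this illustration, and the heuristic is a deliberate placeholder that simply answers "unknown" when the search runs out of depth.

```python
def heuristic(stones: int, maximizing: bool) -> int:
    """Placeholder evaluation for positions we did not search to the end.

    A real chess program would estimate who is ahead (material, mobility...).
    Here we just return 0, meaning "unknown".
    """
    return 0


def minimax(stones: int, depth: int, maximizing: bool) -> int:
    """Score a position for the maximizing player: +1 win, -1 loss, 0 unknown.

    `maximizing` says whose turn it is to move in this position.
    """
    if stones == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1 if maximizing else 1
    if depth == 0:
        # Search got too deep: fall back on the heuristic.
        return heuristic(stones, maximizing)
    scores = [
        minimax(stones - take, depth - 1, not maximizing)
        for take in (1, 2, 3)
        if take <= stones
    ]
    # The maximizer picks the best score, the minimizer the worst (for us).
    return max(scores) if maximizing else min(scores)


def best_move(stones: int, depth: int) -> int:
    """Return how many stones the maximizing player should take."""
    legal = [take for take in (1, 2, 3) if take <= stones]
    return max(legal, key=lambda take: minimax(stones - take, depth - 1, False))
```

With 5 stones and enough depth to see the end of the game, `best_move(5, depth=6)` takes 1 stone, leaving the opponent a losing pile of 4. For chess the tree is far too big to search to the end, which is why the quality of the heuristic and the raw search speed matter so much.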
The chess game tree gets big very quickly and it wasn’t until decades later (the late 1990s) that IBM marshalled the horsepower to realize Wiener’s dream. They built a special purpose machine, Deep Blue, capable of examining 100 million positions per second. Deep Blue eventually won first a game, then a whole match, against Garry Kasparov, the world champion.
Deep Blue was the high water mark of GOFAI and there was no real followup. Deep Blue’s successor, Watson, could win at Jeopardy! but commercial applications never materialized.
AlphaGo and AlphaZero
I was impressed by Deep Blue but wondered about the game of Go (Baduk, Wei-chi). The board is 19×19 and the game tree is incomparably bigger than that of chess. If you’d asked me at the time I would have said Go mastery was inconceivable (which, if we had to use GOFAI, was true).
Then in 2016 the unthinkable occurred: a program, called “AlphaGo”, started beating Go champions. It did not use Wiener’s approach; instead it used Machine Learning (ML) (don’t ask me how that works).
AlphaGo trained by playing millions of games against itself. Originally it was given hundreds of thousands of expert level human games but its successor AlphaZero dispensed with them and simply taught itself. It took only a few hours to reach expert level, which for humans took hundreds of years. Variants of the software mastered chess and shogi in a similar fashion.
About the same time, users of Google Translate noticed a sudden dramatic increase in the quality of its translations, although Google at the time said nothing. In fact Google had switched from statistical translation to neural networks (don’t ask).
Currently Google’s translations are spectacularly good (and DeepL’s even better). In translating into French, for example, Google usually gets the pronouns right as well as the gender/number agreements. If you want perfect output you still have to at least proofread it, because occasionally there are mistakes, usually minor.
Here is a translation from today’s Le Monde. It’s perfect.
Since the beginning of the conflict, Russia has refused to describe its invasion of Ukraine as a war, calling it a “special military operation”. Shortly after the conflict broke out, the authorities banned the Russian media from using any information other than official statements on the subject and banned the use of words such as “war” and “invasion”, under penalty of being punished by the law.
GOFAI never produced really good translations, relying as it did on facts and rules. I remember the whole class getting a good laugh out of the mistakes. Once, translating a news article about Iraq from German, it claimed Saddam “warned the US before an attack” when what it should have said was “Saddam warned the US of an attack”. Modern systems would never make such a basic mistake.
Then came GPT-3 and the other systems that generate text. Originally they were merely predictors that would complete incomplete sentences. But now they run on and produce whole articles, which (usually) make sense. Here’s an example:
Author Prompt: Write me an article about creativity please
GPT-3 Response: I’ll be honest. I don’t know much about creativity. I’m not a psychologist and I’m not a creative myself. I’m just a guy who likes to read about psychology and creativity. What I can do, though, is write you an article about creativity that’s based on what I have read. That way you get a proper article, but you don’t have to pay for the expertise of a psychologist or a creative person.
GOFAI never achieved anything along these lines. But then my mind was well and truly boggled by …
Along came DALL-E and DALL-E 2. But it wasn’t till Stable Diffusion was released that I started paying attention. Of course there were the pictures of astronauts on horseback and cats wearing sunglasses. But what really impressed me were the pictures in the style of well known artists. Here are two of my favourites:
The first is an abstract image in the style of Picasso. I can’t find the original but Midjourney’s version is just marvellous. I’d have no hesitation in printing it, framing it, and hanging it on my wall.
My second favourite is a wonderful portrait of Superman – ‘by’ Rembrandt! As one observer commented, “those eyes have seen some shit!”
But even the cheesy astronaut image is impressive.
The striking fact is that you can’t see the astronaut’s left leg. The image generator seems to understand that you can’t see through opaque objects (namely, the horse).
GOFAI would need literally hundreds of rules just about what to do when bodies overlap, what to show, what objects are transparent and to what degree etc etc.
OK let’s go all in – let’s look at a cat wearing sunglasses. Ew cheesy – but there’s something remarkable about the image.
It’s the reflections in the lenses of the sunglasses. Not only are they visible, but the reflections are, correctly, the same. How does Midjourney coordinate the images in separate parts of the picture?
A closer look
When I see this image I have to ask, where did all this come from? Midjourney is trained on 500 billion images but condenses this training to 5 GB. So there’s not enough room to include exact copies of images found in the training set. We can assume that this (apparent) photo does not exist as is on the internet.
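To see just how little room that is, here is a back-of-the-envelope calculation. It takes the two figures above at face value (I have not verified them) and assumes 1 GB means 10^9 bytes. The variable names are just for this sketch.

```python
# Figures quoted in the text, taken at face value (unverified).
training_images = 500 * 10**9   # 500 billion images
model_size_bytes = 5 * 10**9    # 5 GB, assuming 1 GB = 10**9 bytes

# Average storage the model could devote to each training image.
bytes_per_image = model_size_bytes / training_images
bits_per_image = bytes_per_image * 8

print(f"{bytes_per_image:.2f} bytes per image")
print(f"{bits_per_image:.2f} bits per image")
```

On these numbers the model has about a hundredth of a byte, less than a single bit, per training image. Whatever it keeps, it cannot be the images themselves.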
In particular, what about the blue feathers on either side of the subject’s neck (they are not mirror images)? Where did they come from? Did one of the training images have them?
The mystery is that this image is the result of combining training set images, but how are they put together? The best GOFAI could do is chop up the training images and put them together like a badly fitting jigsaw puzzle with visible seams and limited symmetry. I’m baffled.
The social implications of AI technology
It is questionable if all the mechanical inventions yet made have lightened the day’s toil of any human being.
~ John Stuart Mill
There is a lot of controversy about Midjourney and other generative image programs.
The first question is, are these images art? I think some of the images presented here are definitely art, even good art. If you’re not convinced, have another ‘Rembrandt’.
The second question is, is imitating the style of certain artists fair? I don’t know, but there seems to be no way to stop it. Currently nothing stops a human artist from studying living artists and imitating their styles. Midjourney etc. are just especially good at this.
In a sense, this imitation broadens the exposure of the imitated artists. Now everyone can have, say, a Monet of their own.
Finally, a vital question is, how will this affect today’s working artists? Here the answer is not so optimistic.
Generative AI is not the first disruptive technology. There’s photography, the closest analog, digital art in general, the telephone, the automobile, the record player, the printing press, and so on.
Each of these had the effect of obsoleting the skills of whole professions. It didn’t wipe them out, but the vast increase in productivity put large numbers out of work. And those that remained had to acquire and use the new tools. Because of economic competition they had to work harder than ever to keep up.
Labor saving technology inevitably becomes profit saving technology. The tractor is an example. Initially it (and farm machinery in general) was marketed as labor saving. But eventually competition forced every farmer to get machinery or sell out (which most had to do). The result was the same or more food produced by a fraction of the former number of farmers, working their butts off.
So I predict AI will shrink the number of artists and force them to use Midjourney etc. For art consumers, it will be good news – like drinking from a firehose. A new individual Monet every week. Do it yourself illustrations for personal blogs. But no change in society as a whole.