In their 1995 paper, Watanabe et al. described how they trained pigeons placed in an operant conditioning chamber (Skinner box) in a conditional experiment, rewarding them with food for pecking when presented with a painting by the correct artist. The trained pigeons were able to discriminate between paintings by Chagall and Van Gogh with 95% accuracy on seen (trained) works by the two artists, and were still 85% successful on unseen paintings by the artists. The result demonstrated that the pigeons did not simply memorize the pictures, but were actually able to extract and recognize the “style” of the artists by generalizing from what they had already seen to make predictions about previously unseen pictures.
Indeed, Watanabe has extended the work to show how pigeons can be trained to discriminate between good and bad artwork by children. He first asked human volunteers to separate artworks by children into two piles, one of good artworks and the other of bad artworks (labeling). These piles were subsequently used to train the pigeons to perform the same classification, by giving them grain as a reward when they pecked on a good painting and no reward when they pecked on a bad painting.
In fact, after only 22 sessions (training epochs/iterations) the pigeons were able to distinguish between good and bad artworks. To eliminate the possibility that the pigeons were simply memorizing the artworks, he also tested their ability to generalize from what they had been trained on by showing them previously unseen good and bad artworks. Watanabe went on to perform other tests in which he scaled images down, showed images in grayscale, and mosaicked images so that the shape information was degraded. The pigeons were still able to distinguish between the good and bad children’s artworks.
The means by which these pigeons were trained to discriminate between artworks is the same means used to train so-called deep networks, and the accuracy obtained by those trained networks is comparable. In deep networks, the pigeons are replaced by computers, and labeled data is presented over multiple training iterations (epochs). On each iteration, the error between the network’s output and the correct category is calculated and fed back using back-propagation to adjust the network weights, so that the network gradually converges on the correct outputs, in the same way that grain was used to reward the pigeons.
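To make the training loop described above concrete, here is a minimal sketch in Python. It is not Watanabe’s procedure and not a deep network: it trains a single logistic unit (the simplest case of feeding an error back to adjust weights) on made-up data. The two numeric “style” features, the function names, the learning rate, and the 22 epochs are all illustrative assumptions. The classifier is scored on both the examples it was trained on and a fresh, unseen set, mirroring the seen/unseen test given to the pigeons.

```python
import math
import random


def make_paintings(n, rng):
    """Hypothetical stand-in for labeled paintings: two made-up numeric
    'style' features per painting, label 1 for artist A, 0 for artist B."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label == 1 else -1.0
        features = [rng.gauss(center, 0.7), rng.gauss(center, 0.7)]
        data.append((features, label))
    return data


def predict(weights, bias, features):
    """Probability, according to the current weights, that the label is 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=22, lr=0.1):
    """Show every labeled example once per epoch and nudge the weights
    against the prediction error (the 'grain' signal)."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def accuracy(weights, bias, data):
    """Fraction of examples whose thresholded prediction matches the label."""
    correct = sum((predict(weights, bias, f) >= 0.5) == (y == 1) for f, y in data)
    return correct / len(data)


rng = random.Random(0)             # fixed seed so runs are repeatable
seen = make_paintings(200, rng)    # examples used for training
unseen = make_paintings(200, rng)  # fresh examples, never trained on

weights, bias = train(seen)
print(f"accuracy on seen examples:   {accuracy(weights, bias, seen):.2f}")
print(f"accuracy on unseen examples: {accuracy(weights, bias, unseen):.2f}")
```

A real deep network stacks many such units in layers and uses back-propagation to carry the error back through all of them, but the loop has the same shape: present a labeled example, measure the error, adjust the weights, repeat.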
Although we may be quick to praise each of the various doodles and paintings that our children make, we nevertheless do guide them toward creating certain kinds of artwork. We hire teachers and buy materials so that our children can create better, more expressive art. And, we do so despite the fact that the teachers’ own definition of “good” or “bad” artwork may be unstable, difficult to define, and even more difficult to explain. Yet, when we ourselves stand in front of a series of paintings at an art museum and we’re asked to pick the “good” art from the “bad,” we don’t have any clear answers either: we simply know it when we see it.
This process of so-called supervised learning represents the state of the art in training deep networks, and it is critically dependent on humans (or possibly animals, if you can imagine battery chickens trained to classify images) to do the labeling.
So the question is – are you really afraid that an AI with the intelligence of a pigeon is going to take your job?