“One morning, I shot an elephant in my pajamas. How it got into my pajamas, I’ll never know.”
-Groucho Marx (as Captain Spaulding), “Animal Crackers” (1930)
Given how fast and powerful artificial intelligence (AI) computing technology has become over the past two decades, it’s humbling to think that the machines still can’t quite grasp the wit of Groucho Marx and a joke he made almost a century ago.
The challenge for the computers comes from trying to decipher and interpret the intent of human languages that use pronouns like “it,” “he,” and “she.” While this may seem like a trivially simple task to any red-blooded human being, for years AI tools had to spend a surprising amount of computational power relating pronouns, both written and spoken, to the words closest to them and then expanding outward word by word to derive the meaning of what was being said.
Human language was the first frontier for the promising and quickly developing AI tools known as transformers, which can decipher the relationships across an entire grouping of input data (words, calculations, pictures, etc.) rather than parsing everything bit by bit up close. In map-reading terms, transformers gain intelligence by looking at large pieces of the whole map first and then finding patterns and commonalities, rather than starting small at the neighborhood level and zooming out.
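To make that idea concrete, the sketch below shows the self-attention step at the heart of a transformer, written in Python with NumPy. It is a minimal illustration rather than production code: every token in a window is compared against every other token at once, instead of only against its nearest neighbors, and real transformers add learned query, key, and value projections, multiple attention heads, and many stacked layers.

```python
import numpy as np

def self_attention(tokens):
    """Scaled dot-product self-attention over an entire window of token vectors."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)                 # all-pairs similarity across the window
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)          # softmax over the whole window
    return weights @ tokens                                 # each token becomes a weighted mix of every token

window = np.random.randn(6, 4)        # 6 "tokens" with 4-dimensional embeddings
print(self_attention(window).shape)   # -> (6, 4)
```

Because each output row mixes information from the entire window, a pronoun’s representation can draw directly on a noun that appears dozens of words away.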
It’s a method that has shown tremendous promise in many areas of AI application, with the potential to greatly reduce the time and computing resources needed to complete tasks.
Dr. Bruce Porter, SparkCognition’s chief science officer, said transformers represent an evolution in neural network architecture: they use increasingly powerful computing technology to find relationships across an entire data sample and uncover larger meanings and patterns. After quickly training on comparatively small data samples, they can then use as little as 5% of the whole data set to reconstruct the rest or predict future performance.
Using the elephant/pajamas example, Porter said that starting by looking at the 100 words surrounding a pronoun gives much better insight than expanding outward gradually, with the effect of “putting the AI on steroids.”
“What transformers do is they take really big windows when studying data. And then based on what’s seen in that huge window, you get really accurate predictions and insights of what that word is actually saying and referring to,” he said.
Aside from making sense of a 1930s-era joke, Porter said transformer technology has been one of the most important pieces in the development of Generative Pre-trained Transformer 3 (GPT-3), the advancement that allows AI to generate human-like text at great speed. Another realm where the technology has shown tremendous power is interpreting images and video, with the ability to accurately recreate blurry or incomplete photos from only 5% of the total pixels, or to create realistic images of completely fabricated people based on photos of movie stars and celebrities.
Two idiots in a room
Looking further into the advancement of AI and machine learning, Porter is excited about the proven potential of generative adversarial networks (GANs), which pit two opposing models against each other to learn the correct patterns and traits from initial batches of training data. Tasked with de-blurring an image, a GAN framework uses a generator model that tries to correct or create an accurate image and a discriminator model that tries to judge the authenticity or accuracy of the generator’s output. Because only one side can win each contest, and the contests run at supercomputer speeds, accurate outputs can be created from very small amounts of data.
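As a rough illustration of that back-and-forth, here is a minimal, hypothetical GAN training loop in Python using PyTorch. The tiny models, data, and hyperparameters are assumptions made for the sake of the example, not SparkCognition’s implementation: the discriminator learns to tell real samples from the generator’s fakes, while the generator learns to produce fakes the discriminator accepts as real.

```python
import torch
from torch import nn

# Toy generator and discriminator for 16-dimensional "data" (e.g., flattened image patches).
# All sizes and hyperparameters here are illustrative assumptions.
latent_dim, data_dim = 8, 16
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)

real_data = torch.randn(256, data_dim)  # stand-in for the "gold standard" training set

for step in range(500):
    # Discriminator turn: learn to tell real samples from the generator's fakes.
    real = real_data[torch.randint(0, len(real_data), (32,))]
    fake = G(torch.randn(32, latent_dim)).detach()
    d_loss = loss_fn(D(real), torch.ones(32, 1)) + loss_fn(D(fake), torch.zeros(32, 1))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator turn: learn to produce fakes the discriminator scores as real.
    fake = G(torch.randn(32, latent_dim))
    g_loss = loss_fn(D(fake), torch.ones(32, 1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
```

Each side’s loss is the other side’s training signal, which is why, in Porter’s framing, both “idiots” keep improving together.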
Thinking of the competing models as “two idiots in a room, each trying to learn better than the other,” Porter said the results in recent years have shown how effective GANs can be as another methodology to test and employ in addition to transformer frameworks.
“In short order, you have a human-like judge that’s gotten ever better at distinguishing what the other idiot drew from what the gold standard drew, and you also get a drawer who began as an idiot who can draw as good as the gold standard,” he said.
At SparkCognition, both frameworks are currently being tested or used, with transformer technology under examination for possible use in predicting behavior inside turbines based on limited historical sensor data.
As neural network architectures continue to advance quickly, Porter said he’s excited to see what comes next, even if he’s shy about predicting what each new advancement will make possible.
After all, he admits to regularly being surprised at the ways researchers have managed to move technologies like transformers past their beginnings in natural language into areas that continue to expand the usability of machine learning.
“I’ve been doing this for decades and I still consistently underappreciate how far the technology can be applied, though I think the whole field has the same problem,” he said. “At the beginning, transformers were really cool at resolving pronouns, but would I have guessed that the next step is that now you can generate stories like GPT-3 is doing, or take it the next step further and better resolve computer images? No, it’s moving too quickly and in too big steps to really see these things happening or to make any predictions five years out.”