Zero-shot learning and the foundations of generative AI

This alternative to training with huge data sets has potential for business, but data science teams will need to spend time on research and experimentation.

We may remember 2022 as the year when generative AI went from the labs to mainstream use. ChatGPT, a conversational AI that answers questions, went from zero to one million users in under a week. Image generation AIs DALL-E 2, Midjourney, and Stable Diffusion opened public access and captured the world’s attention with the variety and quality of images generated from short phrases and sentences.

I admit to having some fun with DALL-E 2. I asked it to render “two lost souls swimming in a fishbowl” and “Tim Burton depicts the agony of opening an unripe avocado.”

“AI has generated headlines for projects such as self-driving vehicles like Tesla and Waymo, unbeatable game playing (think AlphaGo), and captivating art generation like DALL-E,” says Torsten Grabs, director of product management at Snowflake.

Many machine learning models use supervised learning techniques where a neural network or other model is trained using labeled data sets. For example, you can start with a database of images tagged as cats, dogs, and other pets and train a CNN (convolutional neural network) to classify them.

In the real world, labeling data sets at scale is expensive and complex. Healthcare, manufacturing, and other industries have many disparate use cases for making accurate predictions. Synthetic data can help augment data sets, but training and maintaining supervised learning models is still costly.

One-shot and zero-shot learning techniques

To understand generative AI, start with learning algorithms that don’t rely on large labeled data sets. One-shot and zero-shot learning are two such approaches, and they underpin many generative AI techniques.

Here’s how ChatGPT defines one-shot and zero-shot learning:

“One-shot and zero-shot learning are both techniques that allow models to learn and classify new examples with limited amounts of training data. In one-shot machine learning, the model is trained on a small number of examples and is expected to generalize to new, unseen examples that are drawn from the same distribution. Zero-shot learning refers to the ability of a model to classify new, unseen examples that belong to classes that were not present in the training data.”

David Talby, CTO at John Snow Labs, says, “As the name implies, one-shot or few-shot learning aims to classify objects from one or only a few examples. The goal is for humans to prompt a model in plain English to identify an image, phrase, or text with success.”

One-shot learning is performed with a single training example for each class, say a headshot of a new employee. The model then computes a similarity score between two headshots, such as a new photo of the person and the stored sample, and grants access only if the score clears a set threshold. A common benchmark for one-shot learning is the Omniglot data set, a collection of 1,623 hand-drawn characters from 50 different alphabets.
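The headshot comparison can be sketched in a few lines of Python. This is a minimal illustration, not a production face-matching system: the three-number embeddings are made up, the `verify` helper and its 0.9 threshold are hypothetical, and a real system would get its embeddings from a trained encoder network.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify(enrolled, probe, threshold=0.9):
    """Compare a probe embedding with the single enrolled example.

    Returns (is_match, score). The threshold is an assumed value that
    would be tuned on validation data in practice.
    """
    score = cosine_similarity(enrolled, probe)
    return score >= threshold, score

# Hypothetical embeddings standing in for the output of a trained encoder.
enrolled_headshot = [0.9, 0.1, 0.4]    # the one stored example
same_person = [0.85, 0.15, 0.38]       # new photo of the same employee
stranger = [0.1, 0.9, 0.2]             # photo of someone else

print(verify(enrolled_headshot, same_person))
print(verify(enrolled_headshot, stranger))
```

Only one stored example per person is needed because the encoder, not the per-person data, carries the knowledge of what makes two faces similar.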

In zero-shot learning, the network is trained on images and associated data, including captions and other contextual metadata. One approach uses OpenAI’s CLIP (Contrastive Language-Image Pretraining) to encode images into lower-dimensional embeddings, encode each candidate text label into the same embedding space, and compute a similarity score between the image and every label. The model can then assign a new image to the label with the highest score, even if no training image carried that label.
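The scoring step of this approach can be sketched as follows. The code does not call CLIP itself: the short vectors are invented stand-ins for the image and text embeddings a CLIP-style model would produce, and `zero_shot_classify` and its `temperature` parameter are hypothetical names chosen for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def zero_shot_classify(image_embedding, label_embeddings, temperature=0.1):
    """Score one image embedding against every text-label embedding.

    Returns a dict mapping each label to a probability, computed as a
    softmax over temperature-scaled cosine similarities.
    """
    img = normalize(image_embedding)
    sims = {
        label: sum(a * b for a, b in zip(img, normalize(vec)))
        for label, vec in label_embeddings.items()
    }
    max_sim = max(sims.values())  # subtract the max for numerical stability
    exps = {label: math.exp((s - max_sim) / temperature) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Toy text embeddings; a real system would encode these phrases with a text encoder.
label_embeddings = {
    "a photo of a cat": [0.9, 0.1, 0.2],
    "a photo of a dog": [0.2, 0.9, 0.1],
    "a photo of a zebra": [0.1, 0.2, 0.9],
}
# Toy image embedding that happens to sit closest to the "zebra" text.
image_embedding = [0.15, 0.25, 0.85]

probs = zero_shot_classify(image_embedding, label_embeddings)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

Adding a new class means adding one more phrase to `label_embeddings`; no images of that class and no retraining are required, which is the point of the zero-shot setup.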

OpenAI’s DALL-E 2 performs the reverse function, creating images from text by conditioning a diffusion model on CLIP embeddings.

Applications of few-shot learning techniques

One application of few-shot learning techniques is in healthcare, where medical images with their diagnoses can be used to develop a classification model. “Different hospitals may diagnose conditions differently,” says Talby. “With one- or few-shot learning, algorithms can be prompted by the clinician, using no code, to achieve a certain outcome.”

But don’t expect fully automated radiological diagnoses too soon. Talby says, “While the ability to automatically extract information is highly valuable, one-, few-, or even zero-shot learning will not replace medical professionals anytime soon.”

Pandurang Kamat, CTO at Persistent, shares several other potential applications. “Zero-shot and few-shot learning techniques unlock opportunities in areas such as drug discovery, molecule discovery, zero-day exploits, case deflection for customer-support teams, and others where labeled training data may be hard.”

Kamat also warns of current limitations. “In computer vision, these techniques work well for image recognition, classification, and tracking but can struggle in high accuracy/precision-requiring scenarios like identifying cancer cells and marking their contours in pathology images,” he says.

Manufacturing also has potential applications for few-shot learning in identifying defects. “No well-run factory will produce enough defects to have large numbers of defect-class images to train on, so algorithms need to be built to identify them based on as few as several dozen samples,” says Arjun Chandar, CEO at IndustrialML.
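One simple way to classify from a handful of samples is a nearest-centroid (prototype) classifier: average the few examples of each class, then assign a new item to the closest average. The sketch below is an illustration under stated assumptions, not IndustrialML’s method. The two-number feature vectors, the class names, and the helper functions are all hypothetical, and real features would come from a pretrained image encoder.

```python
import math

def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def build_prototypes(support_set):
    """Compute one prototype (mean feature vector) per class."""
    return {label: centroid(examples) for label, examples in support_set.items()}

def classify(features, prototypes):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(features, prototypes[label]))

# Hypothetical support set: only two or three labeled samples per class.
support_set = {
    "ok": [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15]],
    "scratch": [[0.9, 0.2], [0.8, 0.3]],
    "dent": [[0.2, 0.9], [0.3, 0.8]],
}

prototypes = build_prototypes(support_set)
print(classify([0.85, 0.25], prototypes))
print(classify([0.12, 0.18], prototypes))
```

Because each class is summarized by a single averaged vector, a new defect type can be added with a few samples by computing one more prototype.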

Conceiving next-gen AI solutions

Data scientists may try one-shot and zero-shot learning approaches to solve classification problems with unlabeled data sets. Two ways to get hands-on with the algorithms and tools are building a news-based alert system with Amazon SageMaker and applying zero-shot learning in conversational agents.

Developers and data scientists should also treat the new learning techniques and available models as building blocks for new applications and solutions, rather than as optimized, problem-specific models. For example, Chang Liu, director of engineering at Moveworks, says developers can leverage large-scale NLP (natural language processing) models rather than building their own.

“With the introduction of large language models, teams are leveraging these intelligent systems to solve problems at scale. Instead of building an entirely new model, the language model only needs to be trained on the description of the task and the appropriate answers,” says Liu.
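One common way to hand a language model a task description and sample answers is to put them directly in the prompt, with no retraining. The sketch below only assembles such a prompt as text; it does not call any model or vendor API. The `build_prompt` helper, the `Input:`/`Answer:` layout, and the support-ticket categories are assumptions made for illustration.

```python
def build_prompt(task_description, examples, query):
    """Assemble a few-shot prompt: task description, worked examples, then the new input.

    `examples` is a list of (input_text, answer) pairs. The prompt ends with
    an open "Answer:" line for the language model to complete.
    """
    lines = [task_description.strip(), ""]
    for text, answer in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Answer: {answer}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Answer:")
    return "\n".join(lines)

# Hypothetical task: routing IT support requests into three categories.
task = "Classify each IT support request as one of: password_reset, hardware, software_access."
examples = [
    ("I forgot my login and I'm locked out.", "password_reset"),
    ("My laptop screen flickers when I open the lid.", "hardware"),
]

prompt = build_prompt(task, examples, "I need a license for the design tool.")
print(prompt)
```

Passing an empty `examples` list yields a zero-shot prompt that relies on the task description alone; adding a few pairs turns the same call into a few-shot prompt.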

Future AI solutions may look like today’s software applications, with a mix of proprietary models, embedded commercial and open source components, and third-party services. “Achievements are within reach of almost any company willing to spend time defining the problem for AI solutions and adopting new tools and practices to generate initial and continuous improvements,” says Grabs of Snowflake.

We’ll likely see new learning approaches and AI achievements in 2023, so data science teams must continuously research, learn, and experiment.

Copyright © 2023 IDG Communications, Inc.