Few/Zero-Shot Learning
What is Few/Zero-Shot Learning?
Few-shot and zero-shot learning describe an AI model's ability to perform new tasks with very few examples (few-shot) or no examples at all (zero-shot). Traditional machine learning required thousands or millions of labeled examples to learn a task, but modern large language models can generalize from remarkably little information. In zero-shot learning, you simply describe the task in natural language, such as 'classify this movie review as positive or negative,' and the model performs it without ever having been explicitly trained on movie review classification. In few-shot learning, you provide a handful of examples (typically two to five) showing the pattern you want, and the model picks up on it immediately. This ability emerged as language models grew larger and is one of the most surprising and valuable properties of modern AI. It means businesses can deploy AI for new tasks almost instantly without collecting training data or building custom models for each use case.
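The difference between the two settings comes down to how the prompt is constructed. A minimal sketch, using the movie-review task above (the helper names and example reviews are illustrative, not tied to any particular model or API):

```python
# Sketch: building zero-shot vs. few-shot prompts for sentiment
# classification. The model never sees task-specific training data;
# everything it needs is in the prompt itself.

def zero_shot_prompt(review: str) -> str:
    """Zero-shot: describe the task in natural language, with no examples."""
    return (
        "Classify this movie review as positive or negative.\n"
        f"Review: {review}\n"
        "Label:"
    )

def few_shot_prompt(review: str, examples: list[tuple[str, str]]) -> str:
    """Few-shot: prepend a handful of input-output exemplars showing the pattern."""
    shots = "\n".join(f"Review: {text}\nLabel: {label}" for text, label in examples)
    return f"{shots}\nReview: {review}\nLabel:"

examples = [
    ("A stunning, heartfelt film.", "positive"),
    ("Two hours of my life I won't get back.", "negative"),
]
print(zero_shot_prompt("The plot dragged badly."))
print(few_shot_prompt("The plot dragged badly.", examples))
```

Either prompt would then be sent to a language model as-is; in the few-shot case, the model infers the labeling pattern from the two exemplars before completing the final "Label:".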
Technical Deep Dive
Few-shot and zero-shot learning in the context of large language models refer to in-context learning capabilities where models perform tasks from minimal or no task-specific examples provided in the prompt. Zero-shot learning requires only a natural language task description, relying on knowledge acquired during pretraining to generalize to novel tasks. Few-shot learning (k-shot, typically k=1-32) provides input-output exemplars in the prompt, enabling the model to infer the task pattern through implicit Bayesian inference over the in-context examples. GPT-3 (Brown et al., 2020) demonstrated that these capabilities emerge with sufficient model scale and diverse pretraining data. Theoretical frameworks include transformer in-context learning as implicit gradient descent (Akyurek et al., 2022) and Bayesian inference (Xie et al., 2021). Performance is sensitive to example selection, ordering, and formatting. In computer vision, zero-shot classification via CLIP aligns image and text embeddings to classify images into unseen categories. Meta-learning approaches like MAML and prototypical networks provide complementary few-shot solutions outside the LLM paradigm.
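The CLIP-style zero-shot classification mentioned above reduces to a nearest-neighbor search in a shared embedding space: the image embedding is compared against text embeddings of candidate label prompts, and the label with the highest cosine similarity wins. A minimal sketch, where the hand-written vectors are toy stand-ins for real CLIP encoder outputs:

```python
import numpy as np

# Sketch of CLIP-style zero-shot classification. In practice the
# embeddings come from CLIP's image and text encoders (e.g. for prompts
# like "a photo of a cat"); here, small toy vectors stand in for them.

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def zero_shot_classify(image_emb: np.ndarray, label_embs: dict) -> str:
    """Return the label whose text embedding best matches the image embedding."""
    scores = {label: cosine_sim(image_emb, emb) for label, emb in label_embs.items()}
    return max(scores, key=scores.get)

# Toy text embeddings for two candidate classes the model was never
# explicitly trained to classify.
label_embs = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "dog": np.array([0.1, 0.9, 0.0]),
}
image_emb = np.array([0.8, 0.2, 0.1])  # toy "image" embedding, closer to "cat"
print(zero_shot_classify(image_emb, label_embs))  # → cat
```

Because classes are represented by text, new categories can be added at inference time simply by embedding a new label prompt, with no retraining.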
Why It Matters
Few- and zero-shot learning are why you can ask ChatGPT or Claude to perform a completely new task, like classifying emails or translating jargon, without any special training, making AI instantly useful for novel problems.
Related Concepts
Part of
- Large Language Models (LLM) (related tech/concepts)