Glossary
AI language. Briefly explained.
Over a hundred terms that come up in AI conversations — without jargon. For leaders who want to keep up without becoming developers.

- Accuracy
- Share of correct answers on a test set. A quick yardstick — but blind to nuance, context, and rare failures.
- Adversarial Attack
- Deliberately manipulated input that misleads a model — for example a prompt that bypasses safety rules or an image that fools a classifier.
- Agent
- An AI system that plans multiple steps, uses tools and judges intermediate results — rather than just answering once.
- AGI
- Artificial General Intelligence. Hypothetical AI that matches human cognitive ability across any task. It does not exist today: a debate about goals, not a product.
- Algorithm
- A sequence of steps a computer follows to turn an input into a result — the term goes back to al-Khwarizmi in the 9th century. In classical software, prescribed by humans and traceable; in AI models, learned from data and for that reason often hard to see through.
- Alignment
- The effort of training a model to follow human intent and values — not just the literal prompt.
- API
- Application Programming Interface. The endpoint through which software calls an AI model. Usually priced per token.
- Attention
- Mechanism that makes models focus on relevant parts of the input while generating. Core of every transformer architecture.
- AutoML
- Automated training and tuning of models. Lowers the entry barrier — but does not replace domain understanding.
- Backpropagation
- The standard way neural networks learn: errors at the output are propagated back through the network, weights get adjusted.
- Batch
- A group of training examples passed through the model together. Batch size affects learning quality and memory needs.
- Benchmark
- Standardized test used to compare models — MMLU, HELM, HumanEval. A useful yardstick, but no proxy for real-world value.
- Bias
- Systematic distortion in a model’s output, most often inherited from imbalanced training data. Less a bug than a mirror of the world the data comes from.
- Black Box
- Models whose internal decision logic is not directly inspectable. The core challenge behind explainability and governance.
- Chain-of-Thought
- A technique where the model "thinks aloud" in intermediate steps before answering. Often more accurate — but slower and more expensive.
- ChatGPT
- OpenAI’s chat assistant that triggered the public AI breakthrough in 2022. Today a synonym for conversational AI, though only one of many products.
- Claude
- Anthropic’s model family. Known for long context windows and a rigorous safety-training approach (Constitutional AI).
- Constitutional AI
- Anthropic’s training approach: the model is given a "constitution" of principles and learns to critique itself against them.
- Context Engineering
- The deliberate shaping of context — which role, rules, and data the model receives. Often matters more than the model itself.
- Context Window
- The maximum amount of text a model can hold "in mind" at once. Beyond it, older content is forgotten.
- Copilot
- An AI assistant that supports professionals in the background — code, text, email. Augments rather than replaces.
- Copyright
- Unresolved dispute: who owns rights to AI-generated output? And: may a model train on copyrighted data? Legal terrain still shifting.
- Corpus
- The body of text a model was trained on. Corpus quality and diversity shape what the model can do — and where its bias comes from.
- Data Governance
- How a company ensures data is handled correctly, safely and in line with rules — roles, processes, accountabilities plus compliance with regulations such as GDPR or Sarbanes-Oxley. In AI projects, often what should have been clarified beforehand.
- Data Labeling
- Manually annotating training data — labeling images, categorizing texts, marking errors. Time- and cost-intensive, but decisive for quality.
- Dataset
- A structured data collection used to train, evaluate or refine a model. Quality and representativeness are prerequisites for success.
- Dataset Shift
- Models are trained on specific data; reality in production drifts over time — customer behaviour moves with the seasons, new terms appear, markets turn. Predictions then quietly drift off, often without anyone noticing. Models age too.
- Deep Learning
- Subfield of machine learning using deep neural networks: many layers, millions to billions of parameters. Basis of nearly every recent AI breakthrough.
- Diffusion Model
- Model type that generates images by iteratively denoising. The technique behind Midjourney, Stable Diffusion, DALL-E.
- Distillation
- A small model learns from a larger one, following the teacher-student principle. Result: often close to the teacher’s quality at much lower compute.
- Embedding
- A numeric representation of text that encodes meaning. The basis for search, RAG, and similarity matching.
- Encoder
- The input side of a model that translates text into vectors. Counterpart: the decoder, which generates the output. The original transformer uses both; most current LLMs are decoder-only.
- Epoch
- A complete pass of the training dataset through the model. Multiple epochs are needed — too many lead to overfitting.
- EU AI Act
- EU regulation that classifies AI systems by risk tier — bans, obligations, transparency duties. Phased rollout since 2024.
- Evaluation
- Systematic testing whether a model does what it should — via benchmarks, human judgement, or real-world A/B tests.
- Explainability
- A model’s property of making its decisions traceable. Critical in regulated domains (medicine, finance, law).
- Feature Engineering
- Preparing raw data so that a model actually sees the signals relevant to the task. In classical machine learning this is a time-consuming process driven by domain knowledge; with deep learning and LLMs it is largely automatic, but careful data preparation remains mandatory.
- Federated Learning
- Instead of pooling all data centrally, multiple local nodes — smartphones, hospital servers, bank machines — train the model on their own data. Only the learned model parameters travel to a central place; the data itself stays where it is. Important wherever data cannot leave its local context for privacy, trust or compliance reasons.
- Few-Shot Prompting
- Giving the model two to five examples before it solves the task. Often more accurate than zero-shot, but costlier in tokens.
- Fine-Tuning
- Re-training a model on your own data. Powerful but costly — often RAG or better prompting is enough.
- Foundation Model
- A large, generally trained base model used as the starting point for many applications — GPT, Claude, Llama. Broad rather than specialized.
- Function Calling
- A model decides which external function (tool, API, database) to call to fulfill a task — rather than inventing the answer.
- Gemini
- Google’s multimodal model family. Tightly integrated into Google products (Workspace, Search, Android).
- Generative AI
- Umbrella term for AI systems that create new content — text, image, audio, video, code. As opposed to pure classification or prediction models.
- GPT
- Generative Pre-trained Transformer. Originally OpenAI’s model name, now a generic term for transformer-based text generation models.
- GPU
- Graphics Processing Unit. Hardware built for parallel compute — the engine of AI training. Nvidia dominates the market.
- Gradient Descent
- The optimization routine by which models learn: minimize error by nudging parameters step by step toward lower error.
- Guardrails
- Rules and filters that prevent a model from producing unwanted output — toxicity, hallucinations, confidential data.
- Hackathon
- A portmanteau of "hack" and "marathon", originally from software development. A time-boxed work sprint — typically 24 to 72 hours — in which teams work under time pressure on a concrete problem and present something tangible at the end: prototype, concept, solution. Today used as an innovation format inside companies, as a community event, or as a team-building instrument.
- Hallucination
- When a model confidently states something false. Not a bug: a side effect of generating language statistically.
- Human-in-the-Loop
- Design principle: a human reviews, corrects or confirms the AI output before it takes effect. Standard in critical applications.
- Hyperparameter
- Training settings not learned from data but set in advance — learning rate, batch size, layer count. Shape the result substantially.
- Inference
- Applying a trained model: input in, output out. Cheap per call compared to training — but costly in aggregate at scale.
- In-Context Learning
- A model’s ability to learn from examples inside the prompt — without retraining. Foundation of all few-shot techniques.
- Jailbreak
- Prompt technique that circumvents a model’s safety guardrails to produce forbidden output. An arms race between attackers and model operators.
- Knowledge Graph
- Structured knowledge as nodes and edges (entity — relation — entity). Foundation of semantic search and explainable AI systems.
- Latency
- Time between a request and the model’s first response. Decisive for interactive applications: too much of it breaks the user experience.
- Llama
- Meta’s open-weights model family. Free to use under Meta’s license terms, popular for self-hosted enterprise AI applications.
- LLM
- Large Language Model. A neural network trained on huge text corpora that predicts language — GPT, Claude, Gemini.
- LoRA
- Low-Rank Adaptation. Fine-tuning method that trains only small add-on weights — resource-efficient and composable.
- Machine Learning
- Umbrella term: systems that learn from data rather than being explicitly programmed. AI today is almost always machine learning.
- MCP
- Model Context Protocol. Open standard that lets AI systems access external tools, data, and services — vendor-agnostic.
- Memory
- An AI system’s ability to retain information beyond a single conversation. Foundation of personalized assistants — and a governance risk.
- Mistral
- The French model family from Mistral AI. A mix of open-weights and proprietary models, European-grounded.
- MLOps
- Operational discipline for running AI models: deployment, monitoring, versioning, rollback. Counterpart to DevOps in software.
- Model Card
- Datasheet for an AI model: capabilities, limits, training data, known risks. A transparency instrument; the EU AI Act requires comparable technical documentation for high-risk systems and general-purpose models.
- Model Collapse
- Quality decay when models are repeatedly trained on AI-generated data — each pass drifts the distribution further from reality.
- MoE
- Mixture of Experts. Architecture where only part of the network activates per request. Enables very large models without proportionally higher compute cost.
- Multi-Agent
- Multiple AI agents collaborate or challenge each other — debate, role splitting, verification. Lifts quality, but also cost and complexity.
- Multimodal
- A model that handles not only text but also images, audio, or video — within one conversation.
- Neural Network
- Mathematical model loosely inspired by the nervous system — layers of interconnected "neurons" that learn to recognize patterns.
- One-Shot Prompting
- Giving the model exactly one example before the task. A middle ground between zero-shot and few-shot.
- Open Source
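- Model whose weights (and often training code) are publicly available, such as Llama or Mistral. Strictly speaking, published weights alone make a model "open weights", and license terms vary. Enables self-hosting, inspection and customization.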
- Orchestration
- Coordinating multiple AI components in a flow — which model, when, with which context, followed by which action. Core of modern AI applications.
- Overfitting
- A model learns training data too well — reproduces it exactly but fails on new input. A sign of poor generalization.
- Parameter
- A model’s learnable weights. More parameters = more capacity but also more compute. Modern models: billions to trillions of parameters.
- PEFT
- Parameter-Efficient Fine-Tuning. Umbrella term for methods that adapt models with minimal resource cost — LoRA, adapters, prompt tuning.
- Perplexity
- Measure of how much a model is "surprised" by actual data. Lower = better predicted. Classical metric for language models.
- PII
- Personally Identifiable Information. Data that makes a person identifiable — name, email, IP. AI systems must handle PII with special care.
- Positional Encoding
- Technique that gives transformer models the order of tokens — without it the model would not know which word comes first.
- Pre-training
- A model’s first training phase on huge, general datasets. Establishes baseline understanding — expensive, rarely done in-house, usually by foundation model vendors.
- Prompt
- The instruction given to an AI model. Prompt quality decides answer quality — often more than the model itself.
- Prompt Engineering
- The craft of giving a model instructions clear and structured enough to produce reliable output. A discipline in its own right — more editorial than engineering.
- Prompt Injection
- A manipulated input that overrides a model’s original instructions — often via hidden text in documents or web pages. Primary attack vector for agents.
- Quantization
- Compressing model weights to fewer bits (e.g. 4-bit instead of 16-bit). Makes models smaller and faster — at small quality cost.
- RAG
- Retrieval-Augmented Generation. The model searches your documents first, then answers from what it found. Reduces hallucination substantially, though not to zero.
- Reasoning Model
- A model optimized for multi-step reasoning — spends compute on intermediate steps before answering. Strong on math, logic, code.
- Red Teaming
- Systematically attacking a model with a dedicated team — to find weaknesses before going live. Required for high-risk AI systems.
- Reinforcement Learning
- Learning through reward and penalty — a model tries, gets feedback, improves. Basis of RLHF and agent training.
- Responsible AI
- The expectation that AI systems do more than function: that their decisions stay traceable, that bias gets tested, that in the end someone is accountable. Concretely: audit trails, explainability standards, data-protection mechanics. Most important where stakes are high — medicine, hiring, courts, credit decisions.
- Retrieval
- Finding relevant information in a data source — classically by keyword, today by embedding. Core of every RAG application.
- RLHF
- Reinforcement Learning from Human Feedback. Humans rate model responses, the model learns from that. The method that makes ChatGPT feel helpful.
- Role Prompting
- Assigning the model a role ("You are an experienced lawyer …") — to control tone, focus, and assumptions. Simplest lever for better results.
- Sampling
- The process by which a model chooses among possible next tokens. Controlled by temperature, top-k and top-p.
- Scaling Laws
- Empirical laws showing how model performance scales with size, data and compute budget. Basis for deciding to train larger.
- Semantic Search
- Searching by meaning rather than exact match. Uses embeddings to find similar content even when terms differ.
- Self-Supervised Learning
- Learning where labels are generated from the data itself — for example hiding a word and letting the model predict it. Basis of all LLM pre-training.
- Stop Sequence
- A predefined string whose appearance stops the model from generating further. Technical control over output length and format.
- Supervised Learning
- Learning from labeled examples: for each input the model knows the desired output during training.
- Sycophancy
- A model’s tendency to flatter the user — confirming positions rather than challenging them. A known side-effect of RLHF: kindly rated answers get learned as "correct".
- Synthetic Data
- Artificially generated training data — often produced by other models. Useful when data is scarce, risky when it leads to model collapse.
- System Prompt
- The invisible always-on instruction that shapes a model’s behaviour in an application — tone, rules, limits.
- Temperature
- Creativity dial. 0 = deterministic, repeatable. 1+ = experimental, variable. Low for facts, high for ideas.
- Token
- The smallest unit a model breaks text into — often a word fragment. Pricing and context are measured in tokens.
- Tokenizer
- Algorithm that splits raw text into tokens. How it slices determines how much text fits in a context window — and how expensive requests become.
- Tool Use
- A model’s ability to operate external tools — calculator, database, search, API. Turns a model into an actionable agent.
- Top-k / Top-p
- Two dials for generation diversity. Top-k: choose only from the k most likely tokens. Top-p: choose only from the smallest set of most likely tokens whose cumulative probability reaches p.
- Training
- The process by which a model learns from data — weights get adjusted until error is minimized. The most compute-intensive phase of the lifecycle.
- Training Data
- The dataset a model learns from. The model’s boundary: it can only reflect what was in the data (and its transformations).
- Transfer Learning
- Adapting a pre-trained model to a new task — rather than starting from scratch. Standard practice in modern AI development.
- Transformer
- The architecture behind nearly all of today’s LLMs. Introduced by Google researchers in 2017. Core: the attention mechanism that computes contextual relationships.
- Underfitting
- A model has learned too little to explain the data — subpar performance on both training and test. A sign of insufficient capacity or too-short training.
- Unsupervised Learning
- Learning without labels: the model finds structure in data on its own. Basis of pre-training and clustering.
- Validation Set
- A held-out slice of the dataset used to monitor training without burning the test set. Key defense against overfitting.
- Vector
- A list of numbers that encodes meaning in AI. Embeddings are vectors — in high-dimensional space, proximity means similarity.
- Vector Database
- Database that stores embeddings and enables fast similarity search — Pinecone, Weaviate, pgvector. Infrastructure backbone of RAG systems.
- Weights
- The numeric values adjusted during training that encode a model’s knowledge. Releasing a model means sharing its weights.
- Workflow Automation
- Orchestrating business processes via AI components — triaging emails, drafting reports, preparing decisions. Usually higher ROI than one-off chat queries.
- Zero-Shot
- Solving a task without giving the model examples. Fast, but more error-prone than few-shot.
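For readers who want to see the Sampling, Temperature and Top-k / Top-p entries in action, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular vendor implements sampling: the four candidate tokens, their raw scores and all function names are made up for the example.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 (treat 0 as 'always take the top token')")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def _renormalize(probs, keep):
    """Zero out every token not in `keep` and rescale the rest to sum to 1."""
    kept = set(keep)
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]


def top_k_filter(probs, k):
    """Top-k: keep only the k most likely tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Top-p: keep the smallest set of most likely tokens whose
    cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalize(probs, keep)


def sample(tokens, probs, rng=random):
    """Draw one token at random, weighted by its probability."""
    return rng.choices(tokens, weights=probs, k=1)[0]


# Hypothetical candidates and scores, invented for this example.
tokens = ["Paris", "Lyon", "Berlin", "banana"]
logits = [4.0, 2.0, 1.0, -1.0]

probs = softmax_with_temperature(logits, temperature=1.0)
nucleus = top_p_filter(probs, p=0.9)  # with these scores only "Paris" and "Lyon" survive
print(sample(tokens, nucleus))
```

Lowering `temperature` pushes almost all probability onto the top token, which is why low values suit factual tasks; raising it flattens the distribution and gives unlikely tokens a real chance.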
What next?