
Machine Learning Glossary/C

From VELEVO®.WIKI

Written by Sebastian F. Genter


C

calibration layer

Post-prediction adjustment aligning model outputs with observed distributions. Corrects systematic biases through:

  • Platt scaling (logistic calibration)
  • Isotonic regression

Critical for reliability in probabilistic forecasting.
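
A minimal sketch of Platt scaling using scikit-learn (synthetic data, illustrative only):

    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC
    from sklearn.calibration import CalibratedClassifierCV

    X, y = make_classification(n_samples=1000, random_state=0)

    # Wrap an uncalibrated margin classifier; method="sigmoid" fits a
    # logistic curve to its decision scores (Platt scaling).
    calibrated = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=3)
    calibrated.fit(X, y)
    probs = calibrated.predict_proba(X[:5])   # calibrated probabilities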

candidate generation

Initial recommendation phase filtering large catalogs to manageable options. Strategies:

  • Collaborative filtering
  • Content-based filtering
  • Embedding similarity search

Example: narrows a 100,000-item catalog to roughly 500 candidates for detailed ranking.
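
A minimal sketch of the embedding-similarity strategy in numpy (catalog size and dimensions are arbitrary):

    import numpy as np

    rng = np.random.default_rng(0)
    item_emb = rng.normal(size=(100_000, 64))    # catalog embeddings
    user_emb = rng.normal(size=64)               # query/user embedding

    # Normalize so the dot product equals cosine similarity.
    item_emb /= np.linalg.norm(item_emb, axis=1, keepdims=True)
    user_emb /= np.linalg.norm(user_emb)

    scores = item_emb @ user_emb
    top = np.argpartition(-scores, 500)[:500]        # unordered top 500
    candidates = top[np.argsort(-scores[top])]       # ranked candidates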

candidate sampling

Training optimization for large output spaces:

  • Evaluates all positive labels
  • Samples subset of negative labels

Reduces computation in scenarios like extreme classification (millions of classes).
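
A minimal sampled-softmax-style sketch in numpy (class count, sample size, and weights are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    num_classes, dim = 100_000, 32
    W = rng.normal(size=(num_classes, dim)) * 0.01   # output-layer weights
    h = rng.normal(size=dim)                         # example's hidden vector
    true_class = 42

    # Score the positive class plus a small random sample of negatives,
    # instead of computing all 100,000 logits.
    neg = rng.choice(num_classes, size=100, replace=False)
    classes = np.concatenate(([true_class], neg[neg != true_class]))
    logits = W[classes] @ h
    logits -= logits.max()                           # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    loss = -np.log(probs[0])                         # true class is index 0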

categorical data

Discrete features with finite possible values:

  • Nominal: Colors {red, blue, green}
  • Ordinal: Ratings {poor, fair, good}

Encoded via one-hot, embeddings, or target encoding.
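
A minimal encoding sketch with pandas:

    import pandas as pd

    colors = pd.Series(["red", "blue", "green", "blue"])
    one_hot = pd.get_dummies(colors, prefix="color")   # one binary column per value

    ratings = pd.Series(["good", "poor", "fair"])
    order = {"poor": 0, "fair": 1, "good": 2}          # ordinal values keep their order
    encoded = ratings.map(order)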

causal language model

Unidirectional models predicting next token using left context:

  • GPT architecture
  • Masked future tokens during training

Contrasts with bidirectional models like BERT.
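
A minimal numpy sketch of the causal attention mask that hides future tokens:

    import numpy as np

    seq_len = 5
    # Position i may attend only to positions <= i.
    mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))

    scores = np.random.default_rng(0).normal(size=(seq_len, seq_len))
    scores = np.where(mask, scores, -np.inf)           # mask out future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # each row sums to 1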

centroid

Cluster center in partitioning methods:

  • k-means: Mean of cluster points
  • k-median: Median reduces outlier sensitivity

Updated iteratively during clustering.

centroid-based clustering

Partitioning approach grouping data around central points:

  • k-means (most common)
  • k-medoids
  • BIRCH (hierarchical variant)

Requires predefined cluster count (k).
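
A minimal k-means sketch in numpy (initialization and iteration count kept deliberately simple):

    import numpy as np

    def kmeans(X, k, iters=10, seed=0):
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(iters):
            # Assignment step: nearest centroid per point.
            dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Update step: each centroid moves to the mean of its points.
            centroids = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ])
        return centroids, labels

    X = np.random.default_rng(1).normal(size=(200, 2))
    centroids, labels = kmeans(X, k=3)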

chain-of-thought prompting

LLM prompting technique that elicits step-by-step reasoning, e.g., "Calculate the gravitational force between Earth and Moon. Show the equations." Encourages explicit intermediate calculation rather than a direct answer.

chat

Conversational interface preserving dialog history:

  • Context window management
  • Multi-turn interaction tracking

Applications: Customer service bots, AI companions.

checkpoint

Model state preservation:

  • Training: Resume from interruption
  • Deployment: Version control

Contains weights, optimizer state, and metadata.
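
A minimal PyTorch-style sketch (file name and dictionary keys are arbitrary choices):

    import torch

    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # Save weights, optimizer state, and metadata together.
    torch.save({
        "epoch": 7,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, "checkpoint.pt")

    # Resume later from the same state.
    ckpt = torch.load("checkpoint.pt")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    start_epoch = ckpt["epoch"] + 1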

class

Discrete prediction category:

  • Binary: {spam, not_spam}
  • Multiclass: {cat, dog, horse}
  • Multilabel: Multiple simultaneous classes

classification model

Predictor outputting discrete labels:

  • Logistic regression (probabilistic)
  • SVM (maximum margin)
  • Decision trees (rule-based)

Contrasts with regression models.

classification threshold

Probability cutoff converting scores to classes:

  • Default 0.5 for binary
  • Tuning affects precision/recall tradeoff

Example: cancer screening favors a low threshold (high recall); spam filtering favors a high threshold (high precision).
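
A minimal threshold-sweep sketch in numpy (scores and labels are made up):

    import numpy as np

    probs = np.array([0.1, 0.4, 0.35, 0.8, 0.65])
    labels = np.array([0, 0, 1, 1, 1])

    for t in (0.3, 0.5, 0.7):
        preds = (probs >= t).astype(int)
        tp = np.sum((preds == 1) & (labels == 1))
        fp = np.sum((preds == 1) & (labels == 0))
        fn = np.sum((preds == 0) & (labels == 1))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        print(f"t={t}: precision={precision:.2f} recall={recall:.2f}")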

classifier

(See classification model)

class-imbalanced dataset

Skewed class distributions that challenge training:

  • Extreme example: a 1,000,000:1 negative-to-positive ratio
  • Mitigation: resampling, class weights (see the sketch below), or reframing as anomaly detection
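
One common mitigation is inverse-frequency class weighting (the heuristic behind scikit-learn's "balanced" option); a numpy sketch:

    import numpy as np

    labels = np.array([0] * 990 + [1] * 10)     # 99:1 imbalance

    # Rare classes receive proportionally larger weights.
    classes, counts = np.unique(labels, return_counts=True)
    class_weights = len(labels) / (len(classes) * counts)
    per_example_weight = class_weights[labels]  # feed into a weighted loss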

clipping

Outlier handling techniques:

  • Feature clipping: Cap extreme values (e.g., cap ages > 100 at 100)
  • Gradient clipping: Prevent exploding gradients

Stabilizes training and numerical computations.
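
A minimal numpy sketch of both variants (values are illustrative):

    import numpy as np

    # Feature clipping: cap extreme values before training.
    ages = np.array([23, 45, 67, 250])          # 250 is a data-entry outlier
    ages_clipped = np.clip(ages, 0, 100)        # -> [23, 45, 67, 100]

    # Gradient clipping by norm: rescale when the norm exceeds a cap.
    grad = np.array([3.0, 4.0])                 # norm = 5.0
    max_norm = 1.0
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)         # norm now equals 1.0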

Cloud TPU

Google's tensor processing units:

  • Matrix multiplication optimization
  • Pod configurations for large models
  • Integrated with TensorFlow/JAX

clustering

Unsupervised grouping methods:

  • Centroid-based (k-means)
  • Density-based (DBSCAN)
  • Hierarchical (agglomerative)

Applications: Customer segmentation, anomaly detection.

co-adaptation

Neural network pathology where:

  • Neurons over-specialize to specific patterns
  • Reduces generalization

Mitigated via dropout regularization.

collaborative filtering

Recommendation technique using user-item interactions:

  • User-based: "Customers like you bought..."
  • Item-based: "People who bought X also bought Y"

Matrix factorization approaches (SVD, ALS).
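
A minimal item-based sketch in numpy (ratings matrix is made up; 0 means "not rated"):

    import numpy as np

    # Rows = users, columns = items.
    R = np.array([[5, 3, 0, 1],
                  [4, 0, 0, 1],
                  [1, 1, 0, 5],
                  [0, 1, 5, 4]], dtype=float)

    # Item-item cosine similarity over rating columns.
    norms = np.linalg.norm(R, axis=0, keepdims=True)
    sim = (R.T @ R) / (norms.T @ norms)

    # "People who liked item 0 also liked..." (skip item 0 itself).
    ranked = [i for i in np.argsort(-sim[0]) if i != 0]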

concept drift

Data distribution shifts over time:

  • Feature meaning changes (e.g., "fuel efficiency" standards)
  • Requires model retraining/monitoring

Detection methods: Statistical process control.

condition

Decision tree splitting rule:

  • Axis-aligned: single feature threshold
  • Oblique: multiple feature combinations

Determines data partitioning path.

confabulation

(Also called LLM hallucination.) Plausible but incorrect generations:

  • Factual errors in summaries
  • Fictional citations

Mitigation: Retrieval augmentation, grounding.

configuration

Model setup parameters:

  • Hyperparameters (learning rate, layers)
  • Architectural choices (optimizer type)

Managed via config files (YAML/JSON) or libraries (Gin).

confirmation bias

Human tendency favoring information confirming existing beliefs:

  • Dataset collection bias
  • Labeling subjectivity
  • Metric selection skew

confusion matrix

Classification performance visualization:

Actual \ Predicted   Positive   Negative
Positive             TP         FN
Negative             FP         TN

Derives metrics: Accuracy, Precision, Recall, F1.
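
A minimal numpy sketch deriving the four cells and the metrics (labels are made up):

    import numpy as np

    actual    = np.array([1, 0, 1, 1, 0, 0, 1])
    predicted = np.array([1, 0, 0, 1, 0, 1, 1])

    tp = np.sum((predicted == 1) & (actual == 1))   # 3
    fn = np.sum((predicted == 0) & (actual == 1))   # 1
    fp = np.sum((predicted == 1) & (actual == 0))   # 1
    tn = np.sum((predicted == 0) & (actual == 0))   # 2

    accuracy  = (tp + tn) / len(actual)
    precision = tp / (tp + fp)
    recall    = tp / (tp + fn)
    f1        = 2 * precision * recall / (precision + recall)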

constituency parsing

Sentence structure analysis breaking text into nested phrases:

  • Noun phrases
  • Verb phrases
  • Prepositional phrases

Used in grammar checking, information extraction.

contextualized language embedding

Word representations adapting to context:

  • "Bank" → financial vs river meanings
  • BERT-style dynamic embeddings

Superior to static embeddings (Word2Vec).

context window

LLM input token capacity:

  • Early models: 512 tokens
  • Modern: 8k-128k tokens

Determines document processing capabilities.

continuous feature

Numerical variables with infinite possible values:

  • Temperature measurements
  • Sensor readings

Requires normalization/scaling for model stability.
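
A minimal scaling sketch in numpy (temperatures are made up):

    import numpy as np

    temps = np.array([12.3, 18.7, 25.1, 31.4, 9.8])

    # Z-score standardization: zero mean, unit variance.
    standardized = (temps - temps.mean()) / temps.std()

    # Min-max scaling to [0, 1].
    scaled = (temps - temps.min()) / (temps.max() - temps.min())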

convenience sampling

Non-probabilistic data collection:

  • Quick experiments
  • Potential sampling bias

Should transition to stratified/random sampling.

convergence

Training stability state when:

  • Loss plateaus
  • Parameter updates become negligible

Indicates model readiness (or local optimum).

convex function

Mathematical property enabling optimization:

  • Bowl-shaped curve
  • Single global minimum

Examples: L2 loss, logistic loss.

convex optimization

Minimizing convex functions:

  • Guaranteed convergence
  • Gradient descent variants

Foundation of linear models.

convex set

Geometric property where line between any two points remains within set:

  • Spheres
  • Cubes

Non-convex example: Star shape.

convolution

Matrix operation extracting spatial/temporal patterns:

  • Kernel sliding across input
  • Element-wise multiplication + summation

Basis for CNNs in image processing.

convolutional filter

Feature detector kernels:

  • Edge detection: [[-1,0,1], [-1,0,1], [-1,0,1]]
  • Learned during training

Depthwise separable variants reduce parameters.

convolutional layer

CNN component applying filters:

  • Stride controls overlap
  • Padding preserves dimensions
  • Channels manage feature depth

convolutional neural network

Architecture for grid-like data:

  • Convolution → Pooling → Dense
  • Local connectivity → Translation invariance

Dominates computer vision tasks.

convolutional operation

Feature map calculation process:

(I ∗ K)[i, j] = Σ_m Σ_n I[i − m, j − n] · K[m, n]

where I = input and K = kernel.
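
A minimal numpy sketch; like most deep-learning libraries, it computes the cross-correlation variant (no kernel flip), with "valid" padding and stride 1:

    import numpy as np

    def conv2d_valid(image, kernel):
        kh, kw = kernel.shape
        oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        for i in range(oh):
            for j in range(ow):
                # Element-wise multiply the window by the kernel, then sum.
                out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
        return out

    edge_kernel = np.array([[-1, 0, 1],
                            [-1, 0, 1],
                            [-1, 0, 1]])
    image = np.arange(25, dtype=float).reshape(5, 5)
    feature_map = conv2d_valid(image, edge_kernel)   # shape (3, 3)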

cost

(See loss)

co-training

Semi-supervised method using:

  • Multiple views of data
  • Complementary feature sets

Example: Web page classification using text + link structure.

counterfactual fairness

Fairness criterion requiring:

  • Same prediction for individuals differing only in protected attributes
  • "What if?" scenario analysis

Mathematically formalized through causal models.

coverage bias

Dataset incompleteness issues:

  • Missing population segments
  • Unrepresentative sampling

Leads to model underspecification.

crash blossom

Ambiguous phrasing that challenges natural language understanding: "Stolen painting found by tree"

  • Found near a tree (locative "by")?
  • Found by a tree, with the tree as the finder (agentive "by")?

Requires world knowledge for disambiguation.

critic

Reinforcement learning component:

  • Estimates value functions
  • Example: the Q-value approximator in DQN

Guides policy improvement through evaluation.

cross-entropy

Multiclass loss function:

H(p, q) = −Σ_x p(x) log q(x)

Minimized during classification training.
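
A minimal numpy sketch (the epsilon guard against log(0) is a common practical choice):

    import numpy as np

    def cross_entropy(p, q, eps=1e-12):
        # H(p, q) = -sum_x p(x) * log q(x)
        return -np.sum(p * np.log(q + eps))

    p = np.array([0.0, 1.0, 0.0])    # one-hot true label
    q = np.array([0.1, 0.7, 0.2])    # predicted distribution
    loss = cross_entropy(p, q)       # -log(0.7) ≈ 0.357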

cross-validation

Robust evaluation protocol:

  • k-fold data partitioning
  • Rotation of train/validation splits

Prevents overfitting to single split.
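
A minimal k-fold sketch in numpy (the "model" is a trivial mean baseline, for illustration):

    import numpy as np

    def kfold_indices(n, k, seed=0):
        # Yield (train_idx, val_idx) pairs covering all n examples.
        idx = np.random.default_rng(seed).permutation(n)
        for fold in np.array_split(idx, k):
            yield np.setdiff1d(idx, fold), fold

    rng = np.random.default_rng(1)
    y = rng.normal(size=100)
    scores = []
    for train_idx, val_idx in kfold_indices(len(y), k=5):
        pred = y[train_idx].mean()                     # "train" the baseline
        scores.append(np.mean((y[val_idx] - pred) ** 2))
    cv_mse = float(np.mean(scores))                    # averaged over folds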

cumulative distribution function (CDF)

Probability analysis tool: F(x) = P(X ≤ x). Used for:

  • Statistical testing
  • Data distribution analysis
  • Quantile calculations
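
A minimal empirical-CDF sketch in numpy (standard-normal samples for illustration):

    import numpy as np

    samples = np.random.default_rng(0).normal(size=1000)

    def empirical_cdf(x, data):
        # F(x) = P(X <= x), estimated as the fraction of samples <= x.
        return np.mean(data <= x)

    empirical_cdf(0.0, samples)    # ≈ 0.5 for a standard normal
    np.quantile(samples, 0.95)     # quantiles invert the CDF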