Written by Sebastian F. Genter
H
hallucination
When generative models produce plausible-sounding but factually incorrect or nonsensical outputs. Common in large language models, such as inventing fake historical events or citing non-existent sources. For example: A model claiming "The Treaty of Verona (1822) established underwater railways" when no such treaty exists.
hashing
Technique for converting categorical features into fixed-size numerical representations using hash functions. Enables efficient handling of high-cardinality features (e.g., user IDs) by mapping them to a fixed number of buckets. Trade-off: hash collisions may map different values to the same bucket.
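A minimal sketch of feature hashing using Python's standard library; the bucket count and the MD5 digest are illustrative choices, not a prescribed scheme:

```python
import hashlib

def hash_feature(value: str, num_buckets: int = 16) -> int:
    """Map a categorical value to one of num_buckets buckets."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

# The same input always lands in the same bucket;
# distinct inputs may collide in one bucket.
bucket = hash_feature("user_12345")
```

Production systems typically use a library implementation such as scikit-learn's FeatureHasher rather than hand-rolled hashing.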
heuristic
Rule-based approach providing practical solutions without guaranteed optimality. Common ML applications include:
- Initial feature selection
- Setting baseline performance
- Designing simple decision rules before model deployment
Example: Using "if contains 'free offer' then spam" as email filter heuristic.
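The "free offer" filter example above can be sketched as a few lines of Python; the phrase list is a hypothetical illustration, not a real spam vocabulary:

```python
def is_spam_heuristic(subject: str) -> bool:
    """Rule-based spam check: no training, no optimality guarantee."""
    spam_phrases = ["free offer", "act now", "winner"]
    subject_lower = subject.lower()
    return any(phrase in subject_lower for phrase in spam_phrases)
```

A heuristic like this often serves as the baseline a learned model must beat.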
hidden layer
Intermediate processing stages in neural networks between input and output. Each hidden layer applies:
- Weighted sum of inputs
- Nonlinear activation (ReLU, sigmoid)
- Feature transformation
Deep networks stack multiple hidden layers to learn hierarchical representations.
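The forward pass through one hidden layer can be sketched in NumPy; the layer sizes and random weights here are arbitrary placeholders:

```python
import numpy as np

def hidden_layer(x, W, b):
    """One hidden layer: weighted sum of inputs, then ReLU activation."""
    z = x @ W + b            # weighted sum of inputs
    return np.maximum(0, z)  # ReLU nonlinearity

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))  # one input example with 4 features
W = rng.normal(size=(4, 3))  # transform 4 inputs into 3 hidden units
b = np.zeros(3)
h = hidden_layer(x, W, b)    # shape (1, 3); ReLU output is never negative
```

Stacking several such calls, each feeding the next, gives the hierarchical representations mentioned above.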
hierarchical clustering
Cluster analysis method creating nested groupings through either:
- Agglomerative (bottom-up): Merge similar clusters
- Divisive (top-down): Split dissimilar clusters
Produces dendrograms showing relationships at different scales. Unlike k-means, doesn't require pre-specifying cluster count.
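A toy agglomerative (bottom-up) sketch with single linkage on 1-D points, assuming brute-force pairwise distances for clarity; real code would use a library such as scipy.cluster.hierarchy:

```python
def agglomerative(points, num_clusters):
    """Repeatedly merge the two closest clusters (single linkage)."""
    clusters = [[p] for p in points]  # start: each point is its own cluster
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single-linkage distance: closest pair across clusters.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

groups = agglomerative([1.0, 1.1, 5.0, 5.2, 9.0], num_clusters=3)
```

Recording each merge and its distance yields the dendrogram; cutting it at different heights gives clusterings at different scales without fixing the cluster count in advance.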
hill climbing
Local optimization strategy iteratively adjusting parameters to improve performance. Used for:
- Hyperparameter tuning
- Feature selection
- Architecture search
Limitation: May get stuck in local optima rather than finding global best solution.
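A minimal hill-climbing sketch over one parameter; the objective and step size are illustrative, and on a multi-peaked objective this same loop would stop at whichever local optimum is nearest the start:

```python
def hill_climb(f, x, step=0.1, max_iters=1000):
    """Greedy local search: move to a neighbor only if it improves f."""
    for _ in range(max_iters):
        neighbors = [x - step, x + step]
        best = max(neighbors, key=f)
        if f(best) <= f(x):  # no neighbor improves: stop at a local optimum
            break
        x = best
    return x

# Maximize f(x) = -(x - 3)^2; the single peak is at x = 3.
peak = hill_climb(lambda x: -(x - 3) ** 2, x=0.0)
```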
hinge loss
Loss function for maximum-margin classification, defined as max(0, 1 − y·f(x)) for label y ∈ {−1, +1} and raw score f(x). Key component in Support Vector Machines (SVMs), penalizing predictions that are:
- Correct but low-confidence (margin below 1)
- Incorrect
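A short NumPy sketch of the hinge loss max(0, 1 − y·f(x)); the labels and scores below are made-up examples:

```python
import numpy as np

def hinge_loss(y_true, scores):
    """Hinge loss for labels y in {-1, +1} and raw margin scores f(x)."""
    return np.maximum(0.0, 1.0 - y_true * scores)

y = np.array([1, 1, -1])
f = np.array([2.0, 0.4, -0.5])  # raw margin scores, not probabilities
losses = hinge_loss(y, f)       # [0.0, 0.6, 0.5]
```

Note the first prediction is confidently correct (margin 2 ≥ 1) and incurs zero loss, while the second is correct but low-confidence and is still penalized.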
historical bias
Systemic distortions inherited from training data reflecting past inequities. Manifestations include:
- Gender stereotypes in hiring models
- Racial disparities in loan approvals
- Age discrimination in marketing algorithms
Requires careful data auditing and debiasing techniques.
holdout data
Data subset excluded from training for final model evaluation. Typically split as:
- 60-80% training
- 10-20% validation
- 10-20% testing
Prevents information leakage and gives an unbiased performance estimate.
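The three-way split above can be sketched with the standard library; the 70/15/15 fractions and the seed are illustrative choices within the ranges listed:

```python
import random

def split_indices(n, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle example indices and split into train/validation/test sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # shuffle before splitting
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train, val, test = split_indices(100)  # 70 / 15 / 15 examples
```

Keeping the test indices untouched until the final evaluation is what prevents leakage.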
host
In distributed systems, the central processor coordinating:
- Data loading
- Device synchronization
- Checkpointing
while accelerators (GPUs/TPUs) perform the parallel computations. The host acts as the orchestration layer in training pipelines.
human evaluation
Manual assessment of model outputs using criteria like:
- Fluency (for text)
- Realism (for images)
- Relevance (for recommendations)
Essential for tasks without clear quantitative metrics, like creative writing generation.
human in the loop (HITL)
Hybrid systems combining AI automation with human oversight. Common implementations:
- Validation: Humans verify critical decisions
- Active learning: Humans label uncertain cases
- Error correction: Humans fix model mistakes
Used in medical diagnosis and legal document review.
hyperparameter
Configurable settings controlling model behavior, distinct from learned parameters. Includes:
- Learning rate
- Network depth
- Regularization strength
Optimized through grid search, random search, or Bayesian methods.
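A grid-search sketch over two of the hyperparameters listed above; the validation_error function is a hypothetical stand-in for training a model and measuring its validation loss:

```python
from itertools import product

def validation_error(learning_rate, depth):
    """Toy error surface; in practice this would train and evaluate a model."""
    return (learning_rate - 0.01) ** 2 + (depth - 3) ** 2

grid = {"learning_rate": [0.001, 0.01, 0.1], "depth": [2, 3, 4]}

# Grid search: evaluate every combination, keep the best.
best = min(product(grid["learning_rate"], grid["depth"]),
           key=lambda cfg: validation_error(*cfg))
# On this toy surface the minimum sits at (0.01, 3).
```

Random search and Bayesian methods replace the exhaustive product with sampled or model-guided configurations, which scales better as the grid grows.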
hyperplane
Decision boundary in n-dimensional space separating data classes: a line in 2D, a plane in 3D. Support Vector Machines maximize the margin around the hyperplane to improve generalization. Defined by the equation w·x + b = 0, where w is the weight vector and b the bias.
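A small sketch of classifying points against the hyperplane w·x + b = 0; the weight vector and bias here are arbitrary example values, not learned ones:

```python
import numpy as np

# Hyperplane w.x + b = 0; the sign of w.x + b gives the predicted class.
w = np.array([1.0, -2.0])
b = 0.5

def classify(x):
    """Points on the positive side get +1, the other side -1."""
    return 1 if np.dot(w, x) + b >= 0 else -1

label = classify(np.array([3.0, 1.0]))  # 3 - 2 + 0.5 = 1.5, so class +1
```

An SVM chooses w and b so that this boundary sits as far as possible from the nearest training points of each class.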