Written by Sebastian F. Genter
N
NaN Trap
A situation in numerical computing where a single invalid operation (e.g., division by zero) produces a NaN (Not a Number) value that propagates through subsequent calculations. This can halt model training by corrupting gradients or parameters. Common in poorly initialized models or datasets with invalid entries.
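A minimal sketch of the trap in NumPy: one invalid division yields NaN, and every downstream reduction inherits it. The `nan_to_num` guard at the end is one common mitigation, not the only one.

```python
import numpy as np

# A single invalid operation (0/0) produces NaN...
with np.errstate(divide="ignore", invalid="ignore"):
    grad = np.array([0.5, 0.0, 1.2]) / np.array([2.0, 0.0, 3.0])

# ...and NaN propagates through every subsequent calculation.
loss = grad.sum()
print(np.isnan(loss))  # True: one bad element poisoned the whole sum

# Guard: detect and reset NaNs before they corrupt parameters.
grad = np.nan_to_num(grad, nan=0.0)
print(np.isnan(grad).any())  # False
```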
Natural Language Processing (NLP)
A field combining linguistics and AI to enable machines to process, analyze, and generate human language. Applications include machine translation, sentiment analysis, and chatbots. Modern NLP relies heavily on transformer architectures like BERT and GPT.
Natural Language Understanding (NLU)
A subset of NLP focused on extracting meaning and intent from text/speech. Goes beyond syntax to interpret context, sarcasm, and ambiguity. Powers virtual assistants (e.g., parsing "Set a reminder for my meeting tomorrow at 3 PM").
Negative Class
In binary classification, the label representing the absence of a target condition. For example, in medical testing, "non-cancerous" is the negative class. Contrasts with the positive class (the condition being detected).
Negative Sampling
A training optimization for recommendation systems or word embeddings where only a subset of negative examples is used. Reduces computational cost by avoiding calculations over all possible negatives (e.g., used in Word2Vec).
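A toy sketch of the idea, with a hypothetical `sample_negatives` helper: instead of scoring all 10,000 vocabulary items, the loss sees only the one positive plus k sampled negatives. (Word2Vec actually draws negatives from a smoothed unigram distribution, not uniformly.)

```python
import random

def sample_negatives(vocab_size, positive_idx, k, rng=random.Random(0)):
    """Draw k negative indices uniformly, skipping the positive example."""
    negatives = []
    while len(negatives) < k:
        idx = rng.randrange(vocab_size)
        if idx != positive_idx:
            negatives.append(idx)
    return negatives

negs = sample_negatives(vocab_size=10_000, positive_idx=42, k=5)
print(len(negs))  # 5 — the loss is computed over 1 positive + 5 negatives
```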
Neural Architecture Search (NAS)
Automated methods for designing optimal neural network architectures. Uses reinforcement learning or evolutionary algorithms to explore configurations, balancing performance and efficiency. Example: Google’s NASNet for image classification.
Neural Network
A computational model inspired by biological brains, composed of interconnected layers of artificial neurons. Processes data through weighted connections and activation functions. Types include CNNs (for images) and RNNs (for sequences).
Neuron
The basic unit of a neural network. Receives weighted inputs, applies an activation function (e.g., ReLU), and passes the output to the next layer. Mathematically: a = f(Σᵢ wᵢxᵢ + b), where f is the activation function, wᵢ the weights, xᵢ the inputs, and b the bias.
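The weighted-sum-then-activation computation can be sketched in a few lines of NumPy, here with ReLU as the example activation:

```python
import numpy as np

def neuron(x, w, b):
    """One artificial neuron: weighted sum of inputs plus bias, then ReLU."""
    z = np.dot(w, x) + b   # weighted sum: w1*x1 + w2*x2 + ... + b
    return max(0.0, z)     # ReLU activation: f(z) = max(0, z)

out = neuron(x=np.array([1.0, 2.0]), w=np.array([0.5, -0.25]), b=0.1)
# z = 0.5*1 + (-0.25)*2 + 0.1 = 0.1; ReLU passes positive values through
print(out)  # 0.1
```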
N-gram
A contiguous sequence of items (words, characters) from a text. Used in language modeling to predict the next item based on previous context. For example, a trigram (3-gram) model uses the prior two words to predict the third.
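Extracting n-grams is a sliding window over the token sequence; a minimal sketch:

```python
def ngrams(tokens, n):
    """All contiguous length-n windows over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

words = "the cat sat on the mat".split()
print(ngrams(words, 3)[0])   # ('the', 'cat', 'sat')
print(len(ngrams(words, 3)))  # 4 trigrams from 6 tokens
```

A trigram language model would count how often each window occurs, then predict the third word from the two preceding ones.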
Node (Decision Tree)
A component in a decision tree representing either a decision rule (condition) or a terminal prediction (leaf). Splits data based on feature values (e.g., "Income > $50k") to partition examples into subgroups.
Node (Neural Network)
Synonym for an artificial neuron. Transforms inputs via weights and activations. In graph-based representations, nodes represent computational units, and edges represent data flow.
Node (TensorFlow Graph)
In TensorFlow's computation graph, a node represents an operation (e.g., addition, matrix multiplication). Nodes accept input tensors, perform computations, and output tensors to subsequent nodes.
Noise
Unwanted variability or errors in data. Sources include sensor errors, labeling mistakes, or irrelevant features. Noise can degrade model performance; techniques like data cleaning or regularization mitigate its impact.
Non-Binary Condition
A decision tree split with more than two possible outcomes. For example, a feature like "Education Level" might split into [High School, Bachelor’s, Master’s, PhD], each leading to different branches.
Nonlinear
Relationships where changes in input don’t produce proportional output changes. Neural networks model nonlinearity via activation functions (e.g., sigmoid). Essential for capturing complex patterns in data.
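The sigmoid activation mentioned above illustrates nonlinearity directly: doubling the input does not double the output.

```python
import math

def sigmoid(z):
    """Squashes any real input into (0, 1) nonlinearly."""
    return 1.0 / (1.0 + math.exp(-z))

print(round(sigmoid(1.0), 3))  # 0.731
print(round(sigmoid(2.0), 3))  # 0.881 — not 2 x 0.731, so the map is nonlinear
```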
Non-Response Bias
A statistical bias occurring when survey participants differ systematically from non-participants. In ML, this skews training data if certain groups are underrepresented due to lack of response.
Nonstationarity
When data distributions change over time or context. For example, consumer preferences shifting seasonally. Models trained on nonstationary data may require frequent retraining to maintain accuracy.
No One Right Answer (NORA)
Tasks where multiple valid outputs exist, such as creative writing or open-ended Q&A. Evaluation metrics for NORA tasks (e.g., BLEURT) focus on semantic similarity rather than exact matches.
Normalization
Scaling features to a standard range (e.g., [0,1]) to improve training stability. Common methods include min-max scaling and z-score normalization. Critical for models sensitive to input scales (e.g., SVMs).
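Both methods named above fit in a few lines of NumPy; note that z-score normalization here uses the population standard deviation (NumPy's default):

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])

# Min-max scaling: map values into [0, 1].
minmax = (x - x.min()) / (x.max() - x.min())

# Z-score normalization: zero mean, unit standard deviation.
zscore = (x - x.mean()) / x.std()

print(minmax)                                      # [0. 0.333 0.667 1.]
print(round(zscore.mean(), 10), round(zscore.std(), 10))  # 0.0 1.0
```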
Novelty Detection
Identifying data points that deviate significantly from training data patterns. Used in fraud detection or quality control. Contrasts with outlier detection, which flags anomalies within known data.
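One very simple sketch of the idea, using a z-score distance from the training distribution (real systems typically use richer models such as one-class SVMs or autoencoders):

```python
import numpy as np

def is_novel(x, train_data, threshold=3.0):
    """Flag x as novel if it lies more than `threshold` standard
    deviations from the training mean (a simplistic z-score rule)."""
    mu, sigma = train_data.mean(), train_data.std()
    return abs(x - mu) / sigma > threshold

train = np.array([9.8, 10.1, 10.0, 9.9, 10.2])
print(is_novel(10.05, train))  # False — consistent with training data
print(is_novel(25.0, train))   # True  — far outside the learned pattern
```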
Numerical Data
Features represented as continuous or discrete numbers (e.g., age, temperature). Requires mathematical interpretation. Contrasts with categorical data (e.g., gender), which represents discrete groups.
NumPy
A foundational Python library for numerical computing. Provides array objects, linear algebra functions, and tools for integrating C/C++ code. Essential for data manipulation in pandas and ML frameworks.
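A brief taste of the array operations and linear algebra support described above: vectorized arithmetic and broadcasting replace explicit Python loops.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([10.0, 20.0])

print(a + b)           # broadcasting: b is added to each row of a
print(a @ a)           # matrix multiplication: [[7. 10.] [15. 22.]]
print(a.mean(axis=0))  # column means: [2. 3.]
```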