Written by Sebastian F. Genter
N
NaN Trap
A situation in numerical computing where a single invalid operation (e.g., division by zero) produces a NaN (Not a Number) value that propagates through subsequent calculations. This can halt model training by corrupting gradients or parameters. Common in poorly initialized models or datasets with invalid entries.
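A minimal sketch of the trap in NumPy: one invalid division yields NaN, and every downstream reduction inherits it. The `nan_to_num` guard at the end is one common mitigation, not the only one.

```python
import numpy as np

# A single invalid operation (0/0) produces NaN...
with np.errstate(divide="ignore", invalid="ignore"):
    grad = np.array([0.5, 0.0, 1.2]) / np.array([2.0, 0.0, 3.0])

# ...and NaN propagates through every subsequent calculation.
loss = grad.sum()
print(np.isnan(loss))  # True: one bad element poisoned the whole sum

# Guard: detect and reset NaNs before they corrupt parameters.
grad = np.nan_to_num(grad, nan=0.0)
print(np.isnan(grad).any())  # False
```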
Natural Language Processing (NLP)
A field combining linguistics and AI to enable machines to process, analyze, and generate human language. Applications include machine translation, sentiment analysis, and chatbots. Modern NLP relies heavily on transformer architectures like BERT and GPT.
Natural Language Understanding (NLU)
A subset of NLP focused on extracting meaning and intent from text/speech. Goes beyond syntax to interpret context, sarcasm, and ambiguity. Powers virtual assistants (e.g., parsing "Set a reminder for my meeting tomorrow at 3 PM").
Negative Class
In binary classification, the label representing the absence of a target condition. For example, in medical testing, "non-cancerous" is the negative class. Contrasts with the positive class (the condition being detected).
Negative Sampling
A training optimization for recommendation systems or word embeddings where only a subset of negative examples is used. Reduces computational cost by avoiding calculations over all possible negatives (e.g., used in Word2Vec).
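A toy sketch of the idea, with a hypothetical `sample_negatives` helper: instead of scoring all 10,000 vocabulary items, the loss sees only the one positive plus k sampled negatives. (Word2Vec actually draws negatives from a smoothed unigram distribution, not uniformly.)

```python
import random

def sample_negatives(vocab_size, positive_idx, k, rng=random.Random(0)):
    """Draw k negative indices uniformly, skipping the positive example."""
    negatives = []
    while len(negatives) < k:
        idx = rng.randrange(vocab_size)
        if idx != positive_idx:
            negatives.append(idx)
    return negatives

negs = sample_negatives(vocab_size=10_000, positive_idx=42, k=5)
print(len(negs))  # 5 — the loss is computed over 1 positive + 5 negatives
```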
Neural Architecture Search (NAS)
Automated methods for designing optimal neural network architectures. Uses reinforcement learning or evolutionary algorithms to explore configurations, balancing performance and efficiency. Example: Google’s NASNet for image classification.
Neural Network
A computational model inspired by biological brains, composed of interconnected layers of artificial neurons. Processes data through weighted connections and activation functions. Types include CNNs (for images) and RNNs (for sequences).
Neuron
The basic unit of a neural network. Receives weighted inputs, applies an activation function (e.g., ReLU), and passes the output to the next layer. Mathematically: a = f(Σᵢ wᵢxᵢ + b), where f is the activation function, wᵢ the weights, xᵢ the inputs, and b the bias.
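The weighted-sum-then-activation computation can be sketched in a few lines of NumPy, here with ReLU as the example activation:

```python
import numpy as np

def neuron(x, w, b):
    """One artificial neuron: weighted sum of inputs plus bias, then ReLU."""
    z = np.dot(w, x) + b   # weighted sum: w1*x1 + w2*x2 + ... + b
    return max(0.0, z)     # ReLU activation: f(z) = max(0, z)

out = neuron(x=np.array([1.0, 2.0]), w=np.array([0.5, -0.25]), b=0.1)
# z = 0.5*1 + (-0.25)*2 + 0.1 = 0.1; ReLU passes positive values through
print(out)  # 0.1
```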
N-gram
A contiguous sequence of items (words, characters) from a text. Used in language modeling to predict the next item based on previous context. For example, a trigram (3-gram) model uses the prior two words to predict the third.
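Extracting n-grams is a sliding window over the token sequence; a minimal sketch:

```python
def ngrams(tokens, n):
    """All contiguous length-n windows over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

words = "the cat sat on the mat".split()
print(ngrams(words, 3)[0])   # ('the', 'cat', 'sat')
print(len(ngrams(words, 3)))  # 4 trigrams from 6 tokens
```

A trigram language model would count how often each window occurs, then predict the third word from the two preceding ones.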
Node (Decision Tree)
A component in a decision tree representing either a decision rule (condition) or a terminal prediction (leaf). Splits data based on feature values (e.g., "Income > $50k") to partition examples into subgroups.
Node (Neural Network)
Synonym for an artificial neuron. Transforms inputs via weights and activations. In graph-based representations, nodes represent computational units, and edges represent data flow.
Node (TensorFlow Graph)
In TensorFlow's computation graph, a node represents an operation (e.g., addition, matrix multiplication). Nodes accept input tensors, perform computations, and output tensors to subsequent nodes.
Noise
Unwanted variability or errors in data. Sources include sensor errors, labeling mistakes, or irrelevant features. Noise can degrade model performance; techniques like data cleaning or regularization mitigate its impact.
Non-Binary Condition
A decision tree split with more than two possible outcomes. For example, a feature like "Education Level" might split into [High School, Bachelor’s, Master’s, PhD], each leading to different branches.
Nonlinear
Relationships where changes in input don’t produce proportional output changes. Neural networks model nonlinearity via activation functions (e.g., sigmoid). Essential for capturing complex patterns in data.
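The sigmoid activation mentioned above illustrates nonlinearity directly: doubling the input does not double the output.

```python
import math

def sigmoid(z):
    """Squashes any real input into (0, 1) nonlinearly."""
    return 1.0 / (1.0 + math.exp(-z))

print(round(sigmoid(1.0), 3))  # 0.731
print(round(sigmoid(2.0), 3))  # 0.881 — not 2 x 0.731, so the map is nonlinear
```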
Non-Response Bias
A statistical bias occurring when survey participants differ systematically from non-participants. In ML, this skews training data if certain groups are underrepresented due to lack of response.
Nonstationarity
When data distributions change over time or context. For example, consumer preferences shifting seasonally. Models trained on nonstationary data may require frequent retraining to maintain accuracy.
No One Right Answer (NORA)
Tasks where multiple valid outputs exist, such as creative writing or open-ended Q&A. Evaluation metrics for NORA tasks (e.g., BLEURT) focus on semantic similarity rather than exact matches.
Normalization
Scaling features to a standard range (e.g., [0,1]) to improve training stability. Common methods include min-max scaling and z-score normalization. Critical for models sensitive to input scales (e.g., SVMs).
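Both methods named above fit in a few lines of NumPy; note that z-score normalization here uses the population standard deviation (NumPy's default):

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])

# Min-max scaling: map values into [0, 1].
minmax = (x - x.min()) / (x.max() - x.min())

# Z-score normalization: zero mean, unit standard deviation.
zscore = (x - x.mean()) / x.std()

print(minmax)                                      # [0. 0.333 0.667 1.]
print(round(zscore.mean(), 10), round(zscore.std(), 10))  # 0.0 1.0
```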
Novelty Detection
Identifying data points that deviate significantly from training data patterns. Used in fraud detection or quality control. Contrasts with outlier detection, which flags anomalies within known data.
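One very simple sketch of the idea, using a z-score distance from the training distribution (real systems typically use richer models such as one-class SVMs or autoencoders):

```python
import numpy as np

def is_novel(x, train_data, threshold=3.0):
    """Flag x as novel if it lies more than `threshold` standard
    deviations from the training mean (a simplistic z-score rule)."""
    mu, sigma = train_data.mean(), train_data.std()
    return abs(x - mu) / sigma > threshold

train = np.array([9.8, 10.1, 10.0, 9.9, 10.2])
print(is_novel(10.05, train))  # False — consistent with training data
print(is_novel(25.0, train))   # True  — far outside the learned pattern
```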
Numerical Data
Features represented as continuous or discrete numbers (e.g., age, temperature). Requires mathematical interpretation. Contrasts with categorical data (e.g., gender), which represents discrete groups.
NumPy
A foundational Python library for numerical computing. Provides array objects, linear algebra functions, and tools for integrating C/C++ code. Essential for data manipulation in pandas and ML frameworks.
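A brief taste of the array operations and linear algebra support described above: vectorized arithmetic and broadcasting replace explicit Python loops.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([10.0, 20.0])

print(a + b)           # broadcasting: b is added to each row of a
print(a @ a)           # matrix multiplication: [[7. 10.] [15. 22.]]
print(a.mean(axis=0))  # column means: [2. 3.]
```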