Normalization Options

The Labeler currently supports several built-in normalization algorithms that can be toggled on/off (and hand-tuned) from the GUI. Normalizations are accessed from the "Graph Controls" dropdown menu under "Normalizing," where any free parameters associated with a method can be set after selecting it.


Standard Score (Z-Score)

Definition

\[ z \;=\; \frac{x - \mu}{\sigma} \]

Transforms each value to units of standard deviation so the feature has mean 0 and std 1.
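
A minimal NumPy sketch of the transform (the data values are illustrative, and the built-in implementation additionally guards against zero variance):

=== "Python"

    import numpy as np

    x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # illustrative data: mean 5, std 2
    z = (x - x.mean()) / x.std()                             # -> [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
    print(z.mean(), z.std())                                 # ~0.0 and ~1.0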

Pros

  1. Centers & scales — features become comparable; speeds up many optimizers.
  2. Keeps distribution shape — important when data are roughly normal.
  3. Model friendly — works well for distance- and gradient-based algorithms.
  4. More robust than min-max — extreme values stretch the scale less dramatically.

Cons

  1. Outlier sensitive — μ and σ can be skewed by extremes.
  2. Unbounded output — not ideal for activations that expect a fixed range.
  3. Needs near-normality — heavy skew can still hurt performance.
  4. Streaming overhead — μ and σ must be updated as data arrive.

Why choose Z-Score

  • Versatile default for linear models, SVMs, KNN, neural nets.
  • Balanced gradients often shorten training time.
  • Z-values are easy to interpret in anomaly detection.

References

  • scikit-learn documentation — StandardScaler. [Link]

Min-Max Scaling

Definition

\[ x' = \frac{x - \min(x)}{\max(x) - \min(x)} \]

Linearly rescales data to a fixed interval (usually [0, 1]).
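
A minimal NumPy sketch, assuming the target interval is [0, 1] (values are illustrative; a real implementation should also guard against a zero range, as done here):

=== "Python"

    import numpy as np

    x = np.array([10.0, 20.0, 30.0, 50.0])   # illustrative data
    rng = x.max() - x.min()
    x_scaled = (x - x.min()) / rng if rng > 0 else np.zeros_like(x)
    # -> [0.0, 0.25, 0.5, 1.0]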

Pros

  1. Fixed bounds — perfect for sigmoid / tanh activations.
  2. Preserves ordering — distribution shape unchanged (linear stretch only).
  3. Stable gradients — accelerates convergence in deep nets.
  4. Easy to read — values fall in an intuitive, bounded range.
  5. Equal feature influence — when no extreme outliers are present.

Cons

  1. Outlier driven — extremes compress the rest of the data.
  2. No centering — mean may be far from 0.
  3. Range drift — min/max must be tracked in streaming scenarios.
  4. Small-range precision loss — tiny intervals may hide variation.
  5. Linear only — can’t fix nonlinear scale issues.

Why choose Min-Max

  • Go-to scaler for neural networks and distance metrics when data lack outliers.
  • Keeps original distribution intact while bounding values.
  • Low computational cost for large matrices.

References

  • I. Goodfellow, Y. Bengio & A. Courville, Deep Learning, § 6.3 “Feature Scaling.” [Book]
  • scikit-learn documentation — MinMaxScaler. [Link]

Scalar Normalization

Definition

\[ x' \;=\; \lambda\,x \]

Multiplies all values by the same constant λ.
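
A minimal sketch; the constant \(\lambda\) and the unit conversion are purely illustrative:

=== "Python"

    import numpy as np

    lam = 1.0e-3                          # e.g. millivolts -> volts
    x = np.array([120.0, 450.0, 980.0])   # illustrative raw values in mV
    x_scaled = lam * x                    # -> [0.12, 0.45, 0.98] in V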

Pros

  1. Trivial to compute — one multiply per value.
  2. Preserves proportions — relative structure intact.
  3. Shape invariant — no change to distribution form.
  4. Unit alignment — quick fix when magnitudes differ but units match.
  5. Great for physics data — keeps physical meaning while rescaling.

Cons

  1. Doesn’t equalize spread — variances stay unequal.
  2. Outliers untouched — extremes remain extreme.
  3. No centering — mean shift not addressed.
  4. λ selection matters — poor choice ruins the benefit.
  5. Limited with mixed units — other scalers work better when spreads vary widely.

Why choose Scalar

  • Fast magnitude adjustment for uniformly scaled sensors or waveforms.
  • Useful when physical ratios must remain intact.
  • Computationally negligible for high-frequency streaming.

References

  • A. Géron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, § 2.4 “Feature Scaling.” [O’Reilly]
  • Feature scaling overview — Wikipedia. [Link]

Box-Cox Transformation (Power Normalization with Shift)

Definition

Applies a power-law transform after shifting the data to ensure positivity:

\[ x' = \begin{cases} \dfrac{(x + \delta)^{\lambda}-1}{\lambda}, & \lambda \neq 0,\\[6pt] \ln(x + \delta), & \lambda = 0. \end{cases} \]

Here, \(\delta > -\min(x)\) ensures that all shifted values \(x+\delta\) are strictly positive.
\(\lambda\) is a continuous shape parameter, often estimated via maximum likelihood to bring the distribution close to Gaussian.
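
A minimal sketch using SciPy (assuming SciPy is installed); scipy.stats.boxcox requires strictly positive input, so the shift \(\delta\) is applied by hand here and its value is illustrative:

=== "Python"

    import numpy as np
    from scipy import stats

    x = np.array([-0.5, 0.0, 1.2, 3.4, 8.1])      # illustrative data with non-positive values
    delta = 1.0 - x.min()                         # any delta > -min(x) works; this choice is ad hoc
    x_bc, lam = stats.boxcox(x + delta)           # lambda estimated by maximum likelihood
    x_fixed = stats.boxcox(x + delta, lmbda=0.5)  # or supply lambda explicitly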

Pros

  1. Variance stabilization — reduces heteroscedastic noise.
  2. Skew reduction — shifts many heavy-tailed sets toward normality.
  3. Shift extension — works on zero or negative data by offsetting with \(\delta\).
  4. Single tunable knob — \(\lambda\) sweeps from log to square-root to identity.
  5. Invertible — transformed data can be mapped back to original units.
  6. Model accuracy boost — improves fit for models assuming normality (e.g., regression, ARIMA).

Cons

  1. Choice of δ matters — different offsets can change results, especially with outliers.
  2. λ search cost — optimization adds computation.
  3. Outlier leverage — extremes can bias both \(\lambda\) and \(\delta\).
  4. Non-linear ordering — local ranks may change, affecting distance metrics.

Why choose Box-Cox

  • One-step fix for skewed, heteroscedastic, or shifted features.
  • Extends applicability to datasets with zeros or negatives via \(\delta\).
  • Built into SciPy (scipy.stats.boxcox) and scikit-learn (power_transform).
  • Standard in forecasting, statistical process control, and fusion diagnostics.

References

  • G. E. P. Box & D. R. Cox, “An Analysis of Transformations,” Journal of the Royal Statistical Society B 26 (2), 1964. [JSTOR]
  • scipy.stats.boxcox documentation. [Link]

Writing Your Own Custom Normalizations from Scratch

Custom normalization algorithms can be added through the custom normalization API. Here we provide an example of how to write a simple Robust Scaling algorithm and incorporate it into the dFL GUI via the data_provider ingestion script and an accompanying utilities script.

Ingestion scripts in dFL are called Data Providers. To include a custom normalization algorithm in dFL, one must first define a custom normalization Python dictionary, located under the 'Configuration Dictionaries' section of the Data Provider script. This dictionary defines the parameters needed for the custom normalizer, any delimiters required, and the display names of those parameters. An example for Robust Scaling is given by:

=== "Python"

    # --- Configuration Dictionaries ---
    custom_normalizing_options = {
        "robust_scaling": {"display_name": "Robust Scaling", "parameters": None, "function": robust_scaling}
    }

In this case, the configuration dictionary references a Python function (robust_scaling) that is imported from the utilities script; this import needs to be added at the top of the Data Provider script.

=== "Python"

    from ga_offline_utilities import robust_scaling

The Python function itself is written in the utilities script and, in this example, depends only on NumPy. Here we define the function and any parameters it needs.

=== "Python"

import numpy as np

def robust_scaling(shot_name, signal_name, raw_signal, times, parameters):
    """
    Apply robust scaling to a 1D array with comprehensive edge case handling.

    Robust scaling uses median and interquartile range (IQR) instead of mean
    and standard deviation, making it less sensitive to outliers. This is
    particularly important for data that contains transient events and noise spikes.

    Formula: (x - median) / IQR

    Args:
        shot_name (str): The name of the shot being analyzed (unused)
        signal_name (str): The name of the signal (unused)
        raw_signal (array-like): Input data to scale
        times (array-like): Time array of the shot (unused)
        parameters (dict): The dictionary of available parameters (unused)

    Returns:
        numpy.array: Robustly scaled data

    Edge Case Handling:
        - Empty input: returns empty array
        - All identical values: returns zeros
        - Zero IQR with non-constant data: falls back to standard deviation
        - Single element: returns zero
        - Extreme values: clipped to ±1e6 to prevent infinity
    """
    data = np.asarray(raw_signal)

    # Handle empty input
    if data.size == 0:
        return np.array([])

    # Calculate robust statistics
    median = np.median(data)
    q1, q3 = np.percentile(data, [25, 75])
    iqr = q3 - q1

    # Handle edge cases
    if iqr == 0:
        # Case 1: All values identical
        if np.all(data == data[0]):
            return np.zeros_like(data)
        # Case 2: Use std dev as fallback for non-constant data with zero IQR
        iqr = np.std(data)
        # Final fallback if std dev is also zero
        if iqr == 0:
            iqr = 1.0

    # Handle single-element edge case
    if data.size == 1:
        return np.array([0.0])  # Single value becomes zero after scaling

    # Apply scaling with numerical stability
    scaled_data = (data - median) / iqr

    # Clip extreme values to prevent +/- infinity (optional)
    scaled_data = np.clip(scaled_data, -1e6, 1e6)

    return scaled_data
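
For a quick sanity check outside the GUI, the function can be called directly; the shot and signal names below are placeholders (they are unused by the function), and the signal values are illustrative:

=== "Python"

    import numpy as np
    from ga_offline_utilities import robust_scaling

    signal = np.array([1.0, 2.0, 3.0, 4.0, 100.0])   # one obvious spike
    scaled = robust_scaling("shot_0", "sig_0", signal, times=None, parameters=None)
    # median = 3.0, IQR = 4.0 - 2.0 = 2.0  ->  [-1.0, -0.5, 0.0, 0.5, 48.5]
    print(scaled)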

The result is a Robust Scaling normalizer fully integrated into dFL's GUI that works on any multimodal dataset.

Robust Scaling (Median–IQR)

Definition

Centers by median and scales by inter-quartile range:

\[ x' = \frac{x - \text{median}(x)}{Q_3 - Q_1} \]

Pros

  1. Outlier immune — median and IQR ignore extremes.
  2. Distribution agnostic — works for skewed or heavy-tailed data.
  3. Centers & scales — comparable features like Z-score without outlier drift.
  4. Online friendly — quantile sketches enable streaming updates.
  5. Stable gradients — protects optimizers from spike-induced blow-ups.

Cons

  1. Unbounded output — may still need clipping for certain activations.
  2. Slightly less efficient on pure Gaussians — Z-score has lower variance when normality is perfect.
  3. Quantile cost — accurate Q1/Q3 can be heavy on huge datasets.
  4. Less intuitive units — values are in “IQR units,” not σ.

Why choose Robust

  • First line of defense when you expect dirty data, sensor spikes, or poorly characterized distributions.
  • Drop-in via scikit-learn’s RobustScaler (see the sketch after this list); no manual tuning required.
  • Keeps model training stable across finance, IoT, biomedical, and more.
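
A minimal scikit-learn sketch (assuming scikit-learn is installed; the data are illustrative and must be shaped as a 2-D samples-by-features array):

=== "Python"

    import numpy as np
    from sklearn.preprocessing import RobustScaler

    X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])   # one feature with a spike
    X_scaled = RobustScaler().fit_transform(X)            # centers by median, scales by IQR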

References

  • P. J. Huber & E. M. Ronchetti, Robust Statistics, 2nd ed. [Wiley]
  • scikit-learn documentation — RobustScaler. [Link]

Implementing Serial Normalizations

In practice, hybrid workflows often combine methods, e.g., robust scaling followed by Min-Max, or Box-Cox followed by \(Z\)-score, tailored to the statistical and physical structure of the dataset and the requirements of the modeling target. Such hybrid serial normalizations can be implemented in dFL in the same way as a new normalization written from scratch.

To include a serial normalization in dFL, first define a custom normalization Python dictionary under the 'Configuration Dictionaries' section of the Data Provider script. As before, this dictionary defines the parameters needed for the custom normalizer, any delimiters required, and the display names of those parameters. Here we show an example that implements Box-Cox followed by a \(Z\)-score normalization.

=== "Python"

    from ga_offline_utilities import box_cox_and_zscore_normalize

    # --- Configuration Dictionaries ---
    custom_normalizing_options = {
        "robust_scaling": {"display_name": "Robust Scaling", "parameters": None, "function": robust_scaling},
        "box_cox_and_zscore_normalize": {"display_name": "Box-Cox and Z-score Normalize", "parameters": {
            "lambda_value": {"default": 2.0, "min": -5.0, "max": 5.0, "display_name": "Lambda Value"},
            "shift_value": {"default": 3.0, "min": 0.0, "max": None, "display_name": "Shift Value"}
        }, "function": box_cox_and_zscore_normalize}
    }

Then, in the corresponding ga_offline_utilities module, the normalization algorithms are defined and called serially.

=== "Python"

import numpy as np

def zscore_normalize(data):
    mean = np.mean(data)
    std = np.std(data)

    # Handle zero standard deviation case
    if std == 0:
        # If std is zero, all data points are the same (equal to the mean),
        # so their Z-score is 0.
        return np.array([0.0] * len(data))  # Return a NUMPY ARRAY of zeros

    # Original calculation if std is non-zero
    zscore_normalized_data = [(x - mean) / std for x in data]
    return np.array(zscore_normalized_data)  # Return a NUMPY ARRAY

def box_cox_normalize(data, lambda_value, shift_value):
    """
    Apply Box-Cox transformation with robust numerical handling.

    Parameters:
    data (array-like): Input data to transform
    lambda_value (float): Transformation parameter (-5 ≤ λ ≤ 5)
    shift_value (float): Shift parameter (≥0) to ensure positive values

    Returns:
    np.ndarray: Transformed data
    """
    data = np.asarray(data, dtype=np.float64)
    shifted_data = data + shift_value

    # Handle non-positive values after shifting
    min_positive = np.finfo(shifted_data.dtype).tiny  # ~2.2e-308 for float64
    shifted_data = np.clip(shifted_data, min_positive, None)

    # Numerical stability for near-zero lambda
    lambda_threshold = 1e-7
    use_log = abs(lambda_value) < lambda_threshold

    with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
        if use_log:
            transformed = np.log(shifted_data)
        else:
            # Use exp/log to avoid direct power computation
            log_shifted = np.log(shifted_data)
            transformed = np.expm1(lambda_value * log_shifted) / lambda_value

    # Handle extreme values and numerical artifacts
    transformed = np.clip(transformed, -1e12, 1e12)

    # Replace invalid values with nearest finite values
    transformed = np.nan_to_num(transformed, nan=0.0, posinf=1e12, neginf=-1e12)

    return transformed

def box_cox_and_zscore_normalize(shot_name, signal_name, raw_signal, times, parameters):
    """
    Apply Box-Cox transformation and Z-score normalization to a 1D array.

    This function applies the Box-Cox transformation and then normalizes the data using Z-score normalization.

    """
    data = np.array(raw_signal)
    lambda_value = parameters.get("box_cox_and_zscore_normalize_lambda_value", 0.0)
    shift_value = parameters.get("box_cox_and_zscore_normalize_shift_value", 0.0)

    # Apply Box-Cox transformation
    box_cox_normalized_data = box_cox_normalize(data, lambda_value, shift_value)

    # Apply Z-score normalization
    zscore_normalized_data = zscore_normalize(box_cox_normalized_data)

    return zscore_normalized_data
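
As a minimal standalone check of the chained normalizer (the shot and signal names and the signal values are illustrative; in dFL the parameters dictionary is populated by the GUI using the prefixed keys read in the function above):

=== "Python"

    import numpy as np
    from ga_offline_utilities import box_cox_and_zscore_normalize

    signal = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])      # illustrative, exponentially growing signal
    params = {
        "box_cox_and_zscore_normalize_lambda_value": 0.0,   # lambda = 0 -> log transform
        "box_cox_and_zscore_normalize_shift_value": 1.0,
    }
    out = box_cox_and_zscore_normalize("shot_0", "sig_0", signal, times=None, parameters=params)
    print(out.mean(), out.std())                            # ~0.0 and ~1.0 after the final Z-score step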