Smoothing Options¶

The Labeler currently supports several built-in smoothing algorithms for 1-D signals. Each can be turned on or off, and tuned, from the GUI under "Graph Controls". Smoothing is generally a non-invertible transformation: the original signal cannot be recovered from its smoothed version.

Exponential Moving Average (EMA)¶

Definition
Choose a span \(N\). The decay factor is

\[ \alpha=\frac{2}{N+1}, \qquad s_t=\alpha x_t+(1-\alpha)s_{t-1}. \]

Older samples never disappear but their influence decays exponentially.
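The recursion above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Labeler's own implementation: the function name `ema` is hypothetical, and seeding the filter with the first sample (\(s_0 = x_0\)) is an assumption, since the initial condition is not specified here.

```python
import numpy as np

def ema(x, span):
    """Exponential moving average with alpha = 2 / (span + 1)."""
    x = np.asarray(x, dtype=float)
    if span < 1:
        raise ValueError("span must be >= 1")
    alpha = 2.0 / (span + 1.0)
    s = np.empty_like(x)
    if x.size == 0:
        return s
    s[0] = x[0]  # assumption: seed the recursion with the first sample
    for t in range(1, x.size):
        s[t] = alpha * x[t] + (1.0 - alpha) * s[t - 1]
    return s

# A unit step: with span=3 (alpha=0.5) the output closes half the gap each sample.
step = np.concatenate([np.zeros(3), np.ones(5)])
smoothed = ema(step, span=3)
```

Only the previous output `s[t - 1]` is needed at each step, which is where the constant memory cost comes from.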

| Name | Purpose | Effect |
| --- | --- | --- |
| `span N` | effective window length | ↑ N → smoother, ↓ N → more responsive |

Pros¶

O(1) memory & CPU — ideal for streams
Recency bias — tracks fresh trends quickly
Single knob — span sets smoothness

Cons¶

Needs tuning — wrong span chases noise or lags events
Short memory — distant past has tiny weight
Fixed decay — unaware of seasonality

Why choose it¶

Balanced smooth-vs-lag trade-off for dashboards and real-time sensors.

References¶

R. Hyndman & G. Athanasopoulos, Forecasting: Principles and Practice, § 8.1 “Exponential Smoothing.” [Open Text]
R. G. Brown, Smoothing, Forecasting and Prediction of Discrete Time Series (1963).

Savitzky–Golay Filter¶

Definition
Fit a polynomial of order \(p\) to each sliding window of length \(m\) and take the fitted value at the center.

| Name | Purpose | Typical range |
| --- | --- | --- |
| `window m` | odd number of points | 5 – 51 |
| `poly p` | polynomial degree | 2 – 5 (< m) |
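To make the definition concrete, here is a hedged NumPy sketch of the filter. The name `savgol_smooth` is hypothetical, and repeating the end values at the boundaries is an assumption; the Labeler's own edge handling may differ. The least-squares polynomial fit reduces to a fixed set of convolution weights, which is why the coefficients are reusable.

```python
import numpy as np

def savgol_smooth(x, window, poly):
    """Savitzky-Golay smoothing via a precomputed least-squares kernel."""
    x = np.asarray(x, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if poly >= window:
        raise ValueError("poly must be smaller than window")
    half = window // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Vandermonde matrix: column k holds offsets**k
    A = np.vander(offsets, poly + 1, increasing=True)
    # Row 0 of the pseudo-inverse maps the window samples to the
    # fitted polynomial's value at offset 0 (the window center).
    coeffs = np.linalg.pinv(A)[0]
    padded = np.pad(x, half, mode="edge")  # assumption: repeat end values
    # np.convolve flips its kernel, so reverse coeffs to apply them in order.
    return np.convolve(padded, coeffs[::-1], mode="valid")

t = np.arange(20.0)
quadratic = 0.5 * t**2 - 3.0 * t + 1.0
fitted = savgol_smooth(quadratic, window=7, poly=2)
```

Away from the padded edges, a quadratic passes through a degree-2 filter unchanged, which illustrates the shape-preservation property listed below.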

Pros¶

Shape preservation — keeps peaks, valleys, shoulders
Strong denoising — polynomial fit suppresses random spikes
Derivative ready — same filter yields smooth \(s',s''\)
Fast after setup — coefficients reusable

Cons¶

Parameter sensitive — bad \(m,p\) cause oscillations or under-smooth
Edge padding needed — incomplete window at boundaries
Artifacts — high \(p\) may add wiggles in noisy data
Poor on step jumps — overshoots discontinuities

Why choose it¶

Ideal when local features must stay intact (spectroscopy, ECG, chromatography).

References¶

A. Savitzky & M. J. E. Golay, “Smoothing and Differentiation of Data,” Analytical Chemistry 36 (8), 1964. [DOI]
W. H. Press et al., Numerical Recipes, § 14.8 “Savitzky–Golay Smoothing Filters.” [Cambridge UP]

Gaussian Convolutional Filter¶

Definition
Convolve with a discrete Gaussian kernel

\[ g(i)=\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\,i^{2}/(2\sigma^{2})}. \]

| Name | Purpose | Effect |
| --- | --- | --- |
| `sigma` | kernel width (samples) | ↑ σ → heavier smoothing |
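A minimal NumPy sketch of this filter follows. The function name `gaussian_smooth`, the truncation of the kernel at 4σ, and the use of reflective padding are assumptions made for illustration. The kernel is renormalised to sum to 1, because the continuous prefactor \(1/(\sqrt{2\pi}\,\sigma)\) does not give exactly unit sum once the kernel is sampled and truncated.

```python
import numpy as np

def gaussian_smooth(x, sigma, truncate=4.0):
    """Convolve x with a truncated, renormalised discrete Gaussian."""
    x = np.asarray(x, dtype=float)
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    radius = int(truncate * sigma + 0.5)  # assumption: cut the kernel at ~4 sigma
    i = np.arange(-radius, radius + 1)
    kernel = np.exp(-(i**2) / (2.0 * sigma**2))
    kernel /= kernel.sum()  # unit sum, so a constant signal is left unchanged
    padded = np.pad(x, radius, mode="reflect")  # reflective edge padding
    return np.convolve(padded, kernel, mode="valid")

rng = np.random.default_rng(0)
noisy = rng.normal(size=500)
smoothed = gaussian_smooth(noisy, sigma=3.0)
```

Because every kernel weight is non-negative, each output sample is a weighted average of its neighbours and cannot overshoot the local range of the input.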

Pros¶

Smooth kernel — zero ringing, excellent fidelity
Single scale parameter — σ intuitive
Edge friendly — reflective padding reduces bias
FFT option — efficient for very long signals

Cons¶

Detail loss — blurs sharp transitions
Single scale — multi-scale noise may need multiple passes
σ tuning — data-dependent

Why choose it¶

Quick, non-parametric denoiser with minimal artifacts.

References¶

R. C. Gonzalez & R. E. Woods, Digital Image Processing, 4 e, § 3.5 “Gaussian Smoothing.” [Pearson]
S. W. Smith, The Scientist and Engineer’s Guide to Digital Signal Processing, ch. 15 “Moving Average Filters.” [Free PDF]

Butterworth Low-Pass Filter¶

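A Butterworth filter has the flattest possible pass-band below its cutoff and attenuates higher frequencies with a monotonic roll-off that steepens as the order \(n\) increases.
When the filter is run forward and then in reverse (non-causal mode), the phase shifts of the two passes cancel, so the result is zero-phase and all signal features stay time-aligned.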

UI → API mapping (Butterworth)¶

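| GUI field | Purpose | Allowed range / units | Default | Notes |
| --- | --- | --- | --- | --- |
| Cut Off (`low_end_cutoff`) | cutoff frequency, in the same units as \(f_s\) | \(10^{-5} < c < 1\), and below Nyquist (\(c < f_s/2\)) | 0.1 | with the default \(f_s = 1\) Hz, 0.1 → 20 % of Nyquist |
| Frequency (`sampling_freq`) | sampling rate \(f_s\) (Hz) | \(f_s>0\) | 1 Hz | needed to compute Nyquist (\(f_s/2\)) |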

Digital cutoff

\[ \text{normal_cutoff}= \frac{\texttt{low_end_cutoff}}{0.5\,\texttt{sampling_freq}}, \qquad 0<\text{normal_cutoff}<1. \]

Magnitude (order \(n\))

\[ |H(e^{j\omega})|= \frac{1}{\sqrt{\,1+\bigl[\tfrac{\tan(\omega/2)}{\tan(\pi\,\text{normal_cutoff}/2)}\bigr]^{2n}}}. \]

At \(\omega=\pi\cdot\text{normal_cutoff}\) the gain is \(1/\sqrt2\) (−3 dB).
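The two formulas above can be checked numerically. The sketch below is illustrative only: the function name `butterworth_gain` is hypothetical and the order \(n = 4\) is an assumed value, since the order used by the Labeler is not stated here. It also shows a consequence of forward + reverse filtering: the signal passes through the filter twice, so the single-pass gain is squared and the gain at the cutoff becomes \(1/2\) (about −6 dB) rather than \(1/\sqrt2\).

```python
import numpy as np

def butterworth_gain(omega, normal_cutoff, order):
    """Single-pass gain |H(e^{jw})| of a digital Butterworth low-pass."""
    omega = np.asarray(omega, dtype=float)
    ratio = np.tan(omega / 2.0) / np.tan(np.pi * normal_cutoff / 2.0)
    return 1.0 / np.sqrt(1.0 + ratio ** (2 * order))

low_end_cutoff = 0.1  # GUI default for "Cut Off"
sampling_freq = 1.0   # GUI default for "Frequency" (Hz)
normal_cutoff = low_end_cutoff / (0.5 * sampling_freq)

order = 4  # assumption: chosen only for this illustration
gain_at_cutoff = butterworth_gain(np.pi * normal_cutoff, normal_cutoff, order)
gain_forward_reverse = gain_at_cutoff**2  # two passes multiply the magnitudes
```

The gain at the cutoff does not depend on the order; raising `order` only steepens the transition on either side of it.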

Pros¶

Steep attenuation of high-frequency noise
Flat pass-band (no ripple)
Zero phase when run forward + reverse
Well-established in DSP practice

Cons¶

Irrecoverable loss above cutoff
Edge artifacts on very short records
Requires tuning to each signal’s spectrum
Assumes stationarity

When to use it¶

Noise sits above a fixed frequency and minimal distortion is required below.

References¶

A. V. Oppenheim & R. W. Schafer, Discrete-Time Signal Processing, 3 e, § 7.5 “Butterworth and Chebyshev IIR Design.” [Pearson]
S. W. Smith, The Scientist and Engineer’s Guide to Digital Signal Processing, ch. 20 “Chebyshev Filters” (covers Butterworth as the zero-ripple case). [Free PDF]

Writing Your Own Custom Smoother from Scratch¶

Custom smoothing algorithms can be added through the custom smoothing API. This section shows how to write a simple moving average (SMA) smoother and incorporate it into the dFL GUI through the Data Provider ingestion script and a companion utilities script.

Ingestion scripts in dFL are called Data Providers. To add a custom smoothing algorithm to dFL, first define a custom smoothing Python dictionary in the 'Configuration Dictionaries' section of the Data Provider script. This dictionary lists the parameters the custom smoother needs, their default values and limits, and the display name of each parameter. An example for the SMA smoother:

=== "Python"

    custom_smoothing_options = {
        "simple_moving_average": {
            "display_name": "Simple Moving Average",
            "parameters": {
                "moving_average_window_size": {"default": 1, "min": 1, "max": None, "display_name": "Window Size"}
            },
            "function": simple_moving_average,
        },
    }

The configuration dictionary references a Python function, `simple_moving_average`, that is defined in the utilities script. Import it at the top of the Data Provider script:

=== "Python"

    from ga_offline_utilities import simple_moving_average

The function itself is written in the utilities script and depends only on NumPy. It receives the shot name, the signal name, the raw signal, the time array, and a dictionary holding the parameters declared in the configuration dictionary.

=== "Python"

    import numpy as np

    def simple_moving_average(_shot_name, _signal_name, raw_signal, _times, parameters):
        """
        Apply simple moving average smoothing to a 1D array.

        This function smooths data using a uniform window and pads edges to maintain
        the original array length. The implementation uses convolution for efficiency.

        Args:
            _shot_name (str): The name of the shot being analyzed (unused)
            _signal_name (str): The name of the signal (unused)
            raw_signal (array-like): Input data to smooth
            _times (array-like): Time array of the shot (unused)
            parameters (dict): Dictionary containing smoothing parameters

        Parameters Dictionary Keys:
            - "moving_average_window_size": int
              Window size for moving average (clamped to [1, len(raw_signal)])

        Returns:
            numpy.ndarray: Smoothed data with same length as input

        Edge Handling:
            - Uses 'valid' convolution then pads edges with first/last smoothed values
            - Left padding: first smoothed value repeated
            - Right padding: last smoothed value repeated
        """
        # Convert input to a float numpy array
        data = np.asarray(raw_signal, dtype=float)
        if data.size == 0:
            return data

        # The GUI value may arrive as a float; the window must be an integer.
        # Clamp it so that a window of 0 or one longer than the signal cannot
        # cause a division by zero or a wrong output length.
        window_size = int(parameters["moving_average_window_size"])
        window_size = max(1, min(window_size, data.size))

        # Create uniform weights for the simple moving average
        weights = np.ones(window_size) / window_size

        # Apply convolution; 'valid' keeps only fully overlapping windows
        smoothed = np.convolve(data, weights, mode="valid")

        # Pad the edges to maintain original length
        padding = window_size - 1
        left_pad = np.full(padding // 2, smoothed[0])
        right_pad = np.full(padding - padding // 2, smoothed[-1])

        return np.concatenate([left_pad, smoothed, right_pad])

The result is a simple moving average (SMA) smoother fully integrated into dFL's GUI that works on any multimodal dataset.
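The same pattern extends to other smoothers. As a hedged sketch, the example below defines a second custom smoother, a centered rolling median, using the same five-argument signature and the same configuration dictionary layout as the SMA example. The names `rolling_median` and `median_window_size` are hypothetical and chosen only for illustration. In a real Data Provider script the new entry would be added to the existing `custom_smoothing_options` dictionary rather than replacing it.

```python
import numpy as np

def rolling_median(_shot_name, _signal_name, raw_signal, _times, parameters):
    """Hypothetical custom smoother: centered rolling median, same length as input."""
    data = np.asarray(raw_signal, dtype=float)
    window_size = int(parameters["median_window_size"])
    if window_size < 1:
        raise ValueError("median_window_size must be >= 1")
    # Repeat the end values so every sample has a full window around it.
    left = (window_size - 1) // 2
    right = window_size - 1 - left
    padded = np.pad(data, (left, right), mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window_size)
    return np.median(windows, axis=1)

custom_smoothing_options = {
    "rolling_median": {
        "display_name": "Rolling Median",
        "parameters": {
            "median_window_size": {"default": 3, "min": 1, "max": None, "display_name": "Window Size"}
        },
        "function": rolling_median,
    },
}

# Call the smoother the way the configuration describes it, using default values.
entry = custom_smoothing_options["rolling_median"]
defaults = {name: spec["default"] for name, spec in entry["parameters"].items()}
spiky = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0]
cleaned = entry["function"]("shot", "signal", spiky, None, defaults)
```

A median is used here because it removes isolated spikes that a moving average would only spread out, which makes the difference between the two smoothers easy to see in the GUI.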

Simple Moving Average (SMA)¶

Definition

\[ s_t=\frac{1}{N}\sum_{i=0}^{N-1}x_{t-i}. \]

A true sliding window—points older than \(N\) samples are ignored.

| Name | Purpose | Effect |
| --- | --- | --- |
| `window N` | samples averaged | large N → very smooth, small N → detailed |
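The causal definition above differs from the centered, edge-padded implementation in the custom smoother example. A minimal sketch of the causal form follows; the name `causal_sma` is hypothetical, and marking the first \(N-1\) outputs as NaN is one possible convention for the samples that lack a full window.

```python
import numpy as np

def causal_sma(x, n):
    """Trailing moving average: s_t is the mean of x_t, x_{t-1}, ..., x_{t-n+1}."""
    x = np.asarray(x, dtype=float)
    if n < 1:
        raise ValueError("n must be >= 1")
    out = np.full(x.shape, np.nan)  # first n-1 outputs have no full window
    if x.size >= n:
        out[n - 1:] = np.convolve(x, np.ones(n) / n, mode="valid")
    return out

ramp = np.arange(6.0)         # 0, 1, 2, 3, 4, 5
result = causal_sma(ramp, 3)  # NaN, NaN, then 1, 2, 3, 4
```

On a ramp, each output equals the input from one sample earlier, which matches the \((N-1)/2\) lag for \(N = 3\).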

Pros¶

Crystal-clear math — easy to explain
Deterministic lag — \((N-1)/2\) samples
Quick baseline — first look at noise level

Cons¶

Equal weights — blurs peaks and edges
More memory than EMA — stores whole window
Edge loss — first \(N-1\) outputs undefined

Why choose it¶

When transparency is critical and compute cost is a non-issue.

References¶

J. Hamilton, Time Series Analysis, § 6.1 “Moving Averages.” [Princeton UP]
R. H. Shumway & D. S. Stoffer, Time Series Analysis and Its Applications, § 2.1. [Springer]
S. W. Smith, The Scientist and Engineer’s Guide to Digital Signal Processing, ch. 15 “Moving Average Filters.” [Free PDF]