Smoothing Options

The Labeler currently supports several built-in smoothing algorithms that can be switched on or off and tuned from the GUI under "Graph Controls". They apply to 1-D signals only. Smoothing is generally a non-invertible transformation: the raw signal cannot be recovered from its smoothed version.


Exponential Moving Average (EMA)

Definition
Choose a span \(N\). The decay factor is

\[ \alpha=\frac{2}{N+1}, \qquad s_t=\alpha x_t+(1-\alpha)s_{t-1}. \]

Older samples never disappear but their influence decays exponentially.

| Name | Purpose | Effect |
| --- | --- | --- |
| span N | effective window length | ↑ N → smoother, ↓ N → more responsive |
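
The recursion above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Labeler's own implementation; the function name `ema` and the choice to seed the recursion with the first sample are assumptions.

```python
import numpy as np

def ema(x, span):
    """Exponential moving average with alpha = 2 / (span + 1)."""
    x = np.asarray(x, dtype=float)
    if span < 1:
        raise ValueError("span must be >= 1")
    alpha = 2.0 / (span + 1.0)
    s = np.empty_like(x)
    if x.size == 0:
        return s
    s[0] = x[0]  # assumption: start the recursion at the first sample
    for t in range(1, x.size):
        # s_t = alpha * x_t + (1 - alpha) * s_{t-1}
        s[t] = alpha * x[t] + (1.0 - alpha) * s[t - 1]
    return s

# span = 3 gives alpha = 0.5, so a unit step is approached by halving the gap
step_response = ema([0.0, 1.0, 1.0, 1.0], span=3)
```

Only the previous output `s[t - 1]` is needed at each step, which is where the constant per-sample memory and CPU cost comes from.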

Pros

  • O(1) memory & CPU — ideal for streams
  • Recency bias — tracks fresh trends quickly
  • Single knob — span sets smoothness

Cons

  • Needs tuning — wrong span chases noise or lags events
  • Short memory — distant past has tiny weight
  • Fixed decay — unaware of seasonality

Why choose it

Balanced smooth-vs-lag trade-off for dashboards and real-time sensors.

References

  • R. Hyndman & G. Athanasopoulos, Forecasting: Principles and Practice, § 8.1 “Exponential Smoothing.” [Open Text]
  • R. G. Brown, Smoothing, Forecasting and Prediction of Discrete Time Series (1963).

Savitzky–Golay Filter

Definition
Fit a polynomial of order \(p\) to each sliding window of length \(m\) and take the fitted value at the center.

| Name | Purpose | Typical range |
| --- | --- | --- |
| window m | odd number of points | 5 – 51 |
| poly p | polynomial degree | 2 – 5 (< m) |
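
To make the definition concrete, here is a NumPy-only sketch of the filter. It is an illustration under stated assumptions, not the Labeler's implementation: the name `savgol_smooth` is hypothetical, and edges are handled by repeating the first and last samples, which is only one of several possible padding choices.

```python
import numpy as np

def savgol_smooth(x, window, polyorder):
    """Least-squares polynomial fit per window, evaluated at the window center."""
    x = np.asarray(x, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    if polyorder >= window:
        raise ValueError("polyorder must be smaller than window")
    half = window // 2
    # Design matrix: one row per offset in the window, one column per power 0..p
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse maps the window samples to the fit at offset 0
    coeffs = np.linalg.pinv(design)[0]
    # Assumption: repeat the edge samples so every point has a full window
    padded = np.pad(x, half, mode="edge")
    # The smoothing coefficients are symmetric, so convolution equals correlation
    return np.convolve(padded, coeffs, mode="valid")

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
noisy = np.sin(2 * np.pi * 3 * t) + 0.1 * rng.standard_normal(t.size)
smoothed = savgol_smooth(noisy, window=11, polyorder=3)
```

Because the coefficients depend only on `window` and `polyorder`, they can be computed once and reused for every signal. For production use, `scipy.signal.savgol_filter` offers a tested implementation with several edge modes and derivative output.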

Pros

  • Shape preservation — keeps peaks, valleys, shoulders
  • Strong denoising — polynomial fit suppresses random spikes
  • Derivative ready — same filter yields smooth \(s',s''\)
  • Fast after setup — coefficients reusable

Cons

  • Parameter sensitive — bad \(m,p\) cause oscillations or under-smooth
  • Edge padding needed — incomplete window at boundaries
  • Artifacts — high \(p\) may add wiggles in noisy data
  • Poor on step jumps — overshoots discontinuities

Why choose it

Ideal when local features must stay intact (spectroscopy, ECG, chromatography).

References

  • A. Savitzky & M. J. E. Golay, “Smoothing and Differentiation of Data,” Analytical Chemistry 36 (8), 1964. [DOI]
  • W. H. Press et al., Numerical Recipes, § 14.8 “Savitzky–Golay Smoothing Filters.” [Cambridge UP]

Gaussian Convolutional Filter

Definition
Convolve with a discrete Gaussian kernel

\[ g(i)=\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\,i^{2}/(2\sigma^{2})}. \]

| Name | Purpose | Effect |
| --- | --- | --- |
| sigma | kernel width (samples) | ↑ σ → heavier smoothing |
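
A minimal NumPy sketch of the convolution follows. It is not the Labeler's implementation: the name `gaussian_smooth` and the `truncate` parameter (kernel radius in units of σ) are assumptions. The kernel is cut off at a finite radius and renormalized to unit sum, so the \(1/(\sqrt{2\pi}\,\sigma)\) prefactor cancels.

```python
import numpy as np

def gaussian_smooth(x, sigma, truncate=4.0):
    """Convolve x with a truncated, unit-sum Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    radius = int(truncate * sigma + 0.5)  # assumption: cut the kernel at ~4 sigma
    i = np.arange(-radius, radius + 1)
    kernel = np.exp(-(i ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()  # unit sum keeps the signal's mean level unchanged
    # Reflective padding so the output has the same length as the input
    padded = np.pad(x, radius, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")

step = np.concatenate([np.zeros(25), np.ones(25)])
blurred = gaussian_smooth(step, sigma=2.0)
```

All kernel weights are non-negative, so the smoothed step stays within the range of the input and does not overshoot; the price is that the sharp transition is spread over several samples.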

Pros

  • Smooth kernel — zero ringing, excellent fidelity
  • Single scale parameter — σ intuitive
  • Edge friendly — reflective padding reduces bias
  • FFT option — efficient for very long signals

Cons

  • Detail loss — blurs sharp transitions
  • Single scale — multi-scale noise may need multiple passes
  • σ tuning — data-dependent

Why choose it

Quick, non-parametric denoiser with minimal artifacts.

References

  • R. C. Gonzalez & R. E. Woods, Digital Image Processing, 4 e, § 3.5 “Gaussian Smoothing.” [Pearson]
  • S. W. Smith, DSP Guide, ch. 15 “Gaussian Filters.” [Free PDF]

Butterworth Low-Pass Filter

A Butterworth filter gives the flattest possible pass-band below its cutoff and sharply attenuates higher frequencies.
When the filter is run forward + reverse (non-causal mode) it is zero-phase, keeping all signal features time-aligned.

UI → API mapping (Butterworth)

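| GUI field | Purpose | Allowed range / units | Default | Notes |
| --- | --- | --- | --- | --- |
| Cut Off (low_end_cutoff) | cutoff frequency, in the same units as sampling_freq | \(10^{-5} < c < 1\) | 0.1 | divided by Nyquist to give normal_cutoff; with the default 1 Hz sampling rate, 0.1 → 0.2 (20 % of Nyquist) |
| Frequency (sampling_freq) | sampling rate \(f_s\) (Hz) | \(f_s>0\) | 1 Hz | needed to compute Nyquist (\(0.5\,f_s\)) |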

Digital cutoff

\[ \text{normal_cutoff}= \frac{\texttt{low_end_cutoff}}{0.5\,\texttt{sampling_freq}}, \qquad 0<\text{normal_cutoff}<1. \]

Magnitude (order \(n\))

\[ |H(e^{j\omega})|= \frac{1}{\sqrt{\,1+\bigl[\tfrac{\tan(\omega/2)}{\tan(\pi\,\text{normal_cutoff}/2)}\bigr]^{2n}}}. \]

At \(\omega=\pi\cdot\text{normal_cutoff}\) the gain is \(1/\sqrt2\) (−3 dB).
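
The two formulas above can be checked numerically with the standard library alone. This is a worked example rather than the Labeler's code; the function names and the filter order of 4 are assumptions chosen for illustration.

```python
import math

def normal_cutoff(low_end_cutoff, sampling_freq):
    """Cutoff expressed as a fraction of the Nyquist frequency."""
    nyquist = 0.5 * sampling_freq
    value = low_end_cutoff / nyquist
    if not 0.0 < value < 1.0:
        raise ValueError("normal_cutoff must lie strictly between 0 and 1")
    return value

def butterworth_gain(omega, cutoff, order):
    """|H(e^{j omega})| for omega in [0, pi) rad/sample."""
    ratio = math.tan(omega / 2.0) / math.tan(math.pi * cutoff / 2.0)
    return 1.0 / math.sqrt(1.0 + ratio ** (2 * order))

# GUI defaults: low_end_cutoff = 0.1, sampling_freq = 1 Hz
wc = normal_cutoff(low_end_cutoff=0.1, sampling_freq=1.0)
# order = 4 is a hypothetical choice; the gain at the cutoff is 1/sqrt(2) for any order
gain_at_cutoff = butterworth_gain(math.pi * wc, wc, order=4)
```

Raising the order leaves the −3 dB point where it is and only steepens the transition on either side of it.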

Pros

  • Steep attenuation of high-frequency noise
  • Flat pass-band (no ripple)
  • Zero phase when run forward + reverse
  • Well-established in DSP practice

Cons

  • Irrecoverable loss above cutoff
  • Edge artifacts on very short records
  • Requires tuning to each signal’s spectrum
  • Assumes stationarity

When to use it

Noise sits above a fixed frequency and minimal distortion is required below.
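
For that situation, a forward + reverse Butterworth pass can be sketched with SciPy's `butter` and `filtfilt`. This is an illustration, not necessarily how the Labeler calls the filter: the wrapper name `butterworth_lowpass` and the `order` argument are assumptions, while `low_end_cutoff` and `sampling_freq` follow the mapping given earlier in this section.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def butterworth_lowpass(x, low_end_cutoff, sampling_freq, order=4):
    """Zero-phase low-pass: filter forward, then backward."""
    normal_cutoff = low_end_cutoff / (0.5 * sampling_freq)
    if not 0.0 < normal_cutoff < 1.0:
        raise ValueError("cutoff must lie between 0 and the Nyquist frequency")
    # butter expects the cutoff as a fraction of Nyquist when fs is not given
    b, a = butter(order, normal_cutoff, btype="low")
    return filtfilt(b, a, np.asarray(x, dtype=float))

fs = 100.0                                          # sampling rate in Hz
t = np.arange(0.0, 2.0, 1.0 / fs)
slow = np.sin(2 * np.pi * 1.0 * t)                  # 1 Hz component to keep
noisy = slow + 0.5 * np.sin(2 * np.pi * 30.0 * t)   # 30 Hz component to remove
clean = butterworth_lowpass(noisy, low_end_cutoff=5.0, sampling_freq=fs)
```

Running the filter in both directions cancels the phase shift but applies the magnitude response twice, so the combined gain at the cutoff is 1/2 (about −6 dB) rather than \(1/\sqrt2\).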

References

  • A. V. Oppenheim & R. W. Schafer, Discrete-Time Signal Processing, 3 e, § 7.5 “Butterworth and Chebyshev IIR Design.” [Pearson]
  • S. W. Smith, DSP Guide, ch. 20 “IIR Filters: Butterworth, Chebyshev & Elliptic.” [Free PDF]

Writing Your Own Custom Smoother from Scratch

Custom smoothing algorithms can be added through the custom smoothing API. This section shows how to write a simple moving average (SMA) smoother and incorporate it into the dFL GUI via the data_provider ingestion script and an accompanying utilities script.

Ingestion scripts in dFL are called Data Providers. To add a custom smoothing algorithm to dFL, first define a custom smoothing Python dictionary under the 'Configuration Dictionaries' section of the Data Provider script. This dictionary specifies the parameters the custom smoother needs, their defaults and limits, and the display names shown for them. An example for the SMA smoother is given by:

=== "Python"

    custom_smoothing_options = {
        "simple_moving_average": {
            "display_name": "Simple Moving Average",
            "parameters": {
                # min is 1: a window of 0 samples would leave the average undefined
                "moving_average_window_size": {"default": 1, "min": 1, "max": None, "display_name": "Window Size"}
            },
            "function": simple_moving_average,
        },
    }

Here the configuration dictionary refers to a Python function defined in the utilities script, so that function must be imported at the top of the Data Provider script.

=== "Python"

from ga_offline_utilities import simple_moving_average 

The function itself lives in the utilities script and depends only on NumPy. It receives the shot name, the signal name, the raw signal, the time array, and a dictionary of parameter values, and returns the smoothed array.

=== "Python"

import numpy as np

def simple_moving_average(_shot_name, _signal_name, raw_signal, _times, parameters):
    """
    Apply simple moving average smoothing to a 1D array.

    This function smooths data using a uniform window and pads edges to maintain
    the original array length. The implementation uses convolution for efficiency.

    Args:
        _shot_name (str): The name of the shot being analyzed (unused)
        _signal_name (str): The name of the signal (unused)
        raw_signal (array-like): Input data to smooth
        _times (array-like): Time array of the shot (unused)
        parameters (dict): Dictionary containing smoothing parameters

    Parameters Dictionary Keys:
        - "moving_average_window_size": int
          Window size for moving average

    Returns:
        numpy.array: Smoothed data with same length as input

    Edge Handling:
        - Uses 'valid' convolution then pads edges with first/last smoothed values
        - Left padding: first smoothed value repeated
        - Right padding: last smoothed value repeated
    """
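    # Convert input to a float array; an empty signal has nothing to smooth
    data = np.asarray(raw_signal, dtype=float)
    if data.size == 0:
        return data

    # Coerce the GUI value to an integer window between 1 and len(data)
    window_size = int(parameters["moving_average_window_size"])
    window_size = max(1, min(window_size, data.size))

    # Uniform weights for a simple moving average
    weights = np.ones(window_size) / window_size

    # 'valid' keeps only positions where the window fully overlaps the data
    smoothed = np.convolve(data, weights, mode="valid")

    # Pad both edges so the output has the same length as the input
    padding = window_size - 1
    left_pad = np.full(padding // 2, smoothed[0])
    right_pad = np.full(padding - padding // 2, smoothed[-1])

    return np.concatenate([left_pad, smoothed, right_pad])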

The result is a simple moving average (SMA) smoother fully integrated into dFL's GUI that works on any multimodal dataset.
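
The same pattern carries over to other smoothers. As a hedged sketch, the block below defines a second custom smoother, a rolling median, with the same five-argument signature and a matching dictionary entry. The names `rolling_median` and `median_window_size` are hypothetical, and the direct call at the end is only a way to test the function outside the GUI; how dFL itself invokes the function is not shown here.

```python
import numpy as np

def rolling_median(_shot_name, _signal_name, raw_signal, _times, parameters):
    """Centered rolling median; output has the same length as the input."""
    data = np.asarray(raw_signal, dtype=float)
    if data.size == 0:
        return data
    window = max(1, int(parameters["median_window_size"]))
    # Split the padding so the window is centered on each sample
    left = (window - 1) // 2
    right = window - 1 - left
    padded = np.pad(data, (left, right), mode="edge")
    # One row per output sample, one column per window position
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)

custom_smoothing_options = {
    "rolling_median": {
        "display_name": "Rolling Median",
        "parameters": {
            "median_window_size": {"default": 3, "min": 1, "max": None, "display_name": "Window Size"}
        },
        "function": rolling_median,
    },
}

# Assumption: call the smoother directly with the default parameter values
option = custom_smoothing_options["rolling_median"]
defaults = {name: spec["default"] for name, spec in option["parameters"].items()}
result = option["function"]("shot_0", "signal_0", [1, 1, 1, 9, 1, 1, 1], None, defaults)
```

With a window of 3, the isolated spike of 9 is replaced by the median of its neighborhood, which is 1.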

Simple Moving Average (SMA)

Definition

\[ s_t=\frac{1}{N}\sum_{i=0}^{N-1}x_{t-i}. \]

A true sliding window—points older than \(N\) samples are ignored.

| Name | Purpose | Effect |
| --- | --- | --- |
| window N | samples averaged | large N → very smooth, small N → detailed |
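
The trailing-window definition can be sketched with a cumulative sum. This is an illustration with a hypothetical function name, `trailing_sma`; unlike the custom smoother in the previous section, which centers the window and pads the edges, it follows the formula literally and leaves the first \(N-1\) outputs as NaN.

```python
import numpy as np

def trailing_sma(x, window):
    """s_t = mean(x[t - window + 1 : t + 1]); NaN where the window is incomplete."""
    x = np.asarray(x, dtype=float)
    if window < 1:
        raise ValueError("window must be >= 1")
    out = np.full(x.shape, np.nan)
    if x.size < window:
        return out
    # csum[k] holds the sum of the first k samples, so a window sum is a difference
    csum = np.cumsum(np.insert(x, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

averaged = trailing_sma([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
```

Here `averaged` starts with two NaN values, followed by 2, 3, and 4: each defined output is the mean of the current sample and the two before it.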

Pros

  • Crystal-clear math — easy to explain
  • Deterministic lag: \((N-1)/2\) samples
  • Quick baseline — first look at noise level

Cons

  • Equal weights — blurs peaks and edges
  • More memory than EMA — stores whole window
  • Edge loss — first \(N-1\) outputs undefined

Why choose it

When transparency is critical and compute cost is a non-issue.

References

  • J. Hamilton, Time Series Analysis, § 6.1 “Moving Averages.” [Princeton UP]
  • R. H. Shumway & D. S. Stoffer, Time Series Analysis and Its Applications, § 2.1. [Springer]
  • S. W. Smith, The Scientist & Engineer’s Guide to DSP, ch. 15. [Free PDF]