Data Fill Options

Missing samples (NaNs, gaps, empty rows) distort statistics, gradients, and model training. dFL offers three gap-filling strategies that trade off speed, bias, and edge handling. They are selectable in the GUI under "Graph Controls".


1 · Linear Interpolation (Interior/Middle Gaps)

Definition

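\[ x_t^\star \;=\; x_{t_0} \;+\; \frac{t - t_0}{t_1 - t_0}\,\bigl(x_{t_1}-x_{t_0}\bigr), \qquad t_0 < t < t_1,\; x_{t_0},x_{t_1}\neq\text{NaN}. \]

Here \(t_0\) and \(t_1\) are the nearest valid sample times before and after the gap, and \(x_t^\star\) is the filled value at a missing time \(t\).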
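The formula can be sketched in plain Python as follows. This is a minimal illustration, not dFL's implementation: the function name `fill_interior_linear` and the list-based inputs are assumptions made for the example. It fills only NaN runs bracketed by valid samples and leaves edge NaNs alone.

```python
import math

def fill_interior_linear(t, x):
    """Linearly interpolate NaN runs that have a valid sample on both sides.

    t: strictly increasing sample times; x: values, with NaN marking a gap.
    Leading and trailing NaNs are left untouched.
    (Hypothetical helper for illustration only.)
    """
    out = list(x)
    valid = [i for i, v in enumerate(x) if not math.isnan(v)]
    # Each consecutive pair of valid indices (i0, i1) plays the role of (t0, t1).
    for i0, i1 in zip(valid, valid[1:]):
        for i in range(i0 + 1, i1):
            w = (t[i] - t[i0]) / (t[i1] - t[i0])
            out[i] = x[i0] + w * (x[i1] - x[i0])
    return out

nan = float("nan")
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x = [nan, 10.0, nan, nan, 16.0, nan]
filled = fill_interior_linear(t, x)
# The interior gap at t = 2, 3 is filled on the line from 10 to 16;
# the NaNs at t = 0 and t = 5 remain because they lack a neighbor on one side.
```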

Pros

  • Shape-preserving — no abrupt jumps at fill points
  • First-order accuracy — exact for linear trends
  • Fast — 𝑂(k) fill cost, where k = number of NaNs (after one 𝑂(n) pass over n samples to locate the gaps)

Cons

  • Edge NaNs untouched — needs valid points on both sides
  • Ignores curvature — underestimates peaks/valleys in nonlinear zones
  • Not robust to outliers — extreme neighbors propagate error

Typical uses

Sensor drop-outs, telemetry hiccups, short gaps in quasi-linear signals.


2 · Constant Edge Extension

Definition

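\[ x_t^\star = \begin{cases} x_{t_{\text{first}}}, & t < t_{\text{first}},\\[6pt] x_{t_{\text{last}}}, & t > t_{\text{last}}. \end{cases} \]

Here \(t_{\text{first}}\) and \(t_{\text{last}}\) are the times of the first and last valid (non-NaN) samples in the record.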
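A minimal sketch of this rule, under the assumption that the record is a plain Python list with NaN marking missing values. The name `extend_edges_constant` is hypothetical and the code is not taken from dFL.

```python
import math

def extend_edges_constant(x):
    """Fill leading NaNs with the first valid value and trailing NaNs with the last.

    Interior NaNs are left untouched. An all-NaN input is returned unchanged,
    since there is no valid value to extend.
    (Hypothetical helper for illustration only.)
    """
    out = list(x)
    valid = [i for i, v in enumerate(x) if not math.isnan(v)]
    if not valid:
        return out
    first, last = valid[0], valid[-1]
    out[:first] = [x[first]] * first                    # t < t_first
    out[last + 1:] = [x[last]] * (len(x) - last - 1)    # t > t_last
    return out

nan = float("nan")
padded = extend_edges_constant([nan, nan, 3.0, nan, 5.0, nan])
# padded is [3.0, 3.0, 3.0, nan, 5.0, 5.0]: the interior NaN is still there.
```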

Pros

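  • Simple, and causal at the trailing edge — holding the last value suits real-time streaming (the leading-edge fill needs the first valid sample, so it is applied after the fact)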
  • Maintains level — no artificial ramps at boundaries
  • Prevents NaN leakage — finite values for entire record

Cons

  • Step artifact — constant segments may look unnatural
  • Bias risk — repeats stale value indefinitely
  • Interior NaNs ignored — use only for edges

Typical uses

Padding before filtering/FFT, real-time displays, algorithms that forbid NaNs.


3 · Hybrid (Edge Extend + Linear Interpolate)

Definition

  1. Edge step: fill leading/trailing NaNs with constant extension.
  2. Interior step: linearly interpolate remaining internal gaps.
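The two steps can be sketched together as below. This is an illustrative pure-Python version under assumed inputs (a list of increasing times and a list of values with NaN gaps). The name `fill_hybrid` is hypothetical and the code does not claim to mirror dFL's internals.

```python
import math

def fill_hybrid(t, x):
    """Constant-extend the edges, then linearly interpolate interior gaps.

    t: strictly increasing sample times; x: values, with NaN marking a gap.
    Returns a new list with no NaNs, unless every input value is NaN.
    (Hypothetical helper for illustration only.)
    """
    out = list(x)
    valid = [i for i, v in enumerate(x) if not math.isnan(v)]
    if not valid:
        return out  # no valid sample to anchor either step
    first, last = valid[0], valid[-1]

    # Step 1: edge step, constant extension.
    for i in range(first):
        out[i] = x[first]
    for i in range(last + 1, len(x)):
        out[i] = x[last]

    # Step 2: interior step, linear interpolation between bracketing samples.
    for i0, i1 in zip(valid, valid[1:]):
        for i in range(i0 + 1, i1):
            w = (t[i] - t[i0]) / (t[i1] - t[i0])
            out[i] = x[i0] + w * (x[i1] - x[i0])
    return out

nan = float("nan")
t = [0.0, 1.0, 2.0, 4.0, 5.0]
x = [nan, 2.0, nan, 8.0, nan]
result = fill_hybrid(t, x)
# Edges take the values 2.0 and 8.0; the interior point at t = 2 lies
# one third of the way from 2.0 to 8.0.
```

For array data, NumPy's `np.interp` gives the same combined behavior with its default `left` and `right` arguments, since it returns the first and last known values outside the range of known points.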

Pros

  • Full coverage — no NaNs survive
  • Smooth interior, stable edges — best of both worlds
  • Compatible with downstream stats & ML

Cons

  • Two-phase cost — slight extra compute
  • Same assumptions — inherits linear-gap and constant-edge caveats

Typical uses

General-purpose default when data have both edge drop-outs and internal gaps.


References & Further Reading

  • S. van Buuren, Flexible Imputation of Missing Data, 2nd ed. CRC Press.
  • pandas documentation: Series.interpolate.
  • SciPy Interpolation Tutorial.