Data Fill Options
Missing samples (NaNs, gaps, empty rows) distort statistics, gradients, and model training. dFL offers three GUI-selectable gap-filling strategies that trade off speed, bias, and edge handling. The fill options are available under "Graph Controls".
1 · Linear Interpolation (Interior/Middle Gaps)
Definition
\[
x_t^\star \;=\; x_{t_0} \;+\; \frac{t - t_0}{t_1 - t_0}\,\bigl(x_{t_1}-x_{t_0}\bigr),
\qquad t_0 < t < t_1,\; x_{t_0},x_{t_1}\neq\text{NaN}.
\]
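The formula can be sketched in a few lines of Python. This is an illustrative standard-library sketch, not dFL's implementation: the function name `fill_interior_linear` and the list inputs `t` (strictly increasing timestamps) and `x` (samples, with `math.nan` marking gaps) are assumptions made for the example.

```python
import math


def fill_interior_linear(t, x):
    """Linearly interpolate NaN runs that have a valid sample on both sides.

    Assumes t is strictly increasing. Leading and trailing NaNs are left as-is.
    """
    out = list(x)
    # Indices of the samples that are not NaN.
    valid = [i for i, v in enumerate(x) if not math.isnan(v)]
    # Each consecutive pair of valid indices (i0, i1) brackets one possible gap.
    for i0, i1 in zip(valid, valid[1:]):
        for i in range(i0 + 1, i1):
            weight = (t[i] - t[i0]) / (t[i1] - t[i0])
            out[i] = x[i0] + weight * (x[i1] - x[i0])
    return out


nan = math.nan
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x = [nan, 2.0, nan, nan, 8.0, nan]
filled = fill_interior_linear(t, x)
# The interior gap at t = 2, 3 is filled on the line from 2.0 to 8.0;
# the NaNs at t = 0 and t = 5 remain because they lack a neighbor on one side.
```

The interpolation weight uses the timestamps rather than the sample index, so the sketch also applies to unevenly spaced records.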
Pros
- Continuous: no abrupt jumps at fill points, although the slope can change at gap boundaries
- First-order accuracy: exact for linear trends
- Fast: O(k), where k is the number of NaNs filled
Cons
- Edge NaNs untouched: needs valid points on both sides
- Ignores curvature: flattens peaks and valleys in nonlinear zones
- Not robust to outliers: an extreme neighbor propagates its error across the whole gap
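The curvature caveat can be quantified with the standard error bound for linear interpolation. It assumes the underlying signal \(x(t)\) is twice continuously differentiable across the gap, which is an assumption added here rather than something dFL checks:

\[
\bigl|x(t) - x_t^\star\bigr| \;\le\; \frac{(t_1 - t_0)^2}{8}\,\max_{t_0 \le s \le t_1}\bigl|x''(s)\bigr|,
\qquad t_0 < t < t_1.
\]

As a worked case, take \(x(t) = 1 - t^2\) with \(t_0 = -1\) and \(t_1 = 1\). Both endpoints equal 0, so the fill is 0 across the gap, while the true peak at \(t = 0\) is 1. The bound gives \(\tfrac{4}{8}\cdot 2 = 1\), so the worst case is attained. Because the bound scales with the square of the gap length, doubling the gap quadruples the worst-case error.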
Typical uses
Sensor drop-outs, telemetry hiccups, short gaps in quasi-linear signals.
2 · Constant Edge Extension
Definition
\[
x_t^\star =
\begin{cases}
x_{t_{\text{first}}}, & t < t_{\text{first}},\\[6pt]
x_{t_{\text{last}}}, & t > t_{\text{last}}.
\end{cases}
\]
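A minimal standard-library sketch of the piecewise definition follows. The function name `fill_edges_constant` and the list input `x` (with `math.nan` marking gaps) are hypothetical names chosen for illustration, not part of dFL.

```python
import math


def fill_edges_constant(x):
    """Fill leading/trailing NaNs with the first/last valid sample.

    Interior NaNs are left untouched. An all-NaN record is returned unchanged
    because there is no valid value to extend.
    """
    out = list(x)
    valid = [i for i, v in enumerate(x) if not math.isnan(v)]
    if not valid:
        return out
    first, last = valid[0], valid[-1]
    # Leading edge: copy the first valid value backwards.
    for i in range(first):
        out[i] = x[first]
    # Trailing edge: hold the last valid value forwards.
    for i in range(last + 1, len(x)):
        out[i] = x[last]
    return out


nan = math.nan
x = [nan, nan, 3.0, nan, 5.0, nan]
extended = fill_edges_constant(x)
# Edges become 3.0, 3.0 and 5.0; the interior NaN at index 3 is still NaN.
```

No timestamps are needed here, since the filled value does not depend on how far a sample lies from the nearest valid point.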
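Pros
- Simple, and causal at the trailing edge: holding the last valid value suits real-time streaming, whereas the leading fill must wait for the first valid sample
- Maintains level: no artificial ramps at boundaries
- Prevents NaN leakage at the edges: finite values at both ends of the record (interior gaps remain)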
Cons
- Step artifact: constant segments may look unnatural
- Bias risk: repeats a stale value for the whole edge gap, however long
- Interior NaNs ignored: use only for edges
Typical uses
Padding before filtering/FFT, real-time displays, algorithms that forbid NaNs.
3 · Hybrid (Edge Extend + Linear Interpolate)
Definition
- Edge step: fill leading/trailing NaNs with constant extension.
- Interior step: linearly interpolate remaining internal gaps.
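The two steps can be combined in one function, sketched below with the standard library only. The name `fill_hybrid` and the list inputs `t` (strictly increasing timestamps) and `x` (samples, with `math.nan` marking gaps) are assumptions for illustration; how dFL implements the option internally is not described here.

```python
import math


def fill_hybrid(t, x):
    """Constant-extend the edges, then linearly interpolate interior gaps.

    Assumes t is strictly increasing. An all-NaN record is returned unchanged.
    """
    out = list(x)
    valid = [i for i, v in enumerate(x) if not math.isnan(v)]
    if not valid:
        return out
    first, last = valid[0], valid[-1]
    # Edge step: fill leading and trailing NaNs with the nearest valid value.
    out[:first] = [x[first]] * first
    out[last + 1:] = [x[last]] * (len(x) - last - 1)
    # Interior step: interpolate between each pair of neighboring valid samples.
    for i0, i1 in zip(valid, valid[1:]):
        for i in range(i0 + 1, i1):
            weight = (t[i] - t[i0]) / (t[i1] - t[i0])
            out[i] = x[i0] + weight * (x[i1] - x[i0])
    return out


nan = math.nan
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x = [nan, 2.0, nan, nan, 8.0, nan]
filled = fill_hybrid(t, x)
# Edges take the values 2.0 and 8.0; the interior gap lies on the line between them.
```

The two steps touch disjoint samples, so their order does not affect the result. If NumPy is available, `numpy.interp` called with only the valid samples as its reference points should give the same output, because by default it returns the first and last reference values outside their range.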
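Pros
- Full coverage: no NaNs survive, provided the record contains at least one valid sample
- Continuous interior, level edges: combines the strengths of both methods
- Compatible with downstream statistics and ML steps that reject NaNs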
Cons
- Two-phase cost: slight extra compute
- Same assumptions: inherits the linear-gap and constant-edge caveats
Typical uses
General-purpose default when data have both edge drop-outs and internal gaps.