Data Harmonization¶
Data harmonization prepares diverse, noisy, and incomplete datasets for reliable analysis by applying a structured sequence of preprocessing steps. It is a complex and subtle part of data preparation and may require extensive customization to meet a user's specific needs. dFL is therefore designed so that users can include their own custom preprocessing tools: simple Python scripting is enough to fully integrate preprocessing steps, such as custom normalizations and resamplers, into the dFL GUI. These customizations are also discussed in the sections below.
The dFL GUI natively supports the following harmonization steps:
1. Data Trim¶
Trimming reduces large datasets to a smaller segment for faster visualization and preprocessing.
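As a rough illustration of what trimming amounts to, the sketch below keeps only the samples inside a time window. It assumes NumPy arrays of timestamps and values; the function name `trim_signal` and its parameters are hypothetical and are not part of the dFL API.

```python
import numpy as np

def trim_signal(t, y, t_start, t_end):
    """Keep only samples whose timestamps fall in [t_start, t_end].

    Hypothetical helper for illustration; not a dFL function.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = (t >= t_start) & (t <= t_end)  # boolean window selector
    return t[mask], y[mask]

t = np.arange(10.0)   # timestamps 0, 1, ..., 9
y = t ** 2            # example signal
t_trim, y_trim = trim_signal(t, y, 2.0, 5.0)
```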
2. Data Fill¶
Automatically replaces NaN or empty entries with valid values to maintain continuity in signals.
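The fill strategy dFL applies is not specified here, so the following is only a minimal sketch of one common choice: linear interpolation between valid neighbours, with the nearest valid value held at the edges. The function name `fill_nans` is an assumption made for illustration.

```python
import numpy as np

def fill_nans(t, y):
    """Replace NaN entries by linear interpolation over time.

    Hypothetical helper; leading and trailing NaNs take the nearest
    valid value because np.interp holds the end points constant.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float).copy()
    bad = np.isnan(y)
    if bad.all():
        raise ValueError("signal contains no valid samples")
    y[bad] = np.interp(t[bad], t[~bad], y[~bad])
    return y

t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([np.nan, 1.0, np.nan, 3.0, np.nan])
filled = fill_nans(t, y)
```

Other strategies (forward fill, a constant, a model-based estimate) may suit a given signal better; interpolation is used here only because it is simple to verify.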
3. Resampling¶
Adjusts the data sampling rate for consistency across signals.
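One way to picture resampling is to interpolate an irregularly sampled signal onto a uniform time grid. The sketch below does this with linear interpolation; the name `resample_uniform` and the step parameter `dt` are assumptions for illustration, and the resampling method used by dFL may differ.

```python
import numpy as np

def resample_uniform(t, y, dt):
    """Linearly interpolate (t, y) onto a uniform grid with spacing dt.

    Hypothetical helper; assumes t is sorted in increasing order.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # Number of grid points that fit in [t[0], t[-1]]; the small epsilon
    # guards against floating-point round-off in the division.
    n = int(np.floor((t[-1] - t[0]) / dt + 1e-9)) + 1
    t_new = t[0] + dt * np.arange(n)
    return t_new, np.interp(t_new, t, y)

t = np.array([0.0, 0.3, 1.0, 2.0])   # irregular sampling
y = 2.0 * t
t_new, y_new = resample_uniform(t, y, 0.5)
```

Resampling every signal onto the same grid is what makes sample-by-sample comparison across signals possible.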
4. Smoothing¶
Reduces high-frequency noise to reveal meaningful trends.
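A centered moving average is perhaps the simplest smoother and is sketched below as an example. The function name `moving_average` and the edge-padding choice are assumptions; the smoothing filters available in dFL are not listed in this section.

```python
import numpy as np

def moving_average(y, window):
    """Centered moving average with an odd window length.

    Hypothetical helper; the ends are padded by repeating the edge
    values so the output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    y = np.asarray(y, dtype=float)
    half = window // 2
    padded = np.pad(y, half, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

y = np.array([0.0, 0.0, 3.0, 0.0, 0.0])   # a single spike
smoothed = moving_average(y, 3)
```

A wider window suppresses more noise but also flattens genuine short-lived features, so the window length is a trade-off rather than a fixed setting.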
5. Normalization¶
Rescales signals so that features have comparable scales.
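Two widely used normalizations are min-max scaling and z-score standardization. The sketch below shows both under the assumption of a one-dimensional NumPy signal; the function names are hypothetical, and which normalizations dFL offers is not stated here.

```python
import numpy as np

def min_max(y):
    """Scale values linearly to the range [0, 1]."""
    y = np.asarray(y, dtype=float)
    span = y.max() - y.min()
    if span == 0:
        return np.zeros_like(y)   # constant signal: nothing to scale
    return (y - y.min()) / span

def z_score(y):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    y = np.asarray(y, dtype=float)
    std = y.std()
    if std == 0:
        return np.zeros_like(y)   # constant signal: avoid division by zero
    return (y - y.mean()) / std

y = np.array([10.0, 20.0, 30.0])
scaled = min_max(y)
standardized = z_score(y)
```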
6. Order of Operations¶
Controlling the order in which steps 1-4 above are applied is essential for ensuring downstream reproducibility, since applying the same steps in a different order generally produces different results.
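To see why order matters, the sketch below applies a trim and a smoothing step in both orders and gets different outputs. All names (`trim`, `smooth`, `run_pipeline`) are hypothetical and only illustrate the idea of recording the pipeline as an explicit ordered list; they do not describe how dFL stores its settings.

```python
import numpy as np

def trim(y):
    """Keep samples at indices 1 through 4 (illustrative fixed window)."""
    return np.asarray(y, dtype=float)[1:5]

def smooth(y):
    """Three-point centered moving average with edge padding."""
    padded = np.pad(np.asarray(y, dtype=float), 1, mode="edge")
    return np.convolve(padded, np.ones(3) / 3.0, mode="valid")

def run_pipeline(y, steps):
    """Apply (name, function) steps strictly in the listed order."""
    for _name, step in steps:
        y = step(y)
    return y

y = np.array([9.0, 0.0, 0.0, 3.0, 0.0, 0.0])

trim_then_smooth = run_pipeline(y, [("trim", trim), ("smooth", smooth)])
smooth_then_trim = run_pipeline(y, [("smooth", smooth), ("trim", trim)])
```

Smoothing before trimming lets the large first sample leak into the retained window, while trimming first discards it. Storing the step list alongside the results is one way to make such a pipeline repeatable.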
7. Custom Filters/Transformations¶
Custom filters/transformations can be readily added to the dFL GUI using the data provider and fetch data scripting.
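The data provider and fetch data interfaces are not described in this section, so the sketch below shows only the kind of stand-alone transformation a user might write, not how it is registered with dFL. The function names and the `record` layout are assumptions made purely for illustration.

```python
import numpy as np

def remove_linear_trend(t, y):
    """Custom transformation: subtract the least-squares straight-line fit."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    return y - (slope * t + intercept)

def apply_custom_transform(record, transform):
    """Hypothetical helper: apply `transform` to every signal in a record.

    `record` is assumed to be {"time": array, "signals": {name: array}};
    this layout is illustrative and is not the dFL data format.
    """
    t = record["time"]
    signals = {name: transform(t, y) for name, y in record["signals"].items()}
    return {"time": t, "signals": signals}

t = np.linspace(0.0, 4.0, 5)
bump = np.array([0.0, 1.0, 0.0, -1.0, 0.0])
record = {"time": t, "signals": {"drifting": 2.0 * t + 1.0 + bump}}
detrended = apply_custom_transform(record, remove_linear_trend)
```

Keeping the transformation a pure function of its inputs makes it easy to test on its own before wiring it into a larger script.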