Biosignal Compression Toolbox for Digital Biomarker Discovery

Abstract

A critical challenge in using longitudinal wearable sensor biosignal data for healthcare applications and digital biomarker development is that it exacerbates the healthcare "data deluge," creating new data storage and organization challenges and costs. Data aggregation, sampling rate minimization, and effective data compression are all methods for consolidating wearable sensor data to reduce data volumes. However, there has been limited research on appropriate, effective, and efficient data compression methods for biosignal data. Here, we examine the application of different data compression pipelines, built from combinations of algorithmic and encoding-based methods, to biosignal data from wearable sensors and explore how these implementations affect data recoverability and storage footprint. Algorithmic methods tested include singular value decomposition, the discrete cosine transform, and the biorthogonal discrete wavelet transform. Encoding methods tested include run-length encoding and Huffman encoding. We apply these methods to common wearable sensor data, including electrocardiogram (ECG), photoplethysmography (PPG), accelerometry, electrodermal activity (EDA), and skin temperature measurements. Of the methods examined in this study, and in line with the characteristics of the different data types, we recommend the following to maximize data recoverability after compression: direct data compression with Huffman encoding for ECG and PPG; singular value decomposition with Huffman encoding for EDA and accelerometry; and the biorthogonal discrete wavelet transform with Huffman encoding for skin temperature. We also report the best methods for maximizing the compression ratio. Finally, we develop and document open-source code and data for each compression method tested here, which can be accessed through the Digital Biomarker Discovery Pipeline as the "Biosignal Data Compression Toolbox," an open-source, accessible software platform for compressing biosignal data.
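
The two encoding methods named in the abstract, run-length encoding and Huffman encoding, can be illustrated with a minimal sketch. This is not the toolbox's code: the function names, the toy integer signal, and the 16-bit raw sample size are all assumptions made for illustration, and the compression ratio computed here ignores the cost of storing the Huffman codebook.

```python
import heapq
from collections import Counter
from itertools import groupby


def run_length_encode(samples):
    """Collapse runs of equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(samples)]


def run_length_decode(pairs):
    """Invert run_length_encode."""
    return [value for value, count in pairs for _ in range(count)]


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 / 1.
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(symbols, code):
    """Concatenate the code words for a sequence of symbols."""
    return "".join(code[s] for s in symbols)


def huffman_decode(bits, code):
    """Decode a bit string by matching prefix-free code words greedily."""
    inverse = {word: symbol for symbol, word in code.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            decoded.append(inverse[current])
            current = ""
    return decoded


# Hypothetical quantized signal (e.g. slowly varying integer readings).
signal = [330] * 40 + [331] * 25 + [330] * 40 + [332] * 15
RAW_BITS_PER_SAMPLE = 16  # assumed storage size of one raw sample

# Direct Huffman encoding of the samples.
direct_code = huffman_code(signal)
direct_bits = huffman_encode(signal, direct_code)
direct_restored = huffman_decode(direct_bits, direct_code)

# Run-length encoding followed by Huffman encoding of the (value, run) pairs.
pairs = run_length_encode(signal)
pair_code = huffman_code(pairs)
pair_bits = huffman_encode(pairs, pair_code)
pipeline_restored = run_length_decode(huffman_decode(pair_bits, pair_code))

raw_size = len(signal) * RAW_BITS_PER_SAMPLE
direct_ratio = raw_size / len(direct_bits)
pipeline_ratio = raw_size / len(pair_bits)
print(f"direct Huffman ratio: {direct_ratio:.1f}")
print(f"RLE + Huffman ratio:  {pipeline_ratio:.1f}")
```

Both encoders are lossless, so the decoded output matches the input exactly; any loss in a full pipeline of the kind the abstract describes would come from the algorithmic stage (for example, discarding transform coefficients) that precedes encoding.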

DOI
10.3390/s21020516
Year