- Preprint: Fast Nonlinear Vector Quantile Regression. Aviv A. Rosenberg, Sanketh Vedula, Yaniv Romano, and Alex M. Bronstein. May 2022
Quantile regression (QR) is a powerful tool for estimating one or more conditional quantiles of a target variable Y given explanatory features X. A limitation of QR is that it is only defined for scalar target variables, both because of the formulation of its objective function and because the notion of quantiles has no standard definition for multivariate distributions. Recently, vector quantile regression (VQR) was proposed as an extension of QR for high-dimensional target variables, thanks to a meaningful generalization of the notion of quantiles to multivariate distributions. Despite its elegance, VQR is arguably not applicable in practice due to several limitations: (i) it assumes a linear model for the quantiles of the target Y given the features X; (ii) its exact formulation is intractable even for modestly-sized problems in terms of target dimensions, number of regressed quantile levels, or number of features, and its relaxed dual formulation may violate the monotonicity of the estimated quantiles; (iii) no fast or scalable solvers for VQR currently exist. In this work we fully address these limitations, namely: (i) we extend VQR to the nonlinear case, showing substantial improvement over linear VQR; (ii) we propose vector monotone rearrangement, a method which ensures that the estimates obtained by VQR relaxations are monotone functions; (iii) we provide fast, GPU-accelerated solvers for linear and nonlinear VQR which maintain a fixed memory footprint regardless of the number of samples and quantile levels, and demonstrate that they scale to millions of samples and thousands of quantile levels; (iv) we release an optimized Python package of our solvers to promote widespread use of VQR in real-world applications.
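To make the scalar starting point concrete, the following is a minimal sketch of classical (scalar) QR ingredients only, not of the VQR solvers or the vector monotone rearrangement described in the abstract. It shows the pinball loss that defines scalar quantiles and the scalar form of monotone rearrangement, which amounts to sorting estimates that cross. The function names are hypothetical and chosen for illustration.

```python
def pinball_loss(y, q, tau):
    """Pinball (check) loss of predicting quantile value q for target y at level tau."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def empirical_quantile(samples, tau):
    """Return the sample value that minimizes the total pinball loss at level tau."""
    return min(samples, key=lambda q: sum(pinball_loss(y, q, tau) for y in samples))


def rearrange_scalar(quantile_estimates):
    """Scalar monotone rearrangement: sort the estimates (ordered by increasing
    quantile level) so that they become non-decreasing in the level."""
    return sorted(quantile_estimates)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(empirical_quantile(data, 0.5))
    # Estimates at increasing levels that cross each other are made monotone.
    print(rearrange_scalar([0.1, 0.5, 0.4, 0.9]))
```

Sorting has no direct multivariate equivalent, which is one way to see why the vector case needs its own rearrangement procedure.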
- NatComm: Codon-specific Ramachandran plots show amino acid backbone conformation depends on identity of the translated codon. Aviv A. Rosenberg, Ailie Marx, and Alex M. Bronstein. Nature Communications, May 2022
Synonymous codons translate into chemically identical amino acids. Although codon choice was once considered inconsequential to the formation of the protein product, there is evidence to suggest that codon usage affects co-translational protein folding and the final structure of the expressed protein. Here we develop a method for computing and comparing codon-specific Ramachandran plots and demonstrate that the backbone dihedral angle distributions of some synonymous codons are distinguishable with statistical significance for some secondary structures. This shows that there exists a dependence between codon identity and backbone torsion of the translated amino acid. Although these findings cannot pinpoint the causal direction of this dependence, we discuss the vast biological implications should coding be shown to directly shape protein conformation, and we demonstrate the usefulness of this method as a tool for probing associations between codon usage and protein structure. Finally, we urge the inclusion of exact genetic information in structural databases.
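A Ramachandran plot is a distribution over the two backbone dihedral angles of a residue: phi, defined by the atoms C(i-1), N(i), CA(i), C(i), and psi, defined by N(i), CA(i), C(i), N(i+1). As a minimal sketch of the underlying geometry (not the paper's pipeline), the standard dihedral angle between four points can be computed as below. The helper names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four 3D points, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)  # central bond, the rotation axis
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Four points in a cis (eclipsed) arrangement.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

Evaluating this for every residue and binning the (phi, psi) pairs per codon would give one empirical distribution per codon, which is the kind of object the abstract compares.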
- PNAS: Meeting the unmet needs of clinicians from AI systems showcased for cardiology with deep-learning–based ECG analysis. Yonatan Elul, Aviv A. Rosenberg, Assaf Schuster, Alex M. Bronstein, and Yael Yaniv. Proceedings of the National Academy of Sciences, May 2021
The use of artificial intelligence (AI) in medicine, particularly deep learning, has gained considerable attention recently. Although some works boast superior capabilities compared to clinicians, actual deployments of AI systems in the clinic are scarce. We describe four important gaps on the machine-learning side responsible for this discrepancy by first formulating them in a way that is actionable by AI researchers and then systematically addressing these needs. Aiming beyond the search for better model architectures or improved accuracy, we focus directly on the challenges of clinical usefulness as stated by medical professionals in the literature. Our results show that deep-learning systems can be robust, trustworthy, explainable, and transparent while retaining the superior level of performance these algorithms are known for.
Despite their great promise, artificial intelligence (AI) systems have yet to become ubiquitous in the daily practice of medicine largely due to several crucial unmet needs of healthcare practitioners. These include lack of explanations in clinically meaningful terms, handling the presence of unknown medical conditions, and transparency regarding the system’s limitations, both in terms of statistical performance as well as recognizing situations for which the system’s predictions are irrelevant. We articulate these unmet clinical needs as machine-learning (ML) problems and systematically address them with cutting-edge ML techniques. We focus on electrocardiogram (ECG) analysis as an example domain in which AI has great potential and tackle two challenging tasks: the detection of a heterogeneous mix of known and unknown arrhythmias from ECG and the identification of underlying cardio-pathology from segments annotated as normal sinus rhythm recorded in patients with an intermittent arrhythmia. We validate our methods by simulating a screening for arrhythmias in a large-scale population while adhering to statistical significance requirements. Specifically, our system 1) visualizes the relative importance of each part of an ECG segment for the final model decision; 2) upholds specified statistical constraints on its out-of-sample performance and provides uncertainty estimation for its predictions; 3) handles inputs containing unknown rhythm types; and 4) handles data from unseen patients while also flagging cases in which the model’s outputs are not usable for a specific patient. This work represents a significant step toward overcoming the limitations currently impeding the integration of AI into clinical practice in cardiology and medicine in general.
All study data are included in the article and/or SI Appendix.
- Nat. Dig. Med.: Digital oximetry biomarkers for assessing respiratory function: standards of measurement, physiological interpretation, and clinical use. Jeremy Levy, Daniel Álvarez, Aviv A Rosenberg, Alexandra Alexandrovich, Félix Del Campo, and 1 more author. NPJ Digital Medicine, May 2021
Pulse oximetry is routinely used to non-invasively monitor oxygen saturation levels. A low oxygen level in the blood means low oxygen in the tissues, which can ultimately lead to organ failure. Yet, in contrast to heart rate variability measures, a field which has seen the development of stable standards and advanced toolboxes and software, no such standards or open tools exist for continuous oxygen saturation time series variability analysis. The primary objective of this research was to identify, implement and validate key digital oximetry biomarkers (OBMs) for the purpose of creating a standard and associated reference toolbox for continuous oximetry time series analysis. We review the sleep medicine literature to identify clinically relevant OBMs. We implement these biomarkers and demonstrate their clinical value within the context of obstructive sleep apnea (OSA) diagnosis on a total of n = 3806 individual polysomnography recordings totaling 26,686 h of continuous data. A total of 44 digital oximetry biomarkers were implemented. Reference ranges for each biomarker are provided for individuals with mild, moderate, and severe OSA and for non-OSA recordings. Linear regression analysis between biomarkers and the apnea hypopnea index (AHI) showed a high correlation, which reached $\bar{R}^2 = 0.82$. The resulting Python OBM toolbox, denoted “pobm”, was contributed to the open software PhysioZoo (physiozoo.org). Studying the variability of the continuous oxygen saturation time series using pobm may provide information on the underlying physiological control systems and enhance our understanding of the manifestations and etiology of diseases, with emphasis on respiratory diseases.
- Frontiers: Opening the Schrödinger Box: Short- and Long-Range Mammalian Heart Rate Variability. Ido Weiser-Bitoun, Moran Davoodi, Aviv A Rosenberg, Alexandra Alexandrovich, and Yael Yaniv. Frontiers in Physiology, May 2021
- Nat. Sci. Rep.: Signatures of the autonomic nervous system and the heart’s pacemaker cells in canine electrocardiograms and their applications to humans. Aviv A Rosenberg, Ido Weiser-Bitoun, George E Billman, and Yael Yaniv. Nature Scientific Reports, May 2020
Heart rate and heart rate variability (HRV) are mainly determined by the autonomic nervous system (ANS), which interacts with receptors on the sinoatrial node (SAN; the heart’s primary pacemaker), and by the “coupled-clock” system within the SAN cells. HRV changes are associated with cardiac diseases. However, the relative contributions of the ANS and SAN to HRV are not clear, impeding effective treatment. To discern the SAN’s contribution, we performed HRV analysis on canine electrocardiograms containing basal and ANS-blockade segments. We also analyzed human electrocardiograms of atrial fibrillation and heart failure patients, as well as healthy aged subjects. Finally, we used a mathematical model to simulate HRV under decreased “coupled-clock” regulation. We found that (a) in canines, the SAN and ANS contribute mainly to long- and short-term HRV, respectively; (b) there is evidence suggesting a similar relative SAN contribution in humans; (c) SAN features can be calculated from beat-intervals obtained in-vivo, without intervention; (d) ANS contribution can be modeled by sines embedded in white noise; (e) HRV changes associated with cardiac diseases and aging can be interpreted as deterioration of both SAN and ANS; and (f) SAN clock-coupling can be estimated from changes in HRV. This may enable future non-invasive diagnostic applications.
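Finding (d) above, that the ANS contribution can be modeled by sines embedded in white noise, can be illustrated with a small simulation. This is only a sketch of that signal model under assumed parameters: the frequencies, amplitudes, noise level, and the choice of an evenly sampled series are all hypothetical and are not taken from the paper.

```python
import math
import random


def simulate_sines_in_noise(n, fs, freqs_hz, amps, noise_std, seed=0):
    """Generate n samples, at sampling rate fs (Hz), of a sum of sine waves
    plus Gaussian white noise with standard deviation noise_std."""
    rng = random.Random(seed)  # seeded for reproducibility
    signal = []
    for k in range(n):
        t = k / fs
        s = sum(a * math.sin(2.0 * math.pi * f * t) for f, a in zip(freqs_hz, amps))
        signal.append(s + rng.gauss(0.0, noise_std))
    return signal


if __name__ == "__main__":
    # Hypothetical values: two oscillations at 0.1 Hz and 0.3 Hz sampled at 4 Hz.
    x = simulate_sines_in_noise(n=1200, fs=4.0, freqs_hz=[0.1, 0.3],
                                amps=[1.0, 0.5], noise_std=0.2)
    print(len(x), min(x), max(x))
```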
- CinC: Adding Two Dimensions to Heart Rate Variability Research. Joachim A Behar, Ori Shemla, Ido Weiser-Bitoun, Aviv A Rosenberg, and Yael Yaniv. In 2018 Computing in Cardiology Conference (CinC), May 2018
Introduction: Heart rate variability (HRV) analysis tools have mainly been available for the analysis of heart rate derived from human electrocardiograms. We explore extending HRV analysis to two additional dimensions: (1) analysis across multiple mammalian species and (2) analysis across different levels of integration, for example sinoatrial tissue. Methods: We analyzed the beating rate variability (BRV) across the two additional dimensions using the PhysioZoo computer program that we recently introduced. We used published databases of electrocardiograms from four mammal types: human (n=18), dog (n=17), rabbit (n=4) and mouse (n=8). We computed the BRV measures for each. We also show how the PhysioZoo program can be used for the analysis of sinoatrial node tissue BRV. Results: The study of typical mammalian heart and respiration rates (obtained from the dominant high frequency peak) revealed a linear relationship between these two quantities. Analysis of the rabbit sinoatrial node tissue BRV showed that it had reduced overall variability when compared to in vivo heart BRV.
- MSc. Thesis: Non-Invasive In-Vivo Analysis of Intrinsic Clock-Like Pacemaker Mechanisms: Decoupling Neural Input Using Heart Rate Variability Measurements. May 2018
Heart diseases account for a quarter of all deaths each year in the US and are also an economic burden, with an estimated expenditure for treatment of almost $100B annually in the US alone. Cardiovascular disease mortality rate is correlated with an increase in heart rate, which is regulated by both the autonomic nervous system (ANS) and the sinoatrial node (SAN) cells in the heart. The heart rate is highly variable and never reaches a steady state even at rest, a phenomenon known as heart rate variability (HRV). Many studies have shown that loss of this variability is strongly associated with morbidity and mortality. By using pharmacological denervation, a method of temporarily blocking the ANS, and applying HRV analysis, we aim to study the contribution of the SAN to the HRV. We acquired canine ECG data containing both basal (n=27) and denervated (n=20) segments. We applied an automated ECG segmentation algorithm to extract the segments from each record. We used a custom R-peak detector, rqrs, based on PhysioNet’s gqrs, to detect R-peaks in the data and produce an RR-interval time series. We excluded ectopic beats using an automated algorithm and proceeded to apply HRV analysis to the resulting intervals. We implemented all major HRV techniques, which can be categorized into time domain, frequency domain (spectral) and nonlinear methods (which quantify physiological complexity). We used these methods to extract HRV features from both the basal and denervated data sets. We implemented all signal processing and HRV analysis methods as an open source MATLAB toolbox, rhrv, and additionally provided a GUI, PhysioZoo, which enables HRV analysis in animal data and comes with an annotated animal database. We have shown that the rqrs peak detector provides accurate detections for annotated human ECG data (F1=93.4) and annotated ECG records from our canine dataset (F1=98.7). Moreover, we adapted HRV analysis techniques to the canine data where necessary and, for example, provide an automatic method of adapting the frequency bands for spectral HRV analysis. HRV analysis of basal vs. denervated data shows that (1) time domain HRV is significantly reduced after denervation; (2) the SAN contributes spectral power mainly in the very-low frequency (VLF) band; (3) the SAN contributes most of the physiological complexity of the heart rate, specifically the long-term changes occurring over many beats; (4) the ANS influences mainly the short-term, beat-to-beat variability of the heart rate; (5) the contribution of the ANS to the heart rate signal can be modeled as two sine waves, at specific frequencies corresponding to periodic autonomic regulation, embedded in white noise. Moreover, we suggest clinical indices for the state and function of the SAN that can be obtained directly from basal ECG data by measuring spectral power in the VLF band and multiscale entropy (MSE) values in the high scales. We conclude that by applying HRV analysis to regular ECG data, SAN function can be observed even without pharmacological denervation. This has the potential to allow future non-invasive heart monitoring solutions that can be used, for example, for early detection of SAN dysfunction.
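As an illustration of the time-domain family of HRV methods mentioned above, the following is a minimal Python sketch of a few standard measures computed from an RR-interval series in milliseconds. It is not the rhrv toolbox (which the abstract describes as a MATLAB implementation). The function name is hypothetical, and the 50 ms threshold is the conventional human value, which may need adapting for other species.

```python
import math
import statistics


def time_domain_hrv(rr_ms, nn_threshold_ms=50.0):
    """Compute basic time-domain HRV measures from RR intervals given in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]  # successive differences
    return {
        # Mean RR interval.
        "avnn": statistics.mean(rr_ms),
        # Sample standard deviation of the intervals.
        "sdnn": statistics.stdev(rr_ms),
        # Root mean square of successive differences (short-term variability).
        "rmssd": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        # Percentage of successive differences larger than the threshold.
        "pnnxx": 100.0 * sum(abs(d) > nn_threshold_ms for d in diffs) / len(diffs),
    }


if __name__ == "__main__":
    print(time_domain_hrv([800.0, 810.0, 790.0, 850.0, 800.0]))
```

RMSSD and pNNxx depend only on beat-to-beat differences, so they are natural probes of the short-term variability that the abstract attributes mainly to the ANS.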
- Frontiers: PhysioZoo: a novel open access platform for heart rate variability analysis of mammalian electrocardiographic data. Joachim A Behar, Aviv A Rosenberg, Ido Weiser-Bitoun, Ori Shemla, Alexandra Alexandrovich, and 2 more authors. Frontiers in Physiology, May 2018
- Frontiers: A universal scaling relation for defining power spectral bands in mammalian heart rate variability analysis. Joachim A Behar, Aviv A Rosenberg, Ori Shemla, Kevin R Murphy, Gideon Koren, and 2 more authors. Frontiers in Physiology, May 2018
- CinC: Rhythm and quality classification from short ECGs recorded using a mobile device. Joachim A Behar, Aviv A Rosenberg, Yael Yaniv, and Julien Oster. In 2017 Computing in Cardiology (CinC), May 2017
Introduction: Atrial fibrillation (AF) is the most common sustained cardiac arrhythmia. Its prevalence is 1-2% of the general population and it is associated with increased risk of mortality and morbidity. Methods: The AliveCor mobile electrocardiogram (ECG) device was used to collect data. The PhysioNet Challenge aimed to create an intelligent algorithm for automated rhythm and quality classification. A database of 8528 single-lead ECG recordings was used for training and a closed database of 3658 ECG recordings was used for testing the participants' algorithms on the Challenge server. The RR interval time series was first estimated using an R-peak detector. Signal quality was estimated on a second-by-second basis and the continuous sub-segment with the highest quality was selected for further analysis. A number of features were estimated: heart rate variability (time domain based, fragmentation, coefficient of sample entropy, etc.), ECG morphology (QRS length, QT interval, etc.) and the presence of ectopic beats. The features were used to train support vector machine classifiers in a one-vs.-rest approach. Results: For the final score of the Challenge we obtained an overall F1 measure on the test set of 0.80. Conclusion: The feature-based machine learning approach showed high performance in distinguishing between the different rhythms represented in the Challenge. This opens the horizon for computer-automated interpretation of single-lead mobile ECG.
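For readers unfamiliar with the reported metric, the F1 measure of a class in a multi-class setting treats that class as positive and all other classes as negative (a one-vs.-rest view). The sketch below computes per-class F1 from label lists and their unweighted mean. The class labels are hypothetical, and the unweighted mean is only one common way to summarize, since the abstract does not state the exact Challenge scoring rule.

```python
def per_class_f1(y_true, y_pred):
    """Return a dict mapping each class label to its one-vs.-rest F1 score."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


if __name__ == "__main__":
    truth = ["N", "N", "A", "O"]  # hypothetical labels: normal, AF, other
    preds = ["N", "A", "A", "O"]
    print(per_class_f1(truth, preds), macro_f1(truth, preds))
```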