Iman Deznabi

Researcher at Microsoft Research and Ph.D. candidate in Computer Science at the University of Massachusetts Amherst. Currently, I am working on designing and implementing new adaptive deep-learning models for personalized, multi-resolution and irregularly sampled time-series data. My general interests are in machine learning applications in Health Care, Computational Biology, and Natural Language Processing.

Education

Ph.D. in Computer Science (Sep 2018 - Feb 2025)
University of Massachusetts Amherst, GPA: 4/4
Thesis title: Adaptive Deep Learning Models for Personalized Modeling of Heterogeneous Time-Series Data

M.Sc. in Computer Science (Sep 2018 - Dec 2021)
University of Massachusetts Amherst, GPA: 4/4
M.Sc. in Computer Engineering (Sep 2015 - Jan 2018)
Bilkent University, GPA: 4/4
Thesis title: DeepKinZero: Zero-Shot Learning for Predicting Kinase Phosphorylation Sites
B.Sc. in Information Technology Engineering (Sep 2010 - Feb 2015)
University of Tabriz, GPA: 17.48/20 (3.67/4), last two years GPA: 3.95/4
Thesis title: Algorithmic music composition according to human feelings with Hidden Markov Models

Work Experience

Microsoft Research, Part-time Researcher (Feb 2024 - Sep 2024)
Designing and implementing foundation deep-learning models for zero-shot microclimate forecasting, improving forecasting performance by 44% in areas with no training data using Graph Neural Networks (GNNs) and Retrieval Augmented Generation (RAG) models.
Microsoft, Data Science Intern (Jun 2022 - Aug 2022)
Developed an end-to-end system that forecasts the hourly number of requests on databases and scales them accordingly. This system significantly reduces database costs and the number of throttled requests.
Kronos Incorporated, Data Science Intern (Jun 2019 - Aug 2019)
Improved hierarchical forecasting of real sales by more than 60% using deep learning models.

Publications

Peer-reviewed

Towards Resolution-Aware Retrieval Augmented Zero-Shot Forecasting
Iman Deznabi, Peeyush Kumar, Madalina Fiterau
Time Series in The Age of Large Models workshop NeurIPS (2024) - spotlight presentation

Zero-shot forecasting predicts variables at locations or conditions without direct historical data, a challenge for traditional methods due to limited location-specific information. We introduce a retrieval-augmented model that leverages spatial correlations and temporal frequencies to enhance predictive accuracy in unmonitored areas. By decomposing signals into different frequencies, the model incorporates external knowledge for improved forecasts. Unlike large foundational time series models, our approach explicitly captures spatial-temporal relationships, enabling more accurate, localized predictions. Applied to microclimate forecasting, our model outperforms traditional and foundational models, offering a more robust solution for zero-shot scenarios.

Dynamic Clustering via Branched Deep Learning Enhances Personalization of Stress Prediction from Mobile Sensor Data
Iman Deznabi, Yunfei Lou, Abhinav Shaw, Natcha Simsiri, Tauhidur Rahman, Madalina Fiterau
Nature Scientific Reports (2024)

College students experience ever-increasing levels of stress, leading to a wide range of health problems. In this context, monitoring and predicting students’ stress levels is crucial and, fortunately, made possible by the growing support for data collection via mobile devices. However, predicting stress levels from mobile phone data remains a challenging task, and off-the-shelf deep learning models are inapplicable or inefficient due to data irregularity, inter-subject variability, and the “cold start problem”. To overcome these challenges, we developed a platform named Branched CALM-Net that aims to predict students’ stress levels through dynamic clustering in a personalized manner. This is the first platform that leverages the branching technique in a multitask setting to achieve personalization and continuous adaptation. Our method achieves state-of-the-art performance in predicting student stress from mobile sensor data collected as part of the Dartmouth StudentLife study, with a ROC AUC 37% higher and a PR AUC surpassing that of the nearest baseline models. In the cold-start online learning setting, Branched CALM-Net outperforms other models, attaining an average F1 score of 87% with just 1 week of training data for a new student, which shows it is reliable and effective at predicting stress levels from mobile data.

Zero-shot micro-climate prediction with deep learning.
Iman Deznabi, Peeyush Kumar, Madalina Fiterau
Tackling Climate Change with Machine Learning workshop NeurIPS (2023)

Weather station data is a valuable resource for climate prediction; however, its reliability can be limited in remote locations. To compound the issue, making local predictions often relies on sensor data that may not be accessible for a new, previously unmonitored location. In response to these challenges, we propose a novel zero-shot learning approach designed to forecast various climate measurements at new and unmonitored locations. Our method surpasses conventional weather forecasting techniques in predicting microclimate variables by leveraging knowledge extracted from other geographic locations.

MultiWave: Multiresolution Deep Architectures through Wavelet Decomposition for Multivariate Timeseries Forecasting and Prediction.
Iman Deznabi, Madalina Fiterau
Conference on Health, Inference, and Learning (CHIL 2023)

The analysis of multivariate time series data is challenging due to the various frequencies of signal changes that can occur over both short and long terms. Furthermore, standard deep learning models are often unsuitable for such datasets, as signals are typically sampled at different rates. To address these issues, we introduce MultiWave, a novel framework that enhances deep learning time series models by incorporating components that operate at the intrinsic frequencies of signals. MultiWave uses wavelets to decompose each signal into subsignals of varying frequencies and groups them into frequency bands. Each frequency band is handled by a different component of our model. A gating mechanism combines the output of the components to produce sparse models that use only specific signals at specific frequencies. Our experiments demonstrate that MultiWave accurately identifies informative frequency bands and improves the performance of various deep learning models, including LSTM, Transformer, and CNN-based models, for a wide range of applications. It attains top performance in stress and affect detection from wearables. It also increases the AUC of the best-performing model by 5% for in-hospital COVID-19 mortality prediction from patient blood samples and for human activity recognition from accelerometer and gyroscope data. We show that MultiWave consistently identifies critical features and their frequency components, thus providing valuable insights into the applications studied.
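As a rough illustration of the decomposition step described in this abstract, the sketch below splits one signal into frequency bands with a Haar wavelet transform. It is not the MultiWave implementation: the choice of the Haar wavelet, the function names `haar_step` and `haar_decompose`, and the assumption that the signal length is a power of two are all illustrative, and the grouping of bands and the gating mechanism are not shown.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail), each half the length of the input.
    Assumes an even-length input; a trailing odd sample would be dropped.
    """
    scale = 1 / math.sqrt(2)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * scale for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * scale for i in pairs]
    return approx, detail


def haar_decompose(signal, levels):
    """Split a signal into up to `levels` detail bands plus one coarse band.

    Bands are ordered from the highest frequency (finest detail) to the
    lowest frequency (final approximation).
    """
    bands = []
    current = list(signal)
    for _ in range(levels):
        if len(current) < 2:
            break
        current, detail = haar_step(current)
        bands.append(detail)
    bands.append(current)
    return bands


if __name__ == "__main__":
    # Hypothetical toy signal, not data from the paper.
    for band in haar_decompose([4, 2, 5, 5, 1, 3, 0, 8], levels=2):
        print(band)
```

In a model of the kind the abstract describes, each returned band would be passed to its own component, with a learned gate deciding which bands contribute to the prediction.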

Population-level inference for home-range areas
Christen H Fleming, Iman Deznabi, Shauhin Alavi, Margaret C Crofoot, Ben T Hirsch, E Patricia Medici, Michael J Noonan, Roland Kays, William F Fagan, Daniel Sheldon, Justin M Calabrese
Methods in Ecology and Evolution journal (2022)

1. Home-range estimates are a common product of animal tracking data, as each range represents the area needed by a given individual. Population-level inference of home-range areas—where multiple individual home ranges are considered to be sampled from a population—is also important to evaluate changes over time, space or covariates such as habitat quality or fragmentation, and for comparative analyses of species averages. Population-level home-range parameters have traditionally been estimated by first assuming that the input tracking data were sampled independently when calculating home ranges via conventional kernel density estimation (KDE) or minimal convex polygon (MCP) methods, and then assuming that those individual home ranges were measured exactly when calculating the population-level estimates. This conventional approach does not account for the temporal autocorrelation that is inherent in modern tracking data, nor for the uncertainties of each individual home-range estimate, which are often large and heterogeneous.
2. Here, we introduce a statistically and computationally efficient framework for the population-level analysis of home-range areas, based on autocorrelated kernel density estimation (AKDE), that can account for variable temporal autocorrelation and estimation uncertainty.
3. We apply our method to empirical examples on lowland tapir Tapirus terrestris, kinkajou Potos flavus, white-nosed coati Nasua narica, white-faced capuchin monkey Cebus capucinus and spider monkey Ateles geoffroyi, and quantify differences between species, environments and sexes.
4. Our approach allows researchers to more accurately compare different populations with different movement behaviours or sampling schedules while retaining statistical precision and power when individual home-range uncertainties vary. Finally, we emphasize the estimation of effect sizes when comparing populations, rather than mere significance tests.
Predicting in-hospital mortality by combining clinical notes with time-series data
Iman Deznabi, Mohit Iyyer, Madalina Fiterau
Association for Computational Linguistics (ACL-IJCNLP 2021) Findings

In intensive care units (ICUs), patient health is monitored through (1) continuous vital signals from various medical devices, and (2) clinical notes consisting of opinions and summaries from doctors which are recorded in electronic health records (EHR). It is difficult to jointly model these two sources of information because clinical notes, unlike vital signals, are collected at irregular intervals and their contents are relatively unstructured. In this paper, we present a model that combines both sources of information about ICU patients to make accurate in-hospital mortality predictions. We apply a fine-tuned BERT model to each of the patient's clinical notes. The resulting embeddings are then combined to obtain the overall embedding for the entire text part of the data. This is then combined with the output of an LSTM model that encodes patients' vital signals. Our model improves upon the state of the art for mortality prediction, attaining an AUC score of 0.9, compared to the previous 0.87, setting a new standard for mortality prediction on the MIMIC III benchmark.

Impact of the COVID-19 Pandemic on the Academic Community: Results from a survey conducted at University of Massachusetts Amherst
Iman Deznabi, Tamanna Motahar, Ali Sarvghad, Madalina Fiterau, Narges Mahyar
ACM (2020), Digital Government: Research and Practice, COVID-19 Commentary

The COVID-19 pandemic has significantly impacted academic life in the United States and beyond. To gain a better understanding of its impact on the academic community, we conducted a large-scale survey at the University of Massachusetts Amherst. We collected multifaceted data from students, staff, and faculty on several aspects of their lives, such as mental and physical health, productivity, and finances. All our respondents expressed mental and physical issues and concerns, such as increased stress and depression levels. Financial difficulties seem to have the most considerable toll on staff and undergraduate students, while productivity challenges were mostly expressed by faculty and graduate students. As universities face many important decisions with respect to mitigating the effects of this pandemic, we present our findings with the intent of shedding light on the challenges faced by various academic groups in the face of the pandemic, calling attention to the differences between groups. We also contribute a discussion highlighting how the results translate to policies for the effective and timely support of the categories of respondents who need them most. Finally, the survey itself, which includes conditional logic allowing for personalized questions, serves as a template for further data collection, facilitating a comparison of the impact on campuses across the United States.
DeepKinZero: zero-shot learning for predicting kinase–phosphosite associations involving understudied kinases
Iman Deznabi, Busra Arabaci, Mehmet Koyutürk, Oznur Tastan
Bioinformatics Journal (2020). Also presented at the ICML 2020 Workshop on Computational Biology

Motivation
Protein phosphorylation is a key regulator of protein function in signal transduction pathways. Kinases are the enzymes that catalyze the phosphorylation of other proteins in a target-specific manner. The dysregulation of phosphorylation is associated with many diseases including cancer. Although the advances in phosphoproteomics enable the identification of phosphosites at the proteome level, most of the phosphoproteome is still in the dark: more than 95% of the reported human phosphosites have no known kinases. Determining which kinase is responsible for phosphorylating a site remains an experimental challenge. Existing computational methods require several examples of known targets of a kinase to make accurate kinase-specific predictions, yet for a large body of kinases, only a few or no target sites are reported.
Results
We present DeepKinZero, the first zero-shot learning approach to predict the kinase acting on a phosphosite for kinases with no known phosphosite information. DeepKinZero transfers knowledge from kinases with many known target phosphosites to those kinases with no known sites through a zero-shot learning model. The kinase-specific positional amino acid preferences are learned using a bidirectional recurrent neural network. We show that DeepKinZero achieves significant improvement in accuracy for kinases with no known phosphosites in comparison to the baseline model and other methods available. By expanding our knowledge on understudied kinases, DeepKinZero can help to chart the phosphoproteome atlas.

Personalized Student Stress Prediction with Deep Multitask Network
Abhinav Shaw, Natcha Simsiri, Iman Deznabi, Madalina Fiterau, Tauhidur Rahman
ICML 2019, Adaptive and Multitask Learning Workshop

With the growing popularity of wearable devices, the ability to utilize physiological data collected from these devices to predict the wearer's mental state, such as mood and stress, suggests great clinical applications, yet such a task is extremely challenging. In this paper, we present a general platform for personalized predictive modeling of behavioural states like students' level of stress. Through the use of auto-encoders and multitask learning, we extend the prediction of stress to both sequences of passive sensor data and high-level covariates. Our model outperforms the state of the art in the prediction of stress level from mobile sensor data, obtaining a 45.6% improvement in F1 score on the StudentLife dataset.
Multi-resolution Attention with Signal Splitting for Multivariate Time Series Classification
Rheeya Uppaal, Bryon Kucharski, Bhanu Pratap Singh, Iman Deznabi, Madalina Fiterau
ICML 2019, Time-Series Workshop

Real-world multivariate time series pose three significant challenges: irregularity in sampling, missing values, and varying sampling frequencies among signals. Recent work for inference on such data aims at solving one of these issues; however, a unified model is still lacking. We present a unified method that handles all three: Multi-resolution Attention with Signal Splitting (MASS). Our method is model-agnostic and can be applied to any existing model, significantly boosting predictive performance. MASS uses parallel multi-resolution blocks to model different resolution data streams, in addition to splitting signals into components of specific resolutions, to provide approximately a 3% improvement on the Physionet Challenge 2012 Dataset. We also compare to the state-of-the-art TBM and GRU-D models, showcasing promising results against them.
An Inference Attack on Genomic Data Using Kinship, Complex Correlations, and Phenotype Information
Iman Deznabi, Mohammad Mobayen, Nazanin Jafari, Oznur Tastan, Erman Ayday
IEEE/ACM Transactions on Computational Biology and Bioinformatics (2017)

Individuals (and their family members) share (partial) genomic data on public platforms. However, using special characteristics of genomic data, background knowledge that can be obtained from the Web, and family relationship between the individuals, it is possible to infer the hidden parts of shared (and unshared) genomes. Existing work in this field considers simple correlations in the genome (as well as Mendel's law and partial genomes of a victim and his family members). In this paper, we improve the existing work on inference attacks on genomic privacy. We mainly consider complex correlations in the genome by using an observable Markov model and recombination model between the haplotypes. We also utilize the phenotype information about the victims. We propose an efficient message passing algorithm to consider all aforementioned background information for the inference. We show that the proposed framework improves inference with significantly less information compared to existing work.
MEMNAR: Finding Mutually Exclusive Mutation Sets through Negative Association Rule Mining
Iman Deznabi, Ahmet Alparslan Celik, Oznur Tastan
ISMB/ECCB Workshop on Machine Learning in Systems Biology (2017)

It has been reported in multiple cancers that a certain set of gene mutations tend not to occur concurrently in the same patient. This mutual exclusivity pattern hints at a functional relation and can help uncover cancer-driver alterations. We address the problem of discovering mutually exclusive mutation gene sets through mining negative association rules. Our proposed algorithm, MEMNAR, efficiently mines for negative association rules in patient mutation data and constructs mutually exclusive gene sets based on these extracted rules with high accuracy. We also define and detect more complex mutual exclusivity patterns that have not been addressed in earlier approaches. Evaluations on simulated datasets demonstrate that MEMNAR can discover mutually exclusive gene sets faster with improved accuracy compared to the state-of-the-art methods. When we apply MEMNAR on breast cancer, we identify several mutually exclusive gene sets that are biologically relevant and some of which have not been reported in the literature.

Technical Reports

Multi-resolution Networks For Flexible Irregular Time Series Modeling (Multi-FIT)
Bhanu Pratap Singh, Iman Deznabi, Bharath Narasimhan, Bryon Kucharski, Rheeya Uppaal, Akhila Josyula, Madalina Fiterau

Missing values, irregularly collected samples, and multi-resolution signals commonly occur in multivariate time series data, making predictive tasks difficult. These challenges are especially prevalent in the healthcare domain, where patients' vital signs and electronic records are collected at different frequencies and occasionally have missing information due to imperfections in equipment or patient circumstances. Researchers have handled each of these issues differently, often handling missing data through mean value imputation and then using sequence models over the multivariate signals while ignoring the different resolutions of the signals. We propose a unified model named Multi-resolution Flexible Irregular Time series Network (Multi-FIT). The building block for Multi-FIT is the FIT network. The FIT network creates an informative dense representation at each time step using signal information such as the last observed value, the time difference since the last observed time stamp, and the overall mean for the signal. Vertical FIT (FIT-V) is a variant of FIT which also models the relationship between different temporal signals while creating the informative dense representations for the signal. The Multi-FIT model uses multiple FIT networks for sets of signals with different resolutions, further facilitating the construction of flexible representations. Our model has three main contributions: (a) it does not impute values but rather creates informative representations to provide flexibility to the model for creating task-specific representations; (b) it models the relationship between different signals in the form of support signals; (c) it models different resolutions in parallel before merging them for the final prediction task. The FIT, FIT-V and Multi-FIT networks improve upon the state-of-the-art models for three predictive tasks, including the forecasting of patient survival.
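The per-time-step inputs named in this abstract (last observed value, time since the last observation, and overall signal mean) can be sketched as follows. This is a minimal illustration under stated assumptions, not the FIT network itself: the function name `fit_style_features`, the use of `None` for missing readings, and the fallback to the signal mean before the first observation are all hypothetical choices, and the learned dense representation built on top of these inputs is not shown.

```python
def fit_style_features(times, values):
    """Build (last_value, time_since_last, signal_mean) for each time step.

    `values[i]` is None when the signal was not observed at `times[i]`.
    Assumption: before the first observation, the signal mean stands in for
    the last observed value and the elapsed time is reported as 0.
    """
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed) if observed else 0.0

    features = []
    last_value, last_time = mean, None
    for t, v in zip(times, values):
        if v is not None:
            # A fresh observation resets the elapsed time to zero.
            last_value, last_time = v, t
        delta = 0.0 if last_time is None else t - last_time
        features.append((last_value, delta, mean))
    return features


if __name__ == "__main__":
    # Hypothetical irregularly sampled signal with two missing readings.
    for row in fit_style_features([0, 1, 2, 4], [None, 2.0, None, 4.0]):
        print(row)
```

Because no value is imputed into the signal itself, a downstream model can learn how much to trust a stale reading from the elapsed-time input.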

Selected Projects

SpaceX Rocket Lander

Environment: Python, Gym
Keywords: Reinforcement Learning, RNN, LSTM, Proximal Policy Optimization (PPO)

Inference attack on human genome

Environment: Matlab
Keywords: Matlab, Belief propagation, Machine Learning, Genomic Privacy

Automatic Music Composer

Environment: C#, SQL Server
Keywords: Machine Learning, Hidden Markov Models

Life Experiment

Environment: C#, BrainNet
Keywords: Multi-agent Reinforcement Learning, Reinforcement Learning, Neural Networks

Iman Deznabi

iman at cs.umass.edu

Computer Science Department
University of Massachusetts Amherst
Amherst, MA 01003, US

Google Scholar
