Researcher at Microsoft Research and Ph.D. candidate in Computer Science at the University of Massachusetts Amherst. Currently, I am working on designing and implementing new adaptive deep-learning models for personalized, multi-resolution and irregularly sampled time-series data. My general interests are in machine learning applications in Health Care, Computational Biology, and Natural Language Processing.
Education
Ph.D. in Computer Science (Sep 2018 - Jan 2025) University of Massachusetts Amherst, GPA: 4/4 Thesis title: Adaptive Deep Learning Models for Personalized Modeling of Heterogeneous Time-Series Data
M.Sc. in Computer Science (Sep 2018 - Dec 2021) University of Massachusetts Amherst, GPA: 4/4
M.Sc. in Computer Engineering (Sep 2015 - Jan 2018) Bilkent University, GPA: 4/4 Thesis title: DeepKinZero: Zero-Shot Learning for Predicting Kinase Phosphorylation Sites
B.Sc. in Information Technology Engineering (Sep 2010 - Feb 2015) University of Tabriz, GPA: 17.48/20 (3.67/4), last two years GPA: 3.95/4 Thesis title: Algorithmic music composition according to human feelings with Hidden Markov Models
Work Experience
Microsoft Research, Part-time Researcher (Feb 2024 - Sep 2024) Designing and implementing foundation deep-learning models for zero-shot microclimate forecasting, improving forecasting performance by 44% in areas with no training data using Graph Neural Networks (GNNs) and Retrieval Augmented Generation (RAG) models.
Microsoft, Data Science Intern (Jun 2022 - Aug 2022) Developed an end-to-end system that forecasts the hourly number of requests on databases and scales them accordingly. The system significantly reduces both database costs and the number of throttled requests.
Kronos Incorporated, Data Science Intern (Jun 2019 - Aug 2019) Improved hierarchical forecasting of real sales by more than 60% using deep learning models.
Zero-shot forecasting predicts variables at locations or conditions without direct historical data, a challenge for traditional methods due to limited location-specific information. We introduce a retrieval-augmented model that leverages spatial correlations and temporal frequencies to enhance predictive accuracy in unmonitored areas. By decomposing signals into different frequencies, the model incorporates external knowledge for improved forecasts. Unlike large foundational time series models, our approach explicitly captures spatial-temporal relationships, enabling more accurate, localized predictions. Applied to microclimate forecasting, our model outperforms traditional and foundational models, offering a more robust solution for zero-shot scenarios.
College students experience ever-increasing levels of stress, leading to a wide range of health problems. In this context, monitoring and predicting students’ stress levels is crucial and, fortunately, made possible by the growing support for data collection via mobile devices. However, predicting stress levels from mobile phone data remains a challenging task, and off-the-shelf deep learning models are inapplicable or inefficient due to data irregularity, inter-subject variability, and the “cold start problem”. To overcome these challenges, we developed a platform named Branched CALM-Net that aims to predict students’ stress levels through dynamic clustering in a personalized manner. This is the first platform that leverages the branching technique in a multitask setting to achieve personalization and continuous adaptation. Our method achieves state-of-the-art performance in predicting student stress from mobile sensor data collected as part of the Dartmouth StudentLife study, with a ROC AUC 37% higher and a PR AUC surpassing that of the nearest baseline models. In the cold-start online learning setting, Branched CALM-Net outperforms other models, attaining an average F1 score of 87% with just 1 week of training data for a new student, which shows it is reliable and effective at predicting stress levels from mobile data.
Weather station data is a valuable resource for climate prediction; however, its reliability can be limited in remote locations. To compound the issue, making local predictions often relies on sensor data that may not be accessible for a new, previously unmonitored location. In response to these challenges, we propose a novel zero-shot learning approach designed to forecast various climate measurements at new and unmonitored locations. Our method surpasses conventional weather forecasting techniques in predicting microclimate variables by leveraging knowledge extracted from other geographic locations.
The analysis of multivariate time series data is challenging due to the various frequencies of signal changes that can occur over both short and long terms. Furthermore, standard deep learning models are often unsuitable for such datasets, as signals are typically sampled at different rates. To address these issues, we introduce MultiWave, a novel framework that enhances deep learning time series models by incorporating components that operate at the intrinsic frequencies of signals. MultiWave uses wavelets to decompose each signal into subsignals of varying frequencies and groups them into frequency bands. Each frequency band is handled by a different component of our model. A gating mechanism combines the output of the components to produce sparse models that use only specific signals at specific frequencies. Our experiments demonstrate that MultiWave accurately identifies informative frequency bands and improves the performance of various deep learning models, including LSTM, Transformer, and CNN-based models, for a wide range of applications. It attains top performance in stress and affect detection from wearables. It also increases the AUC of the best-performing model by 5% for in-hospital COVID-19 mortality prediction from patient blood samples and for human activity recognition from accelerometer and gyroscope data. We show that MultiWave consistently identifies critical features and their frequency components, thus providing valuable insights into the applications studied.
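As a rough illustration of the wavelet step described above, the sketch below splits a signal into frequency bands with a hand-written Haar transform. It is not the MultiWave implementation: the choice of the Haar wavelet, the function names `haar_step` and `haar_decompose`, and the number of levels are all assumptions made to keep the example dependency-free.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: return (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Split a signal into `levels` detail bands plus one coarse band.

    The returned list is ordered from the highest-frequency band to the
    lowest-frequency band. The signal length must be divisible by 2 ** levels.
    """
    bands = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        bands.append(detail)  # fast changes at this scale
    bands.append(current)     # remaining slow trend
    return bands


# Toy input (made up for illustration): a short ramp.
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
bands = haar_decompose(signal, levels=2)
```

In a model along the lines described above, each entry of `bands` would be routed to its own component, with a gate deciding which bands contribute to the final prediction; that part is not shown here.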
Population-level inference for home-range areas
Christen H Fleming, Iman Deznabi, Shauhin Alavi, Margaret C Crofoot, Ben T Hirsch, E Patricia Medici, Michael J Noonan, Roland Kays, William F Fagan, Daniel Sheldon, Justin M Calabrese
Methods in Ecology and Evolution journal (2022)
1. Home-range estimates are a common product of animal tracking data, as each range represents the area needed by a given individual. Population-level inference of home-range areas—where multiple individual home ranges are considered to be sampled from a population—is also important to evaluate changes over time, space or covariates such as habitat quality or fragmentation, and for comparative analyses of species averages. Population-level home-range parameters have traditionally been estimated by first assuming that the input tracking data were sampled independently when calculating home ranges via conventional kernel density estimation (KDE) or minimal convex polygon (MCP) methods, and then assuming that those individual home ranges were measured exactly when calculating the population-level estimates. This conventional approach does not account for the temporal autocorrelation that is inherent in modern tracking data, nor for the uncertainties of each individual home-range estimate, which are often large and heterogeneous.
2. Here, we introduce a statistically and computationally efficient framework for the population-level analysis of home-range areas, based on autocorrelated kernel density estimation (AKDE), that can account for variable temporal autocorrelation and estimation uncertainty.
3. We apply our method to empirical examples on lowland tapir Tapirus terrestris, kinkajou Potos flavus, white-nosed coati Nasua narica, white-faced capuchin monkey Cebus capucinus and spider monkey Ateles geoffroyi, and quantify differences between species, environments and sexes.
4. Our approach allows researchers to more accurately compare different populations with different movement behaviours or sampling schedules while retaining statistical precision and power when individual home-range uncertainties vary. Finally, we emphasize the estimation of effect sizes when comparing populations, rather than mere significance tests.
In intensive care units (ICUs), patient health is monitored through (1) continuous vital signals from various medical devices, and (2) clinical notes consisting of opinions and summaries from doctors which are recorded in electronic health records (EHR). It is difficult to jointly model these two sources of information because clinical notes, unlike vital signals, are collected at irregular intervals and their contents are relatively unstructured. In this paper, we present a model that combines both sources of information about ICU patients to make accurate in-hospital mortality predictions. We apply a fine-tuned BERT model to each of the patient's clinical notes. The resulting embeddings are then combined to obtain the overall embedding for the entire text part of the data. This is then combined with the output of an LSTM model that encodes patients' vital signals. Our model improves upon the state of the art for mortality prediction, attaining an AUC score of 0.9, compared to the previous 0.87, setting a new standard for mortality prediction on the MIMIC III benchmark.
The COVID-19 pandemic has significantly impacted academic life in the United States and beyond. To gain a better understanding of its impact on the academic community, we conducted a large-scale survey at the University of Massachusetts Amherst. We collected multifaceted data from students, staff, and faculty on several aspects of their lives, such as mental and physical health, productivity, and finances. All our respondents expressed mental and physical issues and concerns, such as increased stress and depression levels. Financial difficulties seem to have the most considerable toll on staff and undergraduate students, while productivity challenges were mostly expressed by faculty and graduate students. As universities face many important decisions with respect to mitigating the effects of this pandemic, we present our findings with the intent of shedding light on the challenges faced by various academic groups in the face of the pandemic, calling attention to the differences between groups. We also contribute a discussion highlighting how the results translate to policies for the effective and timely support of the categories of respondents who need them most. Finally, the survey itself, which includes conditional logic allowing for personalized questions, serves as a template for further data collection, facilitating a comparison of the impact on campuses across the United States.
Motivation
Protein phosphorylation is a key regulator of protein function in signal transduction pathways. Kinases are the enzymes that catalyze the phosphorylation of other proteins in a target-specific manner. The dysregulation of phosphorylation is associated with many diseases including cancer. Although the advances in phosphoproteomics enable the identification of phosphosites at the proteome level, most of the phosphoproteome is still in the dark: more than 95% of the reported human phosphosites have no known kinases. Determining which kinase is responsible for phosphorylating a site remains an experimental challenge. Existing computational methods require several examples of known targets of a kinase to make accurate kinase-specific predictions, yet for a large body of kinases, only a few or no target sites are reported.
Results
We present DeepKinZero, the first zero-shot learning approach to predict the kinase acting on a phosphosite for kinases with no known phosphosite information. DeepKinZero transfers knowledge from kinases with many known target phosphosites to those kinases with no known sites through a zero-shot learning model. The kinase-specific positional amino acid preferences are learned using a bidirectional recurrent neural network. We show that DeepKinZero achieves significant improvement in accuracy for kinases with no known phosphosites in comparison to the baseline model and other methods available. By expanding our knowledge on understudied kinases, DeepKinZero can help to chart the phosphoproteome atlas.
With the growing popularity of wearable devices, the ability to use physiological data collected from these devices to predict the wearer's mental state, such as mood and stress, suggests great clinical applications; yet such a task is extremely challenging. In this paper, we present a general platform for personalized predictive modeling of behavioural states like students' level of stress. Through the use of autoencoders and multitask learning, we extend the prediction of stress to both sequences of passive sensor data and high-level covariates. Our model outperforms the state of the art in the prediction of stress level from mobile sensor data, obtaining a 45.6% improvement in F1 score on the StudentLife dataset.
Real-world multivariate time series pose three significant challenges: irregularity in sampling, missing values, and varying sampling frequencies among signals. Recent work on inference for such data aims at solving one of these issues; however, a unified model is still lacking. We present a unified method that handles all three: Multi-resolution Attention with Signal Splitting (MASS). Our method is model-agnostic and can be applied to any existing model, significantly boosting predictive performance. MASS uses parallel multi-resolution blocks to model data streams of different resolutions, in addition to splitting signals into components of specific resolutions, yielding an improvement of approximately 3% on the Physionet Challenge 2012 Dataset. We also compare against the state-of-the-art TBM and GRU-D models, with promising results.
Individuals (and their family members) share (partial) genomic data on public platforms. However, using special characteristics of genomic data, background knowledge that can be obtained from the Web, and family relationship between the individuals, it is possible to infer the hidden parts of shared (and unshared) genomes. Existing work in this field considers simple correlations in the genome (as well as Mendel's law and partial genomes of a victim and his family members). In this paper, we improve the existing work on inference attacks on genomic privacy. We mainly consider complex correlations in the genome by using an observable Markov model and recombination model between the haplotypes. We also utilize the phenotype information about the victims. We propose an efficient message passing algorithm to consider all aforementioned background information for the inference. We show that the proposed framework improves inference with significantly less information compared to existing work.
It has been reported in multiple cancers that certain sets of gene mutations tend not to occur concurrently in the same patient. This mutual exclusivity pattern hints at a functional relation and can help uncover cancer-driver alterations. We address the problem of discovering mutually exclusive mutation gene sets through mining negative association rules. Our proposed algorithm, MEMNAR, efficiently mines negative association rules in patient mutation data and constructs mutually exclusive gene sets based on these extracted rules with high accuracy. We also define and detect more complex mutual exclusivity patterns that have not been addressed in earlier approaches. Evaluations on simulated datasets demonstrate that MEMNAR can discover mutually exclusive gene sets faster and with improved accuracy compared to the state-of-the-art methods. When we apply MEMNAR to breast cancer data, we identify several biologically relevant mutually exclusive gene sets, some of which have not been reported in the literature.
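To make the notion of a negative association rule concrete, here is a minimal sketch that scores a rule of the form "gene A mutated implies gene B not mutated" by its confidence. It is not the MEMNAR algorithm: the function name `negative_rule_confidence`, the gene labels `G1` to `G3`, and the patient data are made up for illustration, and a real miner would also apply support thresholds and combine rules into gene sets.

```python
def negative_rule_confidence(patients, gene_a, gene_b):
    """Confidence of the rule: gene_a mutated -> gene_b NOT mutated.

    `patients` is a list of sets, each holding the mutated genes of one
    patient. Returns the fraction of patients with gene_a mutated that do
    not have gene_b mutated, or None if gene_a is never mutated.
    """
    with_a = [p for p in patients if gene_a in p]
    if not with_a:
        return None
    without_b = sum(1 for p in with_a if gene_b not in p)
    return without_b / len(with_a)


# Hypothetical toy cohort: G1 and G2 never co-occur.
patients = [
    {"G1"},
    {"G1", "G3"},
    {"G2"},
    {"G1"},
    {"G2", "G3"},
]

conf_g1_g2 = negative_rule_confidence(patients, "G1", "G2")
conf_g2_g1 = negative_rule_confidence(patients, "G2", "G1")
conf_g1_g3 = negative_rule_confidence(patients, "G1", "G3")
```

A pair such as `G1` and `G2`, where the rule holds with high confidence in both directions, is a candidate for a mutually exclusive gene set; `G1` and `G3` co-occur in one patient and score lower.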
Missing values, irregularly collected samples, and multi-resolution signals commonly occur in multivariate time series data, making predictive tasks difficult. These challenges are especially prevalent in the healthcare domain, where patients' vital signs and electronic records are collected at different frequencies and occasionally have missing information due to imperfections in equipment or patient circumstances. Researchers have handled each of these issues differently, often handling missing data through mean value imputation and then using sequence models over the multivariate signals while ignoring the different resolutions of the signals. We propose a unified model named Multi-resolution Flexible Irregular Time series Network (Multi-FIT). The building block for Multi-FIT is the FIT network. The FIT network creates an informative dense representation at each time step using signal information such as the last observed value, the time difference since the last observed time stamp, and the overall mean of the signal. Vertical FIT (FIT-V) is a variant of FIT that also models the relationship between different temporal signals while creating the informative dense representations for the signal. The Multi-FIT model uses multiple FIT networks for sets of signals with different resolutions, further facilitating the construction of flexible representations. Our model has three main contributions: (a) it does not impute values but rather creates informative representations, giving the model the flexibility to create task-specific representations; (b) it models the relationship between different signals in the form of support signals; (c) it models different resolutions in parallel before merging them for the final prediction task. The FIT, FIT-V and Multi-FIT networks improve upon the state-of-the-art models for three predictive tasks, including the forecasting of patient survival.
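The per-time-step inputs mentioned above (last observed value, time since the last observation, overall mean) can be sketched as follows. This only shows the raw feature construction, not the FIT network that turns these features into a dense representation; the function name `fit_style_features` and the fallback used before the first observation are assumptions.

```python
def fit_style_features(times, values):
    """Build per-step features for one irregularly sampled signal.

    `values[i]` is None when the signal was not observed at `times[i]`.
    Each output row is (last_observed_value, time_since_last_observation,
    overall_mean). Assumption: before the first observation, the last value
    falls back to the overall mean and the gap is measured from times[0].
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("signal has no observations")
    # Assumption: the mean is taken over this series; in practice it would
    # more likely come from the training data.
    mean = sum(observed) / len(observed)

    features = []
    last_value = mean
    last_time = times[0]
    for t, v in zip(times, values):
        if v is not None:
            last_value, last_time = v, t
        features.append((last_value, t - last_time, mean))
    return features


# Toy signal (made up): observed at t=1.0 and t=4.0 only.
times = [0.0, 1.0, 2.5, 4.0]
values = [None, 98.0, None, 102.0]
rows = fit_style_features(times, values)
```

No value is imputed into the signal itself: the missing step at `t=2.5` is described by the last reading and the 1.5 time units elapsed since it, leaving a downstream network free to decide how much to trust that reading.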
Space exploration is one of the most expensive and sophisticated industries in the world and is usually funded by national governments. SpaceX, a private space exploration company, aims to reduce the cost of space exploration by landing rockets on the ground and reusing them instead of discarding them. However, achieving this goal was not easy for SpaceX: its rockets crashed many times before the successful landing of Falcon 9. In this project, we try to improve a neural-network-based reinforcement learning algorithm for landing a rocket in a simulated environment. Our aim is to reduce the number of SpaceX landing failures. To achieve this goal, we studied many algorithms and chose the Proximal Policy Optimization (PPO) algorithm to attack this problem. PPO learns the policy (i.e., the function that maps a state to an action) and the value of a state using an artificial neural network. We tried modifying the neural network to achieve better results, and we also tried modifying PPO to use a recurrent neural network (RNN) so that it takes the previous steps of each episode into account, which is a new idea.
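For readers unfamiliar with PPO, the sketch below computes its standard clipped surrogate objective for a batch of actions. It illustrates the general algorithm only, not the modified version used in this project; the function name `ppo_clipped_objective` and the default `clip_eps=0.2` (a commonly used value) are assumptions.

```python
def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    ratios[i]     = pi_new(a_i | s_i) / pi_old(a_i | s_i)
    advantages[i] = advantage estimate for that action
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal length")
    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        # Keep the probability ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(ratios)


# A good action whose probability rose too much gets its gain capped.
capped_gain = ppo_clipped_objective([1.5], [1.0])
# A bad action whose probability dropped a lot gets no extra credit.
capped_loss = ppo_clipped_objective([0.5], [-1.0])
# An unchanged policy leaves the advantage as is.
unchanged = ppo_clipped_objective([1.0], [2.0])
```

The clipping removes the incentive to move the new policy far from the old one in a single update, which tends to make training more stable.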
You can see the steps of the training of our best model and how it performs in different stages of training in the following link: https://youtu.be/UaHEVTesnkk
Developed an algorithm in MATLAB, based on the belief propagation algorithm, for the complex and novel problem of inferring missing SNPs in a genome. The results of this project have been submitted to IEEE/ACM Transactions on Computational Biology and Bioinformatics. (Link to paper)
Automatic Music Composer
Environment: C#, SQL Server.
Keywords: Machine Learning, Hidden Markov Models
Developed a C# program that composes music intended to evoke a given feeling in listeners, such as happiness or sadness, using a selected combination of instruments. The program first retrieves pieces from a database we gathered, according to the user's choice of feeling and instruments. It then uses a modified Hidden Markov Model algorithm to learn the structure of these pieces and compose new music accordingly.
This project is a very simple simulation of a world consisting of creatures and randomly distributed food. Every creature has its own mind, a simple neural network with a random number of neurons, and can eat food or attempt to eat other creatures. The creatures learn to go after food, and the simulation shows how they interact with each other and their environment.
Iman Deznabi
iman at cs.umass.edu
Computer Science Department
University of Massachusetts Amherst
Amherst, MA 01003, US
Nov 23, 2024: I will present our work "Towards Resolution-Aware Retrieval Augmented Zero-Shot Forecasting" as a spotlight presentation at the Time Series in the Age of Large Models workshop at NeurIPS 2024.
June 13, 2024: Defended my thesis proposal titled "Adaptive Deep Learning Models for Personalized Modeling of Heterogeneous Time-series Data".