Modeling Human Behavior using Machine Learning Algorithms

Asma Ahmad Farhan, University of Connecticut

Document Type: Dissertation


This dissertation discusses three dimensions of human behavior prediction. The first part addresses emergency detection using WiFi access point (AP) data. Despite much progress in emergency management, effective techniques for real-time tracking of emergency events are still lacking. We envision widely adopted smartphones as a promising route to real-time emergency tracking. Specifically, we explore the first step toward this goal, i.e., locating emergencies using smartphones. Our main contribution is a novel approach that locates emergencies by analyzing smartphone association events with APs in a WiFi network. This is motivated by the observation that human behavior and mobility patterns are significantly altered in the face of an emergency, which is reflected in how people's smartphones associate with the APs in the WiFi network. Preliminary evaluation using real data collected from a university campus network demonstrates the effectiveness of our approach.
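To make the idea of reading emergencies from AP association events concrete, the following is a minimal sketch, not the method developed in the dissertation. It assumes that per-AP association counts are available for comparable past time windows and flags APs whose current count deviates strongly from that baseline using a simple z-score. The function name `flag_anomalous_aps`, the AP names, the counts, and the threshold of 3 are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalous_aps(history, current, z_threshold=3.0):
    """Return the APs whose current association count is unusual.

    history: dict mapping AP name -> list of counts from past windows
             (assumed to be comparable windows, e.g. same hour of day).
    current: dict mapping AP name -> count in the window being checked.
    The z-score rule is an illustrative assumption, not the source's method.
    """
    flagged = []
    for ap, counts in history.items():
        if len(counts) < 2:
            continue  # not enough history to estimate spread
        mu = mean(counts)
        sigma = stdev(counts)
        observed = current.get(ap, 0)
        if sigma == 0:
            # Perfectly constant history: any change counts as a deviation.
            if observed != mu:
                flagged.append(ap)
            continue
        if abs(observed - mu) / sigma >= z_threshold:
            flagged.append(ap)
    return sorted(flagged)


# Made-up example data: the library AP suddenly loses most of its devices.
history = {
    "ap-library": [40, 42, 38, 41, 39],
    "ap-gym": [20, 22, 19, 21, 20],
}
current = {"ap-library": 5, "ap-gym": 21}
suspicious = flag_anomalous_aps(history, current)
```

A sharp drop in associations at one AP together with a rise at neighboring APs would be consistent with people leaving an area, which is the kind of mobility change the paragraph above describes.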

The second part of the dissertation focuses on human thermal comfort prediction using various environmental and physiological data. Human thermal sensation is an important factor in creating a “comfortable” work or living environment. In this vein, identifying factors that directly influence the thermal sensation of individuals, especially senior citizens, and using them to predict human thermal comfort in real time will have enormous societal benefit. In the second chapter, we develop a novel data-driven approach that identifies the relevant environmental and physiological features and trains a classifier that takes these features as input and outputs a corresponding thermal sensation class (i.e., “feeling cold”, “neutral”, or “feeling warm”). Evaluation using a large-scale publicly available dataset demonstrates that, with a thermal sensation threshold of α = 1, the accuracy of our approach is 82–86% when using Support Vector Machine (SVM) and random forest classifiers, which is roughly 2.4 times that of the widely adopted Fanger’s model (which achieves an accuracy of only approximately 36%). In addition, our study indicates that three factors that are not included in Fanger’s model (a person’s age, outdoor temperature, and outdoor humidity) play an important role in thermal comfort, a finding that is interesting in its own right.
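The role of the threshold α can be illustrated with a short sketch. It assumes that thermal sensation is recorded as a numeric vote on a symmetric scale such as the 7-point scale from -3 (cold) to +3 (hot), and that votes within [-α, α] count as “neutral”. Both the scale and the handling of the boundary values are assumptions made here for illustration; the abstract does not spell them out. The function names are hypothetical.

```python
def sensation_class(vote, alpha=1.0):
    """Map a numeric thermal sensation vote to one of three classes.

    Assumption: votes below -alpha are "feeling cold", votes above
    +alpha are "feeling warm", and everything in between (inclusive)
    is "neutral".
    """
    if vote < -alpha:
        return "feeling cold"
    if vote > alpha:
        return "feeling warm"
    return "neutral"


def accuracy(predicted, actual):
    """Fraction of positions where the predicted class matches the label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Made-up votes, only to show the mapping.
votes = [-2.0, -1.0, 0.5, 1.0, 1.5, 3.0]
labels = [sensation_class(v) for v in votes]

# Ratio behind the "roughly 2.4 times" comparison in the text:
# upper end of the reported range divided by the approximate baseline.
ratio = 0.86 / 0.36
```

Under these assumptions a classifier is scored by comparing its predicted classes with the classes derived from the recorded votes, so a larger α widens the “neutral” band and changes the class balance.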

The third part of the dissertation develops machine learning models for depression detection using smartphone sensing data. Depression is a serious health disorder. In the third chapter, we investigate the feasibility of depression screening using sensor data collected from smartphones. We extract various behavioral features from smartphone sensing data and investigate the efficacy of various machine learning tools in predicting clinical diagnoses and PHQ-9 scores (a quantitative tool for aiding depression screening in practice). Our study is the first to use a dataset that includes clinical ground truth and the first to use the two dominant smartphone platforms (iPhone and Android) on a large scale. We find that behavioral data from smartphones can predict clinical depression with good accuracy. In addition, combining behavioral data and PHQ-9 scores can provide prediction accuracy exceeding that of either in isolation, indicating that behavioral data captures relevant features that are not reflected in PHQ-9 scores. We further develop a multi-feature regression model for PHQ-9 scores that achieves improved accuracy compared to a model using a single feature.
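The comparison between a single-feature and a multi-feature regression model can be sketched as follows. This is a minimal ordinary least squares illustration in pure Python, not the model from the dissertation: the two features, the target values, and every function name are hypothetical, and the numbers are synthetic rather than taken from any study.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least-squares fit with intercept; returns [intercept, coef1, ...]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def mean_squared_error(rows, targets, coef):
    """Average squared difference between fitted and target values."""
    total = 0.0
    for r, t in zip(rows, targets):
        pred = coef[0] + sum(c * v for c, v in zip(coef[1:], r))
        total += (pred - t) ** 2
    return total / len(targets)


# Synthetic data: two hypothetical behavioral features per participant
# and a score constructed as 2 + 3*f1 - f2 (no real measurements).
features = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
scores = [2 + 3 * f1 - f2 for f1, f2 in features]

single = [(f1,) for f1, _ in features]
coef_single = fit_ols(single, scores)
coef_multi = fit_ols(features, scores)

mse_single = mean_squared_error(single, scores, coef_single)
mse_multi = mean_squared_error(features, scores, coef_multi)
```

On the training data a model with more features can never fit worse than a nested model with fewer, so a comparison like this only becomes meaningful when the error is measured on held-out participants, for example with cross-validation.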