Purdue researchers use AI to predict students’ locations and friends from Wi-Fi data
Location-based check-ins reveal a lot about a person, and about college students in particular, as it turns out. Researchers at Purdue University published a paper (“Exploring Student Check-In Behavior for Improved Point-of-Interest Prediction”) on the preprint server arXiv.org early last month describing how Wi-Fi access logs could be used to identify correlations between users, locations, and activities in an academic setting.
Predicting locations and friendships from location data with AI might sound a bit creepy, true. But on the plus side, it’s not as dystopian as AI that can predict personality traits from eye movements.
“In point-of-interest (POI) tasks, the goal is to use user behavioral data to model users’ activities at different locations and times, and then make predictions (or recommendations) for relevant venues based on their current context,” the researchers wrote. “In this work, we present the first analysis of a spatio-temporal educational ‘check-in’ dataset, with the aim of using POI predictions to personalize student recommendations … and to understand behavior patterns that increase student retention and satisfaction. The results also provide a better idea of how campus facilities are utilized and how students connect with each other.”
The team noted that in most previous POI research, datasets consisted of largely voluntary check-ins from social network apps like Foursquare or Yelp. As a result, they were “rich” in information about, say, restaurants and entertainment hotspots, but didn’t shed much light on “prosaic” activities like arriving at an office, leaving home, or running an errand. Additionally, because the users who contributed to them often visited venues only once, they could have biased conclusions and made it difficult to identify consistent patterns.
The researchers chose to tackle the problem with Wi-Fi, specifically Purdue University’s Wi-Fi. The advantage, they argued in the paper, was a “better temporal resolution” because of the sheer volume of per-user Wi-Fi access history data available. (Participating students in the study “checked in” whenever their device sent or received a packet wirelessly, contributing to a log file that eventually reached 376 GB in size.) After pairing that data with venue information about locations, the paper’s authors were able to analyze the movements of all Purdue freshmen throughout the 2016-2017 academic year.
Each entry in the dataset contained four items: the user, the point of interest, the point of interest’s functionality (e.g., residence or recreation), and the time span (the amount of time spent in a given location). After preprocessing, which involved removing users with fewer than 100 check-ins among other steps, the sample contained 540 million logs.
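For readers who want to picture that data, here is a minimal Python sketch of a four-field check-in record and the fewer-than-100-check-ins filter. The `CheckIn` class, its field names, and the `drop_sparse_users` helper are hypothetical names chosen for illustration; the paper's actual schema and preprocessing code are not described in this article.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckIn:
    """One log entry with the four items described above (field names are assumed)."""
    user: str        # anonymized student identifier
    poi: str         # point of interest, e.g. a campus building
    function: str    # POI functionality, e.g. "residence" or "recreation"
    minutes: float   # time span spent at the location


def drop_sparse_users(records, min_checkins=100):
    """Keep only records from users who have at least `min_checkins` check-ins."""
    counts = Counter(r.user for r in records)
    return [r for r in records if counts[r.user] >= min_checkins]
```

Calling `drop_sparse_users(records)` on a list of `CheckIn` objects returns a new list and leaves the input untouched; the threshold of 100 mirrors the cutoff reported above.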
It revealed a few interesting trends. For example, on weekdays, students visited the dining halls around 12 p.m. and 6 p.m., and went to the gym around 8 p.m. Predictably, freshmen explored the campus pretty quickly (within the first 2-3 weeks) and then stuck to a fixed, familiar range of buildings over the remainder of the semester. And preferences varied by major. Computer science students and pharmacy students dined at the same time, but the latter group attended class more often between 11 a.m. and 12 p.m. CS students hit the books from morning to afternoon and spent more time in academic buildings, while pharmacy students hightailed it to the weight room at later times.
After additional processing and indexing, the researchers trained an array of machine learning models on the first 80 percent of check-in records in chronological order, reserving the remaining 20 percent for testing. Their proposed AI system, called embedding for dense heterogeneous graphs (EDHG), managed to accurately predict the top three locations a student had visited with 85 and 31 percent accuracy, respectively, and the top ten with 90 percent and 71 percent accuracy.
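The evaluation setup described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's code: records are assumed to be `(timestamp, poi)` pairs, and top-k accuracy is taken in its common sense (the share of test check-ins whose true location appears among the model's k highest-ranked guesses). The function names are hypothetical.

```python
def chronological_split(records, train_frac=0.8):
    """Sort (timestamp, poi) pairs by time; the earliest `train_frac` go to training."""
    ordered = sorted(records, key=lambda r: r[0])
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]


def top_k_accuracy(ranked_predictions, actual, k):
    """Fraction of test cases whose true POI is among the top-k ranked predictions."""
    if not actual:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, actual)
        if truth in ranked[:k]
    )
    return hits / len(actual)
```

Splitting by time rather than at random keeps the model from peeking at a student's future movements, which matters when the goal is to predict where someone goes next.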
Next, the authors of the paper set it loose on “covisitation events” — when two students are in the same place at the same time. They theorized that it could indicate relations — i.e. friendships — among people.
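To make the idea of a covisitation event concrete, here is a toy Python sketch that counts how often two users' stays at the same location overlap in time. The `(user, poi, start, end)` tuple layout and the `covisitation_counts` function are assumptions made for illustration; the article does not say how the paper detects or weights these events.

```python
from collections import defaultdict
from itertools import combinations


def covisitation_counts(visits):
    """Count overlapping stays per user pair.

    visits: iterable of (user, poi, start, end) tuples with start < end.
    Returns {(user_a, user_b): count}, with each pair in sorted order.
    """
    by_poi = defaultdict(list)
    for user, poi, start, end in visits:
        by_poi[poi].append((user, start, end))

    counts = defaultdict(int)
    for stays in by_poi.values():
        for (u1, s1, e1), (u2, s2, e2) in combinations(stays, 2):
            # Two intervals overlap when each one starts before the other ends.
            if u1 != u2 and s1 < e2 and s2 < e1:
                counts[tuple(sorted((u1, u2)))] += 1
    return dict(counts)
```

Comparing every pair of stays at a location is quadratic, so this is only workable on small samples; a dataset with hundreds of millions of logs would need time-bucketing or a sweep over sorted intervals.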
EDHG did well in this regard, suggesting a list of 10 potential friends for each user that outperformed state-of-the-art baseline methods. The researchers noted, however, that recommendations for less active users (i.e., users with fewer check-ins) tended to be less accurate.
They left incorporating the covisitation data into the AI model to future work, which they hope will show whether social interactions affect student check-in behavior.
“These initial results indicate the promise of using student trajectory information for personalized recommendations in education apps,” they wrote, “as well as in predictive models of student retention and satisfaction.”
Let’s hope future use cases are as innocuous as the researchers predict.