Student Success Prediction and the Trade-Off between Big Data and Data Minimization
DeLFI 2018 - Die 16. E-Learning Fachtagung Informatik
Gesellschaft für Informatik e.V.
This paper explores students' daily activity in a virtual learning environment, using the anonymized Open University Learning Analytics Dataset (OULAD). We show that students' daily activity can be used to predict their success, i.e., whether they pass or fail a course, with high accuracy. This is important because daily activity data can be obtained and anonymized easily. To support this, we show that binary information on whether a student was active on a given day has predictive power similar to that of a combination of the exact number of clicks on that day and sensitive private data such as gender, disability, and highest educational level. We further show that the anonymized activity data can be used to group students. We identify different student types based on their binarized daily activity and outline how educators and system developers can use this to address different learning types. Our primary stakeholders are designers and developers of learning analytics systems as well as those who commission such systems. We discuss the privacy and design implications of our findings for data mining in educational contexts against the background of the principle of data minimization and the General Data Protection Regulation (GDPR) of the European Union.
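The binarization described in the abstract can be illustrated with a minimal sketch. It assumes that click logs are available as (student, day, click count) rows; the function name, the row layout, and the toy values are hypothetical and are not taken from the paper or from OULAD itself.

```python
from collections import defaultdict


def binarize_daily_activity(click_rows, first_day, last_day):
    """Map each student to a 0/1 vector: 1 if any click was logged on that day.

    click_rows is an iterable of (student_id, day, click_count) tuples.
    This layout is an assumption made for illustration only.
    """
    n_days = last_day - first_day + 1
    activity = defaultdict(lambda: [0] * n_days)
    for student_id, day, clicks in click_rows:
        # Keep only the fact that the student was active, not how many clicks.
        if first_day <= day <= last_day and clicks > 0:
            activity[student_id][day - first_day] = 1
    return dict(activity)


# Toy rows with invented values, used only to show the transformation.
toy_rows = [
    ("s1", 0, 12), ("s1", 1, 3), ("s1", 3, 7),
    ("s2", 2, 1), ("s2", 2, 40),
]
binary = binarize_daily_activity(toy_rows, first_day=0, last_day=3)
print(binary)
```

The resulting vectors discard exact click counts and need no demographic attributes, which is the data-minimizing representation the abstract contrasts with richer feature sets. They could then serve as input to a classifier or a clustering method of one's choice.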