Title: Data Leakage Through Click Data in Virtual Learning Environments
Authors: Hartmann, Johanna; Heuer, Hendrik; Breiter, Andreas
Editors: Henning, Peter A.; Striewe, Michael; Wölfel, Matthias
Date: 2022-08-23
Year: 2022
Type: Text/Conference Paper
Language: en
ISBN: 978-3-88579-716-6
ISSN: 1617-5468
DOI: 10.18420/delfi2022-025
URI: https://dl.gi.de/handle/20.500.12116/38825
Keywords: learning analytics; machine learning; algorithmic bias; clustering; student performance

Abstract: Unsupervised machine learning techniques are increasingly used to cluster students based on their activity in virtual learning environments. It is commonly assumed that clusters formed by click data merely represent the actions of users and do not allow inferring personal information about individual users. Based on an analysis of 18,660 students and 5.56 million data points from the Open University Learning Analytics Dataset, we show that clusters trained on "raw" click data are highly correlated with personal information like student success, course specifics, and student demographics. Our analysis demonstrates that these clusters allow conclusions about demographic variables like the previous education and the affluence of the residential area. Our investigation shows that apparently objective click data can leak private attributes. The paper discusses the implications of this for the design of virtual learning environments, especially considering the legal requirements posed by the principle of data minimization of the EU GDPR.