Recovering information from pixelized credentials

Authors: Garske, Viktor; Noack, Andreas
Editors: Christian Wressnegger; Delphine Reinhardt
Published: 2022
Record date: 2023-01-24
ISBN: 978-3-88579-717-3
ISSN: 1617-5468
DOI: 10.18420/sicherheit2022_08
URL: https://dl.gi.de/handle/20.500.12116/40150
Language: en
Keywords: Neural networks; privacy; machine learning; computer vision; password; credentials; pixelized

Abstract: Pixelation is a common technique to redact sensitive information like credentials in images. In this paper, we propose a system that is able to recover information from pixelized text. Our contribution consists of a neural network as well as a generic pipeline that generates a realistic training dataset considering flexible specifications including wordlists, fonts, font sizes and letter spacings. The contributed neural network is a composition of a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN) using Long short-term memory (LSTM) and a Connectionist Temporal Classification (CTC) layer to decode sequences of characters. With our approach, we achieve a Label Error Rate (LER) under 50% when taking pixelation block sizes of up to 8 × 8 pixels on a 22pt font into account. Thereby, our results indicate that pixelation of sensitive data does not satisfy common privacy standards.
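
The two quantities the abstract relies on, the pixelation block size and the Label Error Rate, can be illustrated with a minimal sketch. This is not the authors' pipeline or network. It assumes that pixelation means replacing each block of a grayscale image with its mean value, and that LER is the total edit distance between predicted and true strings divided by the total length of the true strings (a common definition, which the abstract itself does not spell out). The function names pixelate, edit_distance and label_error_rate are hypothetical.

```python
import numpy as np


def pixelate(image: np.ndarray, block: int = 8) -> np.ndarray:
    """Replace each block x block tile of a 2-D grayscale image with its mean.

    Assumption: mean-based block pixelation; real redaction tools may differ.
    """
    if image.ndim != 2:
        raise ValueError("expected a 2-D grayscale image")
    if block < 1:
        raise ValueError("block size must be positive")
    h, w = image.shape
    # Pad by edge replication so both dimensions are multiples of the block size.
    pad_h = (-h) % block
    pad_w = (-w) % block
    padded = np.pad(image.astype(np.float64), ((0, pad_h), (0, pad_w)), mode="edge")
    ph, pw = padded.shape
    # View the image as a grid of tiles and average within each tile.
    tiles = padded.reshape(ph // block, block, pw // block, block)
    means = tiles.mean(axis=(1, 3))
    # Expand every mean back to a full tile, then crop to the original size.
    out = np.repeat(np.repeat(means, block, axis=0), block, axis=1)
    return out[:h, :w]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions cost 1 each)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def label_error_rate(predictions: list[str], targets: list[str]) -> float:
    """Total edit distance divided by total target length (assumed LER definition)."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    total_len = sum(len(t) for t in targets)
    if total_len == 0:
        raise ValueError("targets must contain at least one character")
    total_dist = sum(edit_distance(p, t) for p, t in zip(predictions, targets))
    return total_dist / total_len
```

Under these assumptions, an LER under 50% means that fewer than half of the characters in the ground-truth strings would need to be inserted, deleted or substituted to correct the recovered text. Calling pixelate(img, block=8) on a rendered text image gives the kind of input the abstract describes for the 8 × 8 case.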