Textdokument
In-Database Machine Learning: Gradient Descent and Tensor Algebra for Main Memory Database Systems
Volltext URI
Dokumententyp
Dateien
Zusatzinformation
Datum
2019
Zeitschriftentitel
ISSN der Zeitschrift
Bandtitel
Quelle
Verlag
Gesellschaft für Informatik, Bonn
Zusammenfassung
Machine learning tasks such as regression, clustering, and classification are typically performed outside of database systems using dedicated tools, necessitating the extraction, transfor-mation, and loading of data. We argue that database systems when extended to enable automatic differentiation, gradient descent, and tensor algebra are capable of solving machine learning tasks more efficiently by eliminating the need for costly data communication. We demonstrate our claim by implementing tensor algebra and stochastic gradient descent using lambda expressions for loss functions as a pipelined operator in a main memory database system. Our approach enables common machine learning tasks to be performed faster than by extended disk-based database systems or as well as dedicated tools by eliminating the time needed for data extraction. This work aims to incorporate gradient descent and tensor data types into database systems, allowing them to handle a wider range of computational tasks.