Title: KaggleGPT: Prompt-based Recommender System for Efficient Dataset Discovery
Contributors: Bhoyar, Rahul Rajkumar; Wang, Xia; Duong-Trung, Nghia; Kiesler, Natalie; Schulz, Sandra
Dates: 2024-10-21; 2024
URL: https://dl.gi.de/handle/20.500.12116/45046
DOI: 10.18420/delfi2024-ws-30
Type: Text/Conference Paper
Language: en
Keywords: Kaggle; Recommender System; Prompt-based Recommendation; Large Language Models; Dataset Discovery

Abstract: Searching for appropriate experimental datasets for machine learning projects and reducing the need for one-on-one student-teacher consultations are both challenging. Although over 50,000 different datasets are available across multiple domains on websites like Kaggle, practitioners often need help locating the datasets they require. Even with the aid of Kaggle’s API and web search functionalities, the search results are not organized in a way that is meaningful for a specific context. Recent developments in artificial intelligence (AI) and large language models (LLMs) provide new means of addressing these issues, which was not possible before. This paper introduces KaggleGPT, an LLM-assisted conversational recommender system designed to streamline finding suitable datasets for students’ projects directly from the textual content. The core of KaggleGPT employs a comprehensive approach by integrating profile-based, expert-based, knowledge-based, and multi-criteria-based recommendation engines. Our vision is for educators and students to use KaggleGPT to enhance the educational experience and make dataset discovery more efficient and user-friendly.