Browsing by keyword "Random Forest"
- Conference paper: Comparison of Classifiers for Eye-Tracking Data (INFORMATIK 2024, 2024). Landes, Jennifer; Köppl, Sonja; Klettke, Meike. This paper delves into the initial stages of data analysis, focusing on the classification of eye-tracking data. Six machine learning algorithms, namely XGBoost, Random Forest, Naive Bayes, Logistic Regression, Gradient Boosting Machines, and Neural Networks, were employed to predict cheating behavior based on a dataset comprising records from 25 students. Their performance was evaluated using metrics such as accuracy, precision, recall, F1 score, confusion matrix, and feature importance. Results indicate that Random Forest and its optimized version exhibit balanced performance, making them promising candidates for cheating prediction. The overarching research project investigates academic misconduct in the realm of online assessments, seeking to comprehend the behaviors and methodologies involved. An eye-tracking experiment was conducted to gain deeper insights into the timing and mannerisms of students engaging in academic misconduct.
- Text document: Machine Learning Applied to the Clerical Task Management Problem in Master Data Management Systems (BTW 2019, 2019). Oberhofer, Martin; Bremer, Lars; Chkalova, Mariya. Clerical tasks are created when a duplicate detection algorithm detects some similarity between records, but not enough to allow an auto-merge operation. Data stewards review clerical tasks and make a final match or non-match decision. In this paper we evaluate different machine learning algorithms for their accuracy in predicting the correct action for a clerical task, and execute that action automatically if the prediction has sufficient confidence. This approach reduces the amount of work for data stewards by orders of magnitude.
- Conference paper: Machine Learning on the estimation of Leaf Area Index (42. GIL-Jahrestagung, Künstliche Intelligenz in der Agrar- und Ernährungswirtschaft, 2022). Afrasiabian, Yasamin; Mokhtari, Ali; Yu, Kang. The Leaf Area Index (LAI) is an important indicator in agriculture that can be considered a reliable plant growth parameter. The objective of this study is to use two machine learning algorithms, Support Vector Machine (SVM) and Random Forest (RF), to improve the estimation of LAI from multispectral, thermal, and hyperspectral data. The results showed that RF was the best model for improving the accuracy of the LAI estimation compared to simple linear regression (previous study) and SVM (R² = 0.91 for RF and R² = 0.87 for SVM). To evaluate the effects of spectral portions on LAI estimation without calculating spectral indices (SIs), each pair of spectral bands was used as input for training and testing both RF and SVM. The best correlation was found to be lower than when using SIs. However, R² variations were more homogeneous across the whole spectrum, which suggests that a good correlation can be achieved in RF and SVM even with broad multispectral bands.
- Text document: ML-basierte Klassifizierung von E-Mails für die datenschutzkonforme Löschung und Archivierung (INFORMATIK 2022, 2022). Kunz, Thomas; Waldmann, Ulrich. E-mails usually contain personal data that is subject to deletion requirements under data protection law. Implementing these requirements appropriately, however, poses a major challenge for the responsible companies, especially since, once the purpose of processing has been fulfilled, various retention obligations (including those under sector-specific law) often stand in the way of immediate deletion. To comply with deletion and retention obligations, it is first necessary to identify the e-mails that are subject to them (e.g. invoices). This paper investigates to what extent e-mails can be classified using machine learning (ML). For e-mails classified in this way, it can then be decided in the next step whether they must be deleted in accordance with the requirements of the General Data Protection Regulation (GDPR; German: DSGVO) or retained and archived for longer in accordance with statutory retention periods. The paper also describes the development of a proof of concept in the form of an add-on for Microsoft Outlook that allows users to classify the e-mails in their mailboxes.
- Text document: Using Machine Learning to Predict POI Occupancy to Reduce Overcrowding (INFORMATIK 2022, 2022). Bollenbach, Jessica; Neubig, Stefan; Hein, Andreas; Keller, Robert; Krcmar, Helmut. Due to the rapid growth of the tourism industry, associated effects like overcrowding, overtourism, and increasing greenhouse gas emissions lead to unsustainable development. A prerequisite for avoiding those adverse effects is the prediction of occupancy. The present study examines the applicability and performance of various prediction models using a case study of beach occupancy data in Scharbeutz, Germany. The case study compares different machine learning models, once as supervised machine learning models and once as time series models, with a persistence model. XGBoost and Random Forest as time series models deliver the most accurate predictions, followed by the supervised XGBoost model. However, the short prediction span of time series models is a disadvantage for longer-term visitor management that aims to avoid these unsustainable effects through steering measures, so depending on the use case, the supervised XGBoost model is to be favoured.
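
The last entry above uses a persistence model as the reference point for its learned models. As a rough illustration of what such a baseline usually means (each forecast simply repeats the previous observation), here is a minimal sketch. The occupancy numbers, the function names, and the choice of mean absolute error as the metric are assumptions made for this example; they are not taken from the paper.

```python
from statistics import mean


def persistence_forecast(series):
    """Naive baseline: the forecast for step t is the observation at step t-1."""
    return series[:-1]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observations and forecasts."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical occupancy counts, invented purely for illustration.
occupancy = [120, 135, 150, 160, 155, 140]

predicted = persistence_forecast(occupancy)  # forecasts for steps 1..n-1
actual = occupancy[1:]                       # observations for steps 1..n-1
baseline_mae = mean_absolute_error(actual, predicted)
```

A learned model is generally only worth deploying if its error on the same steps is clearly below `baseline_mae`.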