An adaptive filter-framework for the quality improvement of open-source software analysis

dc.contributor.author: Hannemann, Anna
dc.contributor.author: Hackstein, Michael
dc.contributor.author: Klamma, Ralf
dc.contributor.author: Jarke, Matthias
dc.contributor.editor: Kowalewski, Stefan
dc.contributor.editor: Rumpe, Bernhard
dc.date.accessioned: 2018-10-31T12:45:22Z
dc.date.available: 2018-10-31T12:45:22Z
dc.date.issued: 2013
dc.description.abstract: Knowledge mining in Open-Source Software (OSS) brings great benefit to software engineering (SE). Researchers discover, investigate, and even simulate the organization of development processes within open-source communities in order to understand the community-oriented organization and to transfer its advantages to conventional SE projects. Despite a great number of different studies on OSS data, not much attention has been paid to the data filtering step so far. The noise within uncleaned data can lead to inaccurate conclusions for SE. The variety of communication and development infrastructures used by OSS projects presents a special challenge for data cleaning. This paper presents an adaptive filter-framework supporting data cleaning and other preprocessing steps. The framework allows filters to be combined in arbitrary order, defining which preprocessing steps should be performed. The filter portfolio can be extended easily. Schema matching is available for cross-project analysis. Three filters (spam detection, quotation elimination, and core-periphery distinction) were implemented within the filter-framework. In the analysis of three large-scale OSS projects (BioJava, Biopython, BioPerl), the filtering led to a significant data modification and reduction. The results of text mining (sentiment analysis) and social network analysis on uncleaned and cleaned data differ significantly, confirming the importance of the data preprocessing step within OSS empirical studies. (en)
dc.identifier.isbn: 978-3-88579-607-7
dc.identifier.pissn: 1617-5468
dc.identifier.uri: https://dl.gi.de/handle/20.500.12116/17701
dc.language.iso: en
dc.publisher: Gesellschaft für Informatik e.V.
dc.relation.ispartof: Software Engineering 2013
dc.relation.ispartofseries: Lecture Notes in Informatics (LNI) - Proceedings, Volume P-213
dc.title: An adaptive filter-framework for the quality improvement of open-source software analysis (en)
dc.type: Text/Conference Paper
gi.citation.endPage: 156
gi.citation.publisherPlace: Bonn
gi.citation.startPage: 143
gi.conference.date: February 26 - March 1, 2013
gi.conference.location: Aachen
gi.conference.sessiontitle: Regular Research Papers
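
The abstract describes filters that can be combined in arbitrary order. The following is a minimal sketch of that idea only, not the paper's implementation: the `Message` type, the function names, the ">" quoting convention, and the keyword-based spam rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Message:
    """Hypothetical mailing-list message; not a type from the paper."""
    author: str
    body: str


# A filter returns a (possibly modified) message, or None to drop it.
Filter = Callable[[Message], Optional[Message]]


def strip_quotations(msg: Message) -> Optional[Message]:
    """Remove quoted lines, assuming the common '>' quoting convention."""
    kept = [line for line in msg.body.splitlines()
            if not line.lstrip().startswith(">")]
    return Message(msg.author, "\n".join(kept))


def make_spam_filter(keywords: Iterable[str]) -> Filter:
    """Build a naive keyword-based spam filter (an illustrative stand-in)."""
    lowered = [k.lower() for k in keywords]

    def drop_spam(msg: Message) -> Optional[Message]:
        text = msg.body.lower()
        return None if any(k in text for k in lowered) else msg

    return drop_spam


def run_pipeline(messages: Iterable[Message],
                 filters: List[Filter]) -> List[Message]:
    """Apply the filters to each message in the order given."""
    cleaned = []
    for msg in messages:
        current: Optional[Message] = msg
        for f in filters:
            current = f(current)
            if current is None:  # a filter dropped the message
                break
        if current is not None:
            cleaned.append(current)
    return cleaned


mails = [
    Message("alice", "> old text\nThanks, the patch works."),
    Message("bot", "Buy cheap pills now"),
]
spam_filter = make_spam_filter(["cheap pills"])
cleaned = run_pipeline(mails, [strip_quotations, spam_filter])
```

In this sketch the order of the filter list matters: a message that only quotes spam is kept when quotations are stripped first, but dropped when the spam filter runs first.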

Files

Original bundle
Name: 143.pdf
Size: 130.99 KB
Format: Adobe Portable Document Format