Listing P252 - Software Engineering 2016, by author "Apel, Sven"
- Konferenzbeitrag: How reviewers think about internal and external validity in empirical software engineering (Software Engineering 2016, 2016). Siegmund, Janet; Siegmund, Norbert; Apel, Sven

Empirical methods have grown common in software engineering, but there is no consensus on how to apply them properly. Is practical relevance key? Do internally valid studies have any value? Should we replicate more to address the trade-off between internal and external validity? We asked the key players of software-engineering research, but they do not agree on answers to these questions. The original paper was published at the International Conference on Software Engineering (ICSE) 2015 [SSA15].

Empirical research in software engineering has come a long way. Once perceived as a niche science, it is now widely recognized as important. In 2005, empirical studies were found in about 2% of the papers at major venues, while in recent years almost all papers at ICSE, ESEC/FSE, and EMSE reported some kind of empirical evaluation, as we found in a literature review. Thus, the number of empirically investigated claims has increased considerably.

With the rising awareness and use of empirical studies, the question of where to go with empirical software-engineering research is also emerging. New programming languages, techniques, and paradigms, new tool support to improve debugging and testing, and new visualizations to present information emerge almost daily, and claims regarding their merits need to be evaluated; otherwise, they remain claims. But how should new approaches be evaluated? Do we want observations that we can fully explain, but with limited generalizability, or do we want results that are applicable to a variety of circumstances, but where we cannot reliably explain the underlying factors and relationships? In other words, should researchers focus on internal validity and control every aspect of the experiment setting, so that differences in the outcome can only be caused by the newly introduced technique? Or should they focus on external validity and observe their technique in the wild, showing a real-world effect, but without knowing which factors actually caused the observed difference?

This trade-off between internal and external validity is inherent in empirical research. Because the two options pursue different objectives, we cannot choose both. Deciding for one of them is not easy, and existing guidelines are too general to assist in making this decision. With our work, we want to raise awareness of this problem: How should we address the trade-off between internal and external validity? In the end, every time we plan an experiment, we must ask ourselves: Do we ask the right questions? Do we want pure basic research, or applied research with immediate practical relevance? Is there even a way to design studies such that we can answer both kinds of questions at the same time, or is there no way around replications (i.e., exactly repeated studies or studies that deviate from the original study design only in a few, well-selected factors) in software-engineering research?

To understand how the key players of software-engineering research would address this problem, we conducted a survey among the program-committee members of the major software-engineering venues of recent years [SSA15]. In essence, we found that there is no agreement and that the opinions of the key players differ considerably (illustrated in Fig. 1).
Even worse, we also found a profound lack of awareness of the trade-off between internal and external validity: one reviewer would reject a paper that maximizes internal validity because it "[w]ould show no value at all to SE community". When we asked about replication, many program-committee members admitted that we need more replication in software-engineering research, but also indicated that replications have a hard time being accepted. One reviewer even stated that replications are "a good example of hunting for publications just for the sake of publishing. Come on." If the key players cannot agree on how to address the trade-off between internal and external validity (or do not even see this trade-off), and admit that replication, a well-established technique in other disciplines, would have almost no success in software-engineering research, how should we move forward? In the original paper, we shed light on this question, give insights into the participants' responses, and make suggestions on how we can address the trade-off between internal and external validity.

References

[SSA15] Janet Siegmund, Norbert Siegmund, and Sven Apel. Views on Internal and External Validity in Empirical Software Engineering. In: Proc. Int'l Conf. Software Engineering (ICSE), pp. 9-19. IEEE CS, 2015.
- Konferenzbeitrag: Morpheus: variability-aware refactoring in the wild (Software Engineering 2016, 2016). Liebig, Jörg; Apel, Sven; Janker, Andreas; Garbe, Florian; Oster, Sebastian

Today, many software systems are configurable with conditional compilation. Just like any software system, configurable systems need to be refactored during their evolution. The inherent variability of configurable systems induces an additional dimension of complexity that is not addressed properly by current academic and industrial refactoring engines. Even simple refactorings, such as RENAME IDENTIFIER, are not handled well by existing refactoring engines and may introduce errors in some variants of the configurable system to be refactored (see the sketch after this entry). To improve the state of the art, we propose a variability-aware refactoring approach that relies on a canonical variability representation and variability-aware analysis. The goal is to preserve the behavior of all variants of the configurable system, without compromising general applicability and scalability. To demonstrate practicality, we developed MORPHEUS, a sound variability-aware refactoring engine for C code with preprocessor directives. We applied MORPHEUS to three substantial real-world systems (Busybox, OpenSSL, and SQLite), showing that variability-aware refactoring is practical (i.e., scalable, sound, and complete) in the presence of conditional compilation.
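To illustrate the kind of variant error a configuration-unaware rename can introduce, consider the following minimal, hypothetical C fragment (it is not taken from the paper): a refactoring engine that preprocesses the code for a single configuration sees only one of the two occurrences of `count`.

```c
#include <stdio.h>

int count = 0;   /* RENAME IDENTIFIER: count -> num_events */

void log_event(void) {
#ifdef STATS
    count++;     /* invisible to an engine that analyzes the code with
                  * STATS undefined; renaming only the declaration above
                  * leaves the STATS variant uncompilable */
#endif
    printf("event\n");
}
```

A variability-aware engine such as MORPHEUS, by contrast, analyzes all configurations at once and renames both occurrences, preserving the behavior of every variant.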
- Konferenzbeitrag: On facilitating reuse in multi-goal test-suite generation for software product lines (Software Engineering 2016, 2016). Lochau, Malte; Bürdek, Johannes; Bauregger, Stefan; Holzer, Andreas; von Rhein, Alexander; Apel, Sven; Beyer, Dirk

Software testing is still the most established and scalable quality-assurance technique in practice today. However, generating effective test suites remains computationally expensive, requiring repeated reachability analyses for multiple test goals according to a coverage criterion. This situation is even worse when it comes to testing entire software product lines (SPLs). An SPL consists of a family of similar program variants, so testing an SPL requires sufficient coverage of all derivable program variants. Instead of considering every product variant one by one, family-based approaches are variability-aware analysis techniques that systematically exploit similarities among the different variants. Based on this principle, we propose a novel approach for automated product-line test-suite generation that incorporates extensive reuse of reachability information among test cases derived for different test goals and/or program variants. The tool implementation is built on top of CPA/tiger, which is based on CPAchecker. We further present experimental evaluation results, revealing a considerable increase in efficiency compared to existing test-case generation techniques.

Software product line (SPL) engineering aims at developing families of similar, yet well-distinguished software products built upon a common core platform. The commonality and variability among the family members (product variants) of an SPL are specified as features. In this regard, a feature corresponds to user-configurable product characteristics within the problem domain, as well as to implementation artifacts that are automatically composable into implementation variants. The resulting extensive reuse of common feature artifacts among product variants improves development efficiency as well as product quality compared to one-by-one variant development. However, for SPLs to become fully accepted in practice, software-quality-assurance techniques have to become variability-aware, too, in order to benefit from SPL reuse principles.

In practice, systematic software testing constitutes the most elaborate and widespread assurance technique, being directly applicable to software systems at any level of abstraction. In addition, testing enables a controllable trade-off between effectiveness and efficiency. In particular, white-box test generation consists of (automatically) deriving input vectors for a program under test with respect to predefined test goals. The derivation of sufficiently large test suites is, therefore, guided by test-selection metrics, e.g., structural coverage criteria like basic-block coverage and condition coverage [Be13]. These criteria impose multiple test goals, thus requiring sets of test input vectors for their complete coverage [Be13]. In the case of mission- or safety-critical systems, it is imperative, or even enforced by industrial standards, to guarantee a particular degree of code coverage for every delivered product.

Technically, automated test-input generation requires expensive reachability analyses of the program state space. Symbolic model checking is a promising approach for fully automated white-box test generation, using counterexamples as test inputs [Be04]. Nevertheless, for large sets of complex test goals, scalability issues still obstruct efficient test-case generation when it is performed for every test goal separately. This problem becomes even worse when generating test inputs to cover entire product-line implementations. To avoid a variant-by-variant (re-)generation of test cases, which potentially leads to many redundant generation runs, an SPL test-suite generation approach must enhance existing techniques.

In [Bü15], we presented a novel technique for efficient white-box test-suite generation for multi-goal test coverage of product-line implementations. The approach systematically exploits reuse potential among reachability-analysis results by means of similarity among test cases (1) derived for different test goals [Be13], and/or (2) derived for different product variants [Ci11]. The combination of both techniques allows for an incremental, coverage-driven exploration of the state space of entire product lines under test, implemented in C enriched with feature parameters (a minimal sketch of such an encoding follows this entry). We implemented an SPL test-suite generator for arbitrary coverage criteria on top of the symbolic software model checker CPAchecker [Be13]. We evaluated our technique on sample SPL implementations of varying size. Our experiments revealed the applicability of the tool to real-world SPL implementations, as well as a remarkable gain in efficiency obtained from the reuse of reachability-analysis results, compared to test-suite generation approaches without systematic reuse. As future work, we plan to improve reuse capabilities by applying the multi-property model-checking techniques of CPAchecker, which allow for reachability analyses of multiple test goals in a single run.

Notes: This is a summary of a full article on this topic that appeared in Proc. FASE 2015 [Bü15]. This work was partially supported by the DFG (German Research Foundation) under the Priority Programme SPP 1593: Design For Future - Managed Software Evolution. Author affiliations: TU Darmstadt, Germany; TU Wien, Austria; University of Passau, Germany.

References

[Be04] Beyer, Dirk; Chlipala, Adam J.; Henzinger, Thomas A.; Jhala, Ranjit; Majumdar, Rupak: Generating Tests from Counterexamples. In: ICSE, pp. 326-335, 2004.
[Be13] Beyer, Dirk; Holzer, Andreas; Tautschnig, Michael; Veith, Helmut: Information Reuse for Multi-goal Reachability Analyses. In: ESOP, pp. 472-491. Springer, 2013.
[Bü15] Bürdek, Johannes; Lochau, Malte; Bauregger, Stefan; Holzer, Andreas; von Rhein, Alexander; Apel, Sven; Beyer, Dirk: Facilitating Reuse in Multi-Goal Test-Suite Generation for Software Product Lines. In: FASE. Springer, 2015.
[Ci11] Cichos, Harald; Oster, Sebastian; Lochau, Malte; Schürr, Andy: Model-based Coverage-Driven Test Suite Generation for Software Product Lines. In: MoDELS, pp. 425-439. Springer, 2011.
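To make the phrase "C enriched with feature parameters" concrete, here is a minimal, hypothetical sketch (the names and encoding are illustrative, not taken from [Bü15]): each feature becomes an ordinary program variable instead of a compile-time flag, so a single symbolic analysis can reason about test goals across all variants at once.

```c
#include <stdio.h>

/* Hypothetical variability encoding (illustrative only): the features
 * of the product line are modeled as unconstrained input variables
 * rather than as #ifdef flags. */
int FEATURE_ENCRYPT;   /* 0 or 1: selects a product variant */
int FEATURE_LOGGING;   /* 0 or 1: selects a product variant */

int process(int data) {
    int result = data;
    if (FEATURE_ENCRYPT)
        result ^= 0x5A;                  /* test goal g1: reach this line */
    if (FEATURE_LOGGING)
        printf("processed %d\n", data);  /* test goal g2: reach this line */
    return result;
}

int main(void) {
    /* One concrete test case: an input plus a feature assignment.
     * A symbolic model checker would derive such assignments from
     * counterexamples instead of fixing them by hand [Be04]. */
    FEATURE_ENCRYPT = 1;
    FEATURE_LOGGING = 0;
    printf("%d\n", process(7));  /* covers g1 in the ENCRYPT variant */
    return 0;
}
```

In such an encoding, a counterexample that reaches a goal yields both a test input and the feature assignment (i.e., the set of variants) under which the goal is covered, and reachability information computed while targeting g1 can be reused when targeting g2, which is the kind of reuse the paper exploits.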
- Konferenzbeitrag: Performance-Influence Models (Software Engineering 2016, 2016). Siegmund, Norbert; Grebhahn, Alexander; Apel, Sven; Kästner, Christian