Periodic Reporting for period 1 - STAMP (Software Testing AMPlification)

Summary

Leveraging advanced research in automatic test generation, STAMP aims at pushing automation in DevOps one step further through the novel concept of test amplification. The main objective of STAMP is to automatically transform existing test assets in order to detect regression bugs before production and drive down the cost of software testing.

We address this ambitious objective through increased test automation and research and development activities articulated around the objectives presented below:
- Objective 1. Provide an approach to automatically amplify unit test cases when a change is introduced in a program.
- Objective 2. Provide an approach to automatically generate, deploy and test large numbers of system configurations.
- Objective 3. Provide an approach to automatically amplify, optimize and analyze production logs in order to retrieve test cases that verify code changes against real world conditions.
- Objective 4. Develop test amplification microservices that can fit the DevOps pipelines.
- Objective 5. Validate the relevance and effectiveness of amplification on 5 use cases.
- Objective 6. Disseminate and exploit the open source STAMP test amplification services.

Work performed

Objective 1 has been addressed with the following activities:
- develop DSpot, a unit test amplification tool, to support sound scientific investigations and experiments with the STAMP use cases.
- develop Descartes to assess the quality of test suites. This is a new tool that aims to spot pseudo-tested methods automatically.
- develop a novel approach that fits in the continuous integration engine and can automatically feed information about flaky tests back into the issue tracker.

Objective 2 has been addressed with the following activities:
The first milestone consisted in refining the concept, in order to develop technology that fits the needs of the use cases and improves automatic testing. We decided to focus on technology that automatically explores the space of possible execution environments, deploys the application, and runs test cases in the different environments. The second part of the work consisted in designing a domain specific language to model the space of possible environments, a constraint model to automatically synthesize different environment configurations, and a Docker-based solution to automate the deployment and test runs. The whole solution is developed as part of the CAMP tool.
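
The idea of modelling an environment space and synthesizing valid configurations from it can be sketched as follows. This is a minimal illustration under stated assumptions, not the CAMP language or its constraint model: the dimensions, the values, the single constraint, and the function names are all hypothetical, and plain enumeration with a filter stands in for a real constraint solver.

```python
from itertools import product

# Hypothetical environment dimensions and their possible values.
DIMENSIONS = {
    "java": ["8", "11"],
    "database": ["mysql", "postgres"],
    "servlet": ["tomcat", "jetty"],
}


def is_valid(config):
    """Hypothetical constraint: jetty is only combined with Java 11."""
    return not (config["servlet"] == "jetty" and config["java"] == "8")


def enumerate_configs(dimensions, constraint):
    """Yield every combination of dimension values that satisfies the constraint."""
    names = list(dimensions)
    for values in product(*(dimensions[name] for name in names)):
        config = dict(zip(names, values))
        if constraint(config):
            yield config


def label(config):
    """Build a readable identifier for one configuration."""
    return "_".join(f"{name}-{value}" for name, value in config.items())


configs = list(enumerate_configs(DIMENSIONS, is_valid))
for config in configs:
    print(label(config))
```

Each configuration produced this way would then be turned into a deployable environment in which the existing test suite is run.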

Objective 3 has been addressed with the following activities:
The work here has been articulated around three main tasks: develop a sound, reusable benchmark of crash logs in order to experiment with STAMP tools and also serve the DevOps research community; develop a first version of the EvoCrash tool that uses a novel Guided Genetic Algorithm (GGA) to generate test cases that can reproduce crashes; explore additional sources of information to improve the effectiveness of EvoCrash.
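
To give an intuition of search-based crash reproduction, the toy sketch below evolves an input until a target exception is raised. It is not the EvoCrash algorithm: the function under test, the hand-written distance used as fitness, and all parameters are hypothetical, and the search is a plain elitist genetic loop over integers rather than a guided search over test cases.

```python
import random


def ratio(a, b):
    """Toy function under test: raises ZeroDivisionError when b == 42."""
    return a / (b - 42)


def fitness(b):
    """0 when the target crash is reproduced, else a distance to the crash.

    The distance is hand-written here; a real tool would derive such
    guidance from the program and the crash report.
    """
    try:
        ratio(1, b)
    except ZeroDivisionError:
        return 0
    return abs(b - 42)


def search_crash_input(seed=0, pop_size=20, generations=500):
    """Evolve integers until one of them triggers the target crash."""
    rng = random.Random(seed)  # seeded for repeatable runs
    population = [rng.randint(-100, 100) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        if fitness(population[0]) == 0:
            return population[0]
        # Keep the better half, then refill with mutated copies of survivors.
        survivors = population[: pop_size // 2]
        children = [rng.choice(survivors) + rng.randint(-5, 5)
                    for _ in survivors]
        population = survivors + children
    population.sort(key=fitness)
    return population[0] if fitness(population[0]) == 0 else None


print(search_crash_input())
```

The returned value, when one is found, is an input that reproduces the crash and can be wrapped in a regression test.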

Objective 4 has been addressed with the following activities:
- facilitate the application of the STAMP technology on use cases, and the integration of our tools in DevOps standard software factories.
- focus on software development and documentation of STAMP technologies.

Objective 5 has been addressed with the following activities:
- A first version of the KPIs was defined, from KPI-01 to KPI-09, with a strategy and tools to measure them. A second pass was then done on the KPI definitions.
- Validation roadmapping. We have defined the concepts and materials required to conduct a formal industrial validation of the STAMP results. We have also designed the evaluation roadmap, which schedules activities and events across the project lifecycle, in alignment with the STAMP tool releases.
- Use case providers have set up experiments with the initial versions of the STAMP tools on their use cases.

Objective 6 has been addressed with the following activities:
- Set up a robust open source collaboration infrastructure
- Establish the project's visual identity
- Set up and maintain the public website and the private wiki
- Set up and implement the dissemination strategy
- Prepare an initial exploitation plan.
- Design and conduct a survey to understand the potential business adoption of STAMP outcomes.

Final results

Results on unit test amplification show
- fully automated amplification is feasible with real-world, open source software packages
- automatically amplified unit tests have been integrated in the test suites of 4 open source projects
These results are under revision for the Journal on Empirical Software Engineering (EMSE).
A survey on test amplification has been submitted to the Journal of Systems and Software.

Results on test quality assessment
- the first ever analysis of thousands of pseudo-tested methods measures how pseudo-tested methods differ from the other covered methods
- a qualitative manual analysis of 525 pseudo-tested methods, involving developers, reveals that fewer than 40% of these methods are clearly worthy of additional testing effort.
These results have been submitted to the Journal on Empirical Software Engineering (EMSE), and a tool demo has been accepted at ASE.
XWiki has embedded Descartes in its continuous integration engine for the past 4 months.


Results on configuration test amplification
- demonstrate the feasibility, with the ATOS CityGo and XWiki use cases, of a domain specific language to specify the different dimensions of environment configurations
- automatically analyze these descriptions and generate variations of existing configurations
- automatically generate Docker images that implement these new configurations and run the integration test suites on these new configurations.


Runtime test amplification
- an early version of EvoCrash showed that it can replicate 41 (82%) of real-world crashes, 34 (89%) of which are useful reproductions for debugging purposes
- a benchmark of 200 crash cases
- refine the guidance of the algorithm using multi-objectivization
- add seeding strategies for EvoCrash

Integration and industrialization
- Initial versions of Maven and Gradle plugins for STAMP tools, to support the integration of the tools in continuous integration pipelines.
- Visual reporting support
- Eclipse plugins for DSpot and Descartes
- GitHub service that runs Descartes incrementally on each pull request

Validation
- Refined definitions of KPIs to improve the alignment between the use case needs, the features of the STAMP tools and the availability of sound and automated metrics.
- Definition of the evaluation roadmap and framework.
- Online survey and physical workshop with software developers who are outside the STAMP consortium. So far, 56 responses have been collected and are being analyzed.

Dissemination and exploitation
- 18 scientific publications in international conferences and journals, in the area of software engineering.
- 8 talks at open source and industrial conferences
- 7 invited talks at software companies
- An initial market analysis (D6.3) that provides market insights useful to understand the business conditions facing the outcome of the STAMP project.

Website & more info

More info: https://www.stamp-project.eu.