This article extends our earlier WODA 2012 paper on RUGRAT. This article explores the computational resources RUGRAT requires, uses RUGRAT to benchmark Java source-to-bytecode compilers, and compares RUGRAT benchmarking results to a baseline of benchmarking with handwritten programs.
This is a superset of our earlier ISSTA 2012 paper on residual investigation. The article adds more experiments on applying residual investigation to FindBugs (RFBI), describes the implementation complexity of RFBI, and applies residual investigation in a new context, i.e., static race detection.
Some code can only be invoked and tested with instances of classes that don't yet exist. However state-of-the-art test case generators such as Randoop and Pex do not generate such classes and therefore cannot cover such code. This article extends our WODA 2010 paper on generating (mock) classes during dynamic symbolic execution. This article adds a survey of third-party applications and extends the approach to generating annotations. Our implementation in Dsc covered code that state-of-the-art tools could not cover.
While most heavy-weight symbolic program analysis tools are run before program release, this symbolic analysis framework runs while the analyzed program is in production use. Whenever a monitored program encounters an unexplored path, it submits the path to a central server for symbolic analysis. Analysis results are distributed back to other clients.
Dynamic symbolic execution has traditionally been used on assembly code (e.g., x86) as well as procedural (i.e., C) and object-oriented programs (i.e., Java and C#). SEDGE adapts dynamic symbolic execution to the dataflow programming language Pig Latin. While Pig Latin programs are typically compiled to Hadoop MapReduce programs, SEDGE analyzes dataflow programs directly. In our experiments this yielded better results than either analyzing the generated MapReduce programs or using the most closely related test case generator for Pig Latin.
For a given branching statement encountered during program execution, CarFast estimates the number of statements that are yet uncovered but reachable from the respective branch outcomes. With the symbolic path condition collected during execution, CarFast selects input values such that a future execution will trigger a branch (path) that contains a high number of those yet uncovered statements.
Static bug detectors such as FindBugs produce false warnings. This paper describes RFBI, the Residual FindBugs Investigator. RFBI investigates each FindBugs warning for code location A with a set of residual dynamic analyses at code locations B to Z, such that a dynamic warning at code location X provides additional evidence that the static warning at code location A is likely a true warning. Received an ACM SIGSOFT Distinguished Paper Award.
RUGRAT aims at generating random benchmark applications for evaluating program analysis and testing tools. The RUGRAT prototype can automatically generate large Java applications that consist of a user-specified mix of Java language features such as iteration, recursion, and the use of deep subtype hierarchies.
SimFuzz is a black-box fuzzer for C programs that guides its test case generation with a test case similarity metric. The metric computes the edit distance between execution paths, where each path element corresponds to the out-edge of a branching node in the program's control-flow graph.
We formalize a MapReduce-specific correctness condition that all MapReduce applications have to satisfy, in order to be free of a certain class of bugs. To detect such bugs, we then design a technique that encodes the correctness condition as symbolic program constraints, checks them via dynamic symbolic execution, and generates corresponding test cases.
Testing textbooks prescribe writing performance tests against performance goals. We observe that in practice business analysts may not be able to specify such performance goals at a level that is detailed enough for finding subtle performance bugs. We address this issue by running two different versions of the same application side-by-side in the same test environment, which allows us to use the performance profile of the previous version as the detailed performance specification of the version under test.
Software testing interacts with data anonymization in surprising ways. I.e., increasing data anonymity to protect data during testing can drastically decrease test coverage. One problem is that current anonymization techniques do not take into account how the application under test actually uses the data. We therefore propose to guide data anonymization techniques with program analysis. Received the Best Paper Award.
Dsc+Mock is a dynamic symbolic test case generator that can reason about type constraints and can generate mock classes that satisfy such constraints. Our prototype implementation achieved higher code coverage than related test case generators that do not generate mock classes, such as Pex.
This is a superset of our earlier ASE 2007 paper, expanding the treatment of test data generation. Object-Role Modeling (ORM) is a popular language for specifying database schemas. It supports many constraints and is undecidable in general. We pick a restricted subset of ORM that is decidable in polynomial time and implement a fast automated solver. We found that our ORM subset covers the vast majority of constraints used in our sample of over 160 ORM diagrams from industrial practice. Our earlier 2007 paper received the Best Paper Award at ASE 2007.
An existing static program analysis that over-approximates the execution paths of the analyzed program can be made more precise for automatic testing in an object-oriented programming language, by combining the over-approximating analysis with usage-observing and under-approximating analyses. This summarizes the DSD-Crasher, Check 'n' Crash, and JCrasher work. Unpublished material includes a critical review of the performed evaluation, lessons learnt, and how to generalize the approach.
This is a superset of our earlier ISSTA 2006 paper on DSD-Crasher, adding a high-level overview, experiments with subjects from the software-artifact infrastructure repository (SIR), more related work, and a discussion on increasing code coverage by reasoning about implicit control flow branches.
Object-Role Modeling (ORM) is a popular language for specifying database schemas. It supports many constraints and is undecidable in general. We pick a restricted subset of ORM that is decidable in polynomial time and implement a fast automated solver. We found that our ORM subset covers the vast majority of constraints used in our sample of over 160 ORM diagrams from industrial practice. Received the Best Paper Award.
This is an invited paper that reviews our bug finding tools: Check 'n' Crash addresses the language-level unsoundness of static bug finding tools whereas DSD-Crasher also addresses their user-level unsoundness. We use a small case study to compare JCrasher, ESC/Java, Check 'n' Crash, and DSD-Crasher.
DSD-Crasher first uses Daikon to capture the subject's intended execution behavior, then statically analyzes this restricted domain with ESC/Java, and finally lets Check 'n' Crash generate and execute concrete test-cases to verify the results of ESC/Java. Received an ACM SIGSOFT Distinguished Paper Award.
Check 'n' Crash uses ESC/Java to statically search for problems like null dereference, illegal type cast, or illegal array manipulation. Check 'n' Crash compiles ESC's results to JUnit test cases and executes them to filter out ESC's false positives.
JCrasher generates random test cases by chaining object constructors. It filters test case execution and presents only those that expose a bug or lack of robustness. It also enables JUnit to efficiently undo the changes a test case has done to testee class fields.
TouchDevelop represents a radically new mobile application development model, as TouchDevelop enables mobile application development on a mobile device. We describe a first experiment on independent, non-expert subjects to compare programmer productivity using TouchDevelop vs. using a more traditional approach to mobile application development.
We describe a combination of user- and kernel-mode malware that can subvert state-of-the-art dynamic malware analysis techniques, such as those built on the popular TEMU and Ether analysis frameworks. We present an alternative malware analysis framework, SEMU, which cannot be subverted by such attacks as it performs whole-system analysis outside the analyzed (guest) OS.
Dynamic malware analysis is hard as kernel-level malware may manipulate kernel data and thereby derail malware analysis. To address this problem we propose a kernel data duplication scheme that redirects malware to a copy of the kernel data and thus shields the kernel data used by all other applications from malicious manipulation.
This paper discusses the implementation of our dynamic symbolic data structure repair tool, DSDSR. We provide initial empirical results of applying DSDSR on different formulations of the same correctness condition and compare DSDSR with a state-of-the-art tool, Juzi.
We motivate how dynamic symbolic techniques enable generic repair to support a wider range of correctness conditions and present DSDSR, a novel repair algorithm based on dynamic symbolic execution. We implement the algorithm for Java and report initial empirical results to demonstrate the promise of our approach for generic repair.
We propose a two-pass algorithm to support interfaces and method overriding in dynamic invariant detection. The first pass associates a method call with the method executed and all methods it overrides up to and including the static receiver to derive the methods' preconditions. The second pass associates a method call with every supertype whose precondition is met to derive non-conflicting postconditions.