Christoph Csallner

Professor
Computer Science and Engineering Department, University of Texas at Arlington
Box 19015, Arlington, TX 76019-0015, USA
csallner@uta.edu, ERB 554 (office), ERB 513 (SERC lab)

Bio: Christoph Csallner is a Professor in the Computer Science and Engineering Department at the University of Texas at Arlington (UTA). Before joining UTA, Dr. Csallner worked for Google and Microsoft Research. He graduated with a Diplom-Informatiker degree from Universität Stuttgart, Germany, and with an M.S. and a Ph.D., both in Computer Science, from Georgia Tech.

Dr. Csallner has broad research interests in software engineering and related areas. Currently he is working on problems in program analysis, automated bug finding, and mobile software engineering. Dr. Csallner's work has received several best paper awards and distinguished paper awards, including at the IEEE International Symposium on Software Reliability Engineering (ISSRE) in 2010, at the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA) in 2006 and 2012, at the Program Protection and Reverse Engineering Workshop (PPREW) in 2014, and at the IEEE/ACM International Conference on Automated Software Engineering (ASE) in 2007 and 2015. Dr. Csallner's research has been funded in part by MathWorks and the National Science Foundation.

Service
Current: PeerJ CS, CheckMATE 2024, ICSE 2024.
Distinguished Referee / Distinguished Reviewer Awards: ASE 2019, TOSEM 2011–2012.
Board of Distinguished Reviewers: TOSEM.
Publications and citations
DBLP and Google Scholar.
Best Paper Awards: PPREW-4, ISSRE 2010, ASE 2007.
ACM SIGSOFT Distinguished Paper Awards: ASE 2015, ISSTA 2012, ISSTA 2006.
Funding
Alzheimer's Association; MathWorks; NSF; Texas National Security Network (coverage: Dallas Morning News, NBC DFW, WBAP/KLIF); W. W. Caruth, Jr. Fund.
Teaching
CSE 3311 Object-Oriented Software Engineering (most recently: Fall 2024)
CSE 6324 Advanced Topics in Software Engineering (most recently: Fall 2023)
Software
JCrasher, Check 'n' Crash, DSD-Crasher, Pex/DySy, Dsc, REMAUI/Pixel to App
Current students
Tennov Simanjuntak
Shovon Niverd Pereira
Sabrina Haque
Mohammad Rifat Arefin
B.S. / M.S. alumni
Kyle Henry (Senior Fit project prototype demo), W.E. Kevin Yanogo, Nagendra Prasad Kasaghatta Ramachandra (VCAT project demo: YouTube), Sümeyye Süslü, Shivangi Kulshrestha, Siva Natarajan Balasubramania, Adis Kovacevic, Asheq Hamid, Tuan Anh Nguyen.
Ph.D. alumni
Sohil L. Shrestha (now: Research Scientist at Meta)
Soumik Mohian (now: Senior Software Engineer at BNSF Railway)
Shafiul Azam Chowdhury (now: Senior Machine Learning Engineer at Meta)
Shabnam Aboughadareh (coverage: KERA, now: Software Engineer at a startup)
Tuan Anh Nguyen (now: Senior Staff Software Engineer at Google)
Mainul Islam (Ex-Software Engineer at Amazon, Google, Intel)
Ishtiaque Hussain (now: Senior Software Engineer in Test at Bloomberg)
Tuli Nivas (now: Software Engineering Architect at Salesforce)

Recent Work

Fast deterministic black-box context-free grammar inference. Proc. 46th IEEE/ACM International Conference on Software Engineering (ICSE), 2024. (GitHub, Docker)
TreeVada is the first black-box approach that quickly infers a high-quality context-free grammar from sample programs. TreeVada starts with a few given valid sample programs of some language L, where the language may be a programming language that is not widely used. To decide if a program is valid, TreeVada uses a parser (or compiler) for L as a black-box, as the parser may run on a remote machine and cannot be instrumented.
D2S2: Drag 'n' drop mobile app screen search. Proc. 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Demonstrations Track, 2023. (search engine, GitHub, YouTube)
D2S2 (http://pixeltoapp.com/D2S2) is the first interactive drag-and-drop search engine for mobile app screenshots. D2S2 encodes text queries and the location, size, and icon type of the icons a user dragged on D2S2's UI. In a preliminary experiment D2S2 produced significantly more relevant search results than Google Image Search.
ScoutSL: An open-source Simulink search engine. Proc. 26th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MODELS), Tools and Demonstrations Track, 2023. (search engine, survey responses, source code, YouTube)
ScoutSL (http://scoutsl.net) is a web-based Simulink search engine. Users can search via free-form Google-style keywords and via Simulink model and project metrics (such as models' cyclomatic complexity or the number of GitHub pull requests) in 100k open-source Simulink models from 18k GitHub and MATLAB Central projects.
EvoSL: A large open-source corpus of changes in Simulink models & projects. Proc. 26th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MODELS), 2023. (dataset, tool)
EvoSL is the first large redistributable corpus of open-source Simulink models that contains project change histories. EvoSL contains 924 Git repositories from GitHub with their 3k issues, 2k pull requests, 10k comments, and over 100k commits. On an EvoSL subset we replicate a recent Simulink model change study carried out on a closed-source industrial project.
Replicability study: Corpora for understanding Simulink models & projects. Proc. 17th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), 2023. (dataset, tool)
To address reproducibility issues, we re-collect two corpora of open-source Simulink models and compare them with the largest known corpus (SLNET). On SLNET we replicate earlier studies and find, among others, that cyclomatic complexity of Simulink models does not seem to be strongly correlated with other model metrics.

Open-source Simulink Datasets

SLNET: A redistributable corpus of 3rd-party Simulink models. Proc. 19th International Conference on Mining Software Repositories (MSR), Data and Tool Showcase Track, 2022. (dataset, metrics tool, mining tool, YouTube)
SLNET contains 9,117 third-party Simulink models from 2,837 open-source projects from GitHub and MATLAB Central that are licensed for re-distribution.
A curated corpus of Simulink models for model-based empirical studies. Proc. 4th International Workshop on Software Engineering for Smart Cyber-Physical Systems (SEsCPS), 2018. (bib, doi, tool, slides)
This paper presents a corpus of over 1,000 freely available MathWorks Simulink models.

Mobile Software Engineering

PSDoodle: Fast app screen search via partial screen doodle. Proc. 9th IEEE/ACM International Conference on Mobile Software Engineering and Systems (MOBILESoft), 2022. (search engine, GitHub, YouTube: 90 seconds, 15 minutes)
Existing screen search tools are either keyword-based (e.g., Google Image Search) or require the user to produce a sketch of a complete screen. PSDoodle (http://pixeltoapp.com/PSDoodle) is the first tool to allow interactive screen search via freehand sketching.
PSDoodle: Searching for app screens via interactive sketching. Proc. 9th IEEE/ACM International Conference on Mobile Software Engineering and Systems (MOBILESoft), Tool Demos and Mobile Apps Track, 2022. (tool, GitHub, YouTube: 90 seconds, 10 minutes)
This paper adds details on PSDoodle's architecture, deployment, and evaluation.
Doodle2App: Native app code by freehand UI sketching. Proc. 7th IEEE/ACM International Conference on Mobile Software Engineering and Systems (MOBILESoft), Tool Demos and Mobile Apps Track, 2020. (tool, slides)
We trained Doodle2App (http://pixeltoapp.com/doodle) on thousands of (manually created) freehand sketches of 20 common Android UI elements. The resulting website allows users to sketch an Android screen. The website then converts the sketched screen to Android code that is ready to compile and run on stock Android phones. In technical terms: We pre-trained a classifier on 2 million Google "Quick, Draw!" sketches and retrained the classifier on thousands of sketches collected via Amazon Mechanical Turk. In some sense the resulting tool is a modern version of SILK ("Sketching Interfaces Like Krazy").
SPEjs: A symbolic partial evaluator for JavaScript. Proc. 1st International Workshop on Advances in Mobile App Analysis (A-Mobile), 2018. (bib, doi, tool, slides)
The partial evaluator SPEjs implements a symbolic execution engine for JavaScript and uses Z3 to resolve the feasibility of program execution paths. SPEjs removes some dead branches that the state-of-the-art partial evaluator Prepack from Facebook so far cannot remove.
P2A: A tool for converting pixels to animated mobile application user interfaces. Proc. 5th IEEE/ACM International Conference on Mobile Software Engineering and Systems (MOBILESoft), 2018. (bib, doi, slides)
P2A takes as input a set of screen design bitmaps (e.g., screenshots of an Android or iPhone app) and converts them to native app code (i.e., for Android), complete with inter-screen transitions and in-screen animations. P2A is implemented on top of REMAUI.
Reverse engineering mobile application user interfaces with REMAUI. Proc. 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2015. (bib, doi, slides, award)
When developing a mobile app (e.g., for Android or iOS), a graphic designer typically designs the app's screens and hands them to a programmer, who manually recreates the screen designs in source code. REMAUI is the first technique for automating this process end-to-end, from design drawings or screenshots to working UI code that can be compiled and run on a mobile device. Received an ACM SIGSOFT Distinguished Paper Award.
GROPG: A graphical on-phone debugger. Proc. 35th ACM/IEEE International Conference on Software Engineering (ICSE), New Ideas and Emerging Results (NIER) track, 2013. (bib, doi, slides, poster)
GROPG is the first graphical on-phone debugger. Developers can use GROPG to debug Android phone applications directly on an Android phone.
An experiment in developing small mobile phone applications comparing on-phone to off-phone development. Proc. 1st International Workshop on User Evaluation for Software Engineering Researchers (USER), 2012. (bib, doi, talk, overview)
TouchDevelop represents a radically new mobile application development model, as TouchDevelop enables mobile application development on a mobile device. We describe a first experiment on independent, non-expert subjects to compare programmer productivity using TouchDevelop vs. using a more traditional approach to mobile application development.

Testing

SLGPT: Using transfer learning to directly generate Simulink model files and find bugs in the Simulink toolchain. Proc. 25th International Conference on Evaluation and Assessment in Software Engineering (EASE), Vision and Emerging Results Track, 2021. (tool, doi, talk)
Simulink model files are normally generated by the Simulink toolchain. SLGPT gathers Simulink model files from open-source repositories and a random model generator. SLGPT then uses these Simulink model files to adapt OpenAI's widely used GPT-2 language model to learn the structure of these Simulink model files.
DeepFuzzSL: Generating models with deep learning to find bugs in the Simulink toolchain. Proc. 2nd Workshop on Testing for Deep Learning and Deep Learning for Testing (DeepTest), 2020. (bib, tool, talk, slides)
When creating development tools for cyber-physical systems, a big challenge is that there are no full formal specifications for popular existing tools and languages such as Simulink. To address this challenge, DeepFuzzSL attempts to learn such a language specification from existing Simulink sample models, using an LSTM recurrent neural network.
SLEMI: Equivalence modulo input (EMI) based mutation of CPS models for finding compiler bugs in Simulink. Proc. 42nd ACM/IEEE International Conference on Software Engineering (ICSE), 2020. (bib, tool, talk)
SLEMI speeds up our earlier SLforge random Simulink-model generator and differential testing tool. Instead of generating each Simulink model from scratch, SLEMI can quickly mutate existing models. SLEMI has found 6 new confirmed Simulink bugs.
Demo: SLEMI: Finding Simulink compiler bugs through equivalence modulo input (EMI). Proc. 42nd ACM/IEEE International Conference on Software Engineering (ICSE), Demonstrations Track, 2020. (bib, tool, talk)
This is a tool demonstration of SLEMI. The original SLEMI work appears in ICSE 2020 as a technical paper. This paper also adds a new mutation technique that found one additional new confirmed bug in Simulink.
Automatically finding bugs in a commercial cyber-physical system development tool chain with SLforge. Proc. 40th ACM/IEEE International Conference on Software Engineering (ICSE), 2018. (bib, doi, tool, slides, extended_slides)
We build the first large collection of public MathWorks Simulink models. We use these models to guide our new random Simulink model generator SLforge, which also uses semi-formal Simulink tool specifications. SLforge found 8 new confirmed Simulink bugs.
Complementing machine learning classifiers via dynamic symbolic execution: “Human vs. bot generated” tweets. Proc. 6th International Workshop on Realizing Artificial Intelligence Synergies in Software Engineering (RAISE), 2018. (bib, slides)
This paper argues that program analysis such as dynamic symbolic execution can be nicely integrated into an existing supervised machine learning pipeline, to automatically produce additional labeled training samples.
Poster: Testing web-based applications with the voice controlled accessibility and testing tool (VCAT). Proc. 40th ACM/IEEE International Conference on Software Engineering (ICSE), Poster track, 2018. (bib, doi, tool, poster)
VCAT allows a user to navigate a web page only via voice commands. VCAT then exports a voice command sequence as a test case for the web page. The VCAT prototype is a plug-in for a stock Chrome browser and generates test cases via Selenium.
Demo: Fuzzing cyber-physical system development environments with CyFuzz. 20th ACM International Conference on Hybrid Systems: Computation and Control, Demo track, 2017. (bib, tool)
This is a demonstration of our CyFuzz tool for finding bugs in cyber-physical system development environments, i.e., Simulink.
CyFuzz: A differential testing framework for cyber-physical systems development environments. Proc. 6th Workshop on Design, Modeling and Evaluation of Cyber Physical Systems (CyPhy), 2016. (bib, doi, tool, slides)
CyFuzz generates random cyber-physical system design models for the widely used MathWorks Simulink toolchain. CyFuzz compares simulation results under different Simulink configurations and has thereby independently reproduced a Simulink bug.
RUGRAT: Evaluating program analysis and testing tools and compilers with large generated random benchmark applications. Software: Practice & Experience, 2016. (bib, doi, tool)
This article extends our earlier WODA 2012 paper on RUGRAT. This article explores the computational resources RUGRAT requires, uses RUGRAT to benchmark Java source-to-bytecode compilers, and compares RUGRAT benchmarking results to a baseline of benchmarking with handwritten programs.
Residual investigation: Predictive and precise bug detection. ACM Transactions on Software Engineering and Methodology (TOSEM), 2014. (bib, doi, ACM pdf for free)
This is a superset of our earlier ISSTA 2012 paper on residual investigation. The article adds more experiments on applying residual investigation to FindBugs (RFBI), describes the implementation complexity of RFBI, and applies residual investigation in a new context, i.e., static race detection.
Generating test cases for programs that are coded against interfaces and annotations. ACM Transactions on Software Engineering and Methodology (TOSEM), 2014. (bib, doi, ACM pdf for free)
Some code can only be invoked and tested with instances of classes that don't yet exist. However, state-of-the-art test case generators such as Randoop and Pex do not generate such classes and therefore cannot cover such code. This article extends our WODA 2010 paper on generating (mock) classes during dynamic symbolic execution. This article adds a survey of third-party applications and extends the approach to generating annotations. Our implementation in Dsc covered code that state-of-the-art tools could not cover.
A distributed framework for demand-driven software vulnerability detection. Journal of Systems and Software (JSS), 2014. (bib, doi)
While most heavy-weight symbolic program analysis tools are run before program release, this symbolic analysis framework runs while the analyzed program is in production use. Whenever a monitored program encounters an unexplored path, it submits the path to a central server for symbolic analysis. Analysis results are distributed back to other clients.
SEDGE: Symbolic example data generation for dataflow programs. Proc. 28th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2013. (bib, doi, tool)
Dynamic symbolic execution has traditionally been used on assembly code (e.g., x86) as well as procedural (e.g., C) and object-oriented programs (e.g., Java and C#). SEDGE adapts dynamic symbolic execution to the dataflow programming language Pig Latin. While Pig Latin programs are typically compiled to Hadoop MapReduce programs, SEDGE analyzes dataflow programs directly. In our experiments this yielded better results than either analyzing the generated MapReduce programs or using the most closely related test case generator for Pig Latin.
CarFast: Achieving higher statement coverage faster. Proc. 20th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), 2012. (bib, doi, ACM pdf for free, tool)
For a given branching statement encountered during program execution, CarFast estimates the number of statements that are yet uncovered but reachable from the respective branch outcomes. With the symbolic path condition collected during execution, CarFast selects input values such that a future execution will trigger a branch (path) that contains a high number of those yet uncovered statements.
Residual investigation: Predictive and precise bug detection. Proc. ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), 2012. (bib, doi, ACM pdf for free, award)
Static bug detectors such as FindBugs produce false warnings. This paper describes RFBI, the Residual FindBugs Investigator. RFBI investigates each FindBugs warning for code location A with a set of residual dynamic analyses at code locations B to Z, such that a dynamic warning at code location X provides additional evidence that the static warning at code location A is likely a true warning. Received an ACM SIGSOFT Distinguished Paper Award.
Evaluating program analysis and testing tools with the RUGRAT random benchmark application generator. Proc. 10th International Workshop on Dynamic Analysis (WODA), 2012. (bib, doi, ACM pdf for free, tool, slides)
RUGRAT aims at generating random benchmark applications for evaluating program analysis and testing tools. The RUGRAT prototype can automatically generate large Java applications that consist of a user-specified mix of Java language features such as iteration, recursion, and the use of deep subtype hierarchies.
SimFuzz: Test case similarity directed deep fuzzing. Journal of Systems and Software (JSS), 2012. (bib, doi)
SimFuzz is a black-box fuzzer for C programs that guides its test case generation with a test case similarity metric. The metric computes the edit distance between execution paths, where each path element corresponds to the out-edge of a branching node in the program's control-flow graph.
New ideas track: Testing MapReduce-style programs. Proc. 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), New Ideas Track, 2011. (bib, doi, ACM pdf for free, slides, poster)
We formalize a MapReduce-specific correctness condition that all MapReduce applications have to satisfy, in order to be free of a certain class of bugs. To detect such bugs, we then design a technique that encodes the correctness condition as symbolic program constraints, checks them via dynamic symbolic execution, and generates corresponding test cases.
Managing performance testing with release certification and data correlation. 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE), Industry Track, 2011. (bib)
Testing textbooks prescribe writing performance tests against performance goals. We observe that in practice business analysts may not be able to specify such performance goals at a level that is detailed enough for finding subtle performance bugs. We address this issue by running two different versions of the same application side-by-side in the same test environment, which allows us to use the performance profile of the previous version as the detailed performance specification of the version under test.
A combinatorial approach to detecting buffer overflow vulnerabilities. Proc. 41st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2011. (bib, doi)
This paper describes the Tance tool, which found several new vulnerabilities in well-known open-source C programs.
Is data privacy always good for software testing? Proc. 21st IEEE International Symposium on Software Reliability Engineering (ISSRE), 2010. (bib, doi, award)
Software testing interacts with data anonymization in surprising ways. For example, increasing data anonymity to protect data during testing can drastically decrease test coverage. One problem is that current anonymization techniques do not take into account how the application under test actually uses the data. We therefore propose to guide data anonymization techniques with program analysis. Received the Best Paper Award.
Dsc+Mock: A test case + mock class generator in support of coding against interfaces. Proc. 8th International Workshop on Dynamic Analysis (WODA), 2010. (bib, doi, ACM pdf for free, tool)
Dsc+Mock is a dynamic symbolic test case generator that can reason about type constraints and can generate mock classes that satisfy such constraints. Our prototype implementation achieved higher code coverage than related test case generators that do not generate mock classes, such as Pex.
Dynamic symbolic database application testing. Proc. 3rd International Workshop on Testing Database Systems (DBTest), 2010. (bib, doi)
We use dynamic symbolic execution to obtain a program path-condition. We then use this path-condition as a database query.
Detecting vulnerabilities in C programs using trace-based testing. Proc. 40th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2010. (bib, doi)
This paper describes the SecTAC tool, which found several new vulnerabilities in well-known open-source C programs.
Scalable satisfiability checking and test data generation from modeling diagrams. Automated Software Engineering, 2009. (bib, doi)
This is a superset of our earlier ASE 2007 paper, expanding the treatment of test data generation. Object-Role Modeling (ORM) is a popular language for specifying database schemas. It supports many constraints and is undecidable in general. We pick a restricted subset of ORM that is decidable in polynomial time and implement a fast automated solver. We found that our ORM subset covers the vast majority of constraints used in our sample of over 160 ORM diagrams from industrial practice.
Combining over- and under-approximating program analyses for automatic software testing. PhD thesis, Georgia Tech, 2008. (bib, doi, tool)
An existing static program analysis that over-approximates the execution paths of the analyzed program can be made more precise for automatic testing in an object-oriented programming language, by combining the over-approximating analysis with usage-observing and under-approximating analyses. This summarizes the DSD-Crasher, Check 'n' Crash, and JCrasher work. Unpublished material includes a critical review of the performed evaluation, lessons learnt, and how to generalize the approach.
DSD-Crasher: A hybrid analysis tool for bug finding. ACM Transactions on Software Engineering and Methodology (TOSEM), 2008. (bib, doi, ACM pdf for free, tool)
This is a superset of our earlier ISSTA 2006 paper on DSD-Crasher, adding a high-level overview, experiments with subjects from the software-artifact infrastructure repository (SIR), more related work, and a discussion on increasing code coverage by reasoning about implicit control flow branches.
Scalable automatic test data generation from modeling diagrams. Proc. 22nd IEEE/ACM International Conference on Automated Software Engineering (ASE), 2007. (bib, doi, ACM pdf for free, award)
Object-Role Modeling (ORM) is a popular language for specifying database schemas. It supports many constraints and is undecidable in general. We pick a restricted subset of ORM that is decidable in polynomial time and implement a fast automated solver. We found that our ORM subset covers the vast majority of constraints used in our sample of over 160 ORM diagrams from industrial practice. Received the Best Paper Award.
Combining static and dynamic reasoning for bug detection. Proc. International Conference on Tests And Proofs (TAP), volume 4454 of LNCS, 2007. (bib, doi, tool)
This is an invited paper that reviews our bug finding tools: Check 'n' Crash addresses the language-level unsoundness of static bug finding tools whereas DSD-Crasher also addresses their user-level unsoundness. We use a small case study to compare JCrasher, ESC/Java, Check 'n' Crash, and DSD-Crasher.
DSD-Crasher: A hybrid analysis tool for bug finding. Proc. ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), 2006. (bib, doi, ACM pdf for free, tool, slides, award)
DSD-Crasher first uses Daikon to capture the subject's intended execution behavior, then statically analyzes this restricted domain with ESC/Java, and finally lets Check 'n' Crash generate and execute concrete test-cases to verify the results of ESC/Java. Received an ACM SIGSOFT Distinguished Paper Award.
Check 'n' Crash: Combining static checking and testing. Proc. 27th ACM/IEEE International Conference on Software Engineering (ICSE), 2005. (bib, doi, ACM pdf for free, tool, slides)
Check 'n' Crash uses ESC/Java to statically search for problems like null dereference, illegal type cast, or illegal array manipulation. Check 'n' Crash compiles ESC's results to JUnit test cases and executes them to filter out ESC's false positives.
JCrasher: An automatic robustness tester for Java. Software: Practice & Experience, 2004. (bib, doi, tool)
JCrasher generates random test cases by chaining object constructors. It filters test case executions and presents only those that expose a bug or lack of robustness. It also enables JUnit to efficiently undo the changes a test case has made to testee class fields.

Security

Inside the fight against malware attacks. The Conversation, 2017. (bib, http)
This article, edited by Jeff Inglis at The Conversation, is a newspaper-compatible introduction to malware analysis and Shabnam's dynamic malware analysis tool SEMU.
Detecting rootkits with the RAI runtime application inventory. Proc. 6th Workshop on Software Security, Protection, and Reverse Engineering (SSPREW), 2016. (bib, doi, ACM pdf for free, slides, overview)
RAI monitors which precise code binaries are running on which machines, without having to restart the monitored applications. This is challenging, since the shape of binaries frequently changes in memory at runtime (e.g., due to an ongoing malware attack) and legacy machines often do not have advanced hardware features such as TPM. TDOIM applies RAI to detect rootkits at runtime.
Mixed-mode malware and its analysis. Proc. 4th Program Protection and Reverse Engineering Workshop (PPREW), 2014. (bib, doi, ACM pdf for free, overview, award)
Mixed-mode malware performs interdependent user- and kernel-level actions. Analyzing such malware requires a whole-system analysis that operates completely outside the malware's domain. We describe several mixed-mode malware samples and our mixed-mode malware analysis tool SEMU. Received the Best Paper Award.
Poster: Automatic profiling of evasive mixed-mode malware with SEMU. 33rd IEEE Symposium on Security and Privacy (Oakland), Poster session, 2013. (bib)
We describe a combination of user- and kernel-mode malware that can subvert state-of-the-art dynamic malware analysis techniques, such as those built on the popular TEMU and Ether analysis frameworks. We present an alternative malware analysis framework, SEMU, which cannot be subverted by such attacks as it performs whole-system analysis outside the analyzed (guest) OS.
Dynamic analysis of evasive modular malware. 28th Annual Computer Security Applications Conference (ACSAC), Works In Progress (WiP) Track, 2012. (bib, slides)
Dynamic malware analysis is hard as kernel-level malware may manipulate kernel data and thereby derail malware analysis. To address this problem we propose a kernel data duplication scheme that redirects malware to a copy of the kernel data and thus shields the kernel data used by all other applications from malicious manipulation.

Reverse Engineering

Reverse engineering object-oriented applications into high-level domain models with Reoom. 39th IEEE/ACM International Conference on Software Engineering Companion (ICSE-C), Poster track, 2017. (bib, doi, poster)
This paper makes two observations about how programmers may be expressing domain concepts (i.e., high-level business concepts) in object-oriented code. The Reoom tool encodes these observations in a light-weight static analysis and on four subjects showed overall higher precision and recall than Womble, the most closely related tool.
DySy: Dynamic symbolic execution for invariant inference. Proc. 30th ACM/IEEE International Conference on Software Engineering (ICSE), 2008. (bib, doi, ACM pdf for free)
DySy uses the concolic execution system Pex to detect invariants in arbitrary .NET programs. DySy can derive much better targeted invariants than previous, template-based approaches, such as Daikon.
Dynamically discovering likely interface invariants. Proc. 28th ACM/IEEE International Conference on Software Engineering (ICSE), Emerging Results Track, 2006. (bib, doi, ACM pdf for free)
We propose a two-pass algorithm to support interfaces and method overriding in dynamic invariant detection. The first pass associates a method call with the method executed and all methods it overrides up to and including the static receiver to derive the methods' preconditions. The second pass associates a method call with every supertype whose precondition is met to derive non-conflicting postconditions.

Repair

DSDSR: A tool that uses dynamic symbolic execution for data structure repair. Proc. 8th International Workshop on Dynamic Analysis (WODA), 2010. (bib, doi, ACM pdf for free)
This paper discusses the implementation of our dynamic symbolic data structure repair tool, DSDSR. We provide initial empirical results of applying DSDSR on different formulations of the same correctness condition and compare DSDSR with a state-of-the-art tool, Juzi.
Dynamic symbolic data structure repair. Proc. 32nd ACM/IEEE International Conference on Software Engineering (ICSE), Volume 2, Emerging Results Track, 2010. (bib, doi, ACM pdf for free, slides)
We motivate how dynamic symbolic techniques enable generic repair to support a wider range of correctness conditions and present DSDSR, a novel repair algorithm based on dynamic symbolic execution. We implement the algorithm for Java and report initial empirical results to demonstrate the promise of our approach for generic repair.

Information Visualization

FundExplorer: Supporting the diversification of mutual fund portfolios using Context Treemaps. Proc. 9th IEEE Symposium on Information Visualization (InfoVis), 2003. (bib, doi, mpeg)
FundExplorer distorts a treemap to visualize positive values and zeros.

Career Advice

This Website was updated on 12 August 2024 with bibtex2html.