Projects per year
Abstract
The design and implementation of decision procedures for checking path feasibility in stringmanipulating programs is an important problem, with such applications as symbolic execution of programs with strings and automated detection of crosssite scripting (XSS) vulnerabilities in web applications. A (symbolic) path is given as a finite sequence of assignments and assertions (i.e. without loops), and checking its feasibility amounts to determining the existence of inputs that yield a successful execution. Modern programming languages (e.g. JavaScript, PHP, and Python) support many complex string operations, and strings are also often implicitly modified during a computation in some intricate fashion (e.g. by some autoescaping mechanisms).
In this paper we provide two general semantic conditions which together ensure the decidability of path feasibility: (1) each assertion admits regular monadic decomposition (i.e.\ is an effectively recognisable relation), and (2) each assignment uses a (possibly nondeterministic) function whose inverse relation preserves regularity. We show that the semantic conditions are expressive since they are satisfied by a multitude of string operations including concatenation, oneway and twoway finitestate transducers, replaceall functions (where the replacement string could contain variables), stringreverse functions, regularexpression matching, and some (restricted) forms of lettercounting/length functions. The semantic conditions also strictly subsume existing decidable string theories (e.g. straightline fragments, and acyclic logics), and most existing benchmarks (e.g. most of Kaluza's, and all of SLOG's, Stranger's, and SLOTH's benchmarks). Our semantic conditions also yield a conceptually simple decision procedure, as well as an extensible architecture of a string solver in that a user may easily incorporate his/her own string functions into the solver by simply providing code for the preimage computation without worrying about other parts of the solver. Despite these, the semantic conditions are unfortunately too general to provide a fast and complete decision procedure. We provide strong theoretical evidence for this in the form of complexity results. To rectify this problem, we propose two solutions. Our main solution is to allow only partial string functions (i.e., prohibit nondeterminism) in condition (2). This restriction is satisfied in many cases in practice, and yields decision procedures that are effective in both theory and practice. Whenever nondeterministic functions are still needed (e.g. the string function split), our second solution is to provide a syntactic fragment that provides a support of nondeterministic functions, and operations like oneway transducers, replaceall (with constant replacement string), the stringreverse function, concatenation, and regularexpression matching. We show that this fragment can be reduced to an existing solver SLOTH that exploits fast model checking algorithms like IC3.
We provide an efficient implementation of our decision procedure (assuming our first solution above, i.e., deterministic partial string functions) in a new string solver OSTRICH. Our implementation provides builtin support for concatenation, reverse, functional transducers (), and replaceall and provides a framework for extensibility to support further string functions. We demonstrate the efficacy of our new solver against other competitive solvers.
In this paper we provide two general semantic conditions which together ensure the decidability of path feasibility: (1) each assertion admits regular monadic decomposition (i.e.\ is an effectively recognisable relation), and (2) each assignment uses a (possibly nondeterministic) function whose inverse relation preserves regularity. We show that the semantic conditions are expressive since they are satisfied by a multitude of string operations including concatenation, oneway and twoway finitestate transducers, replaceall functions (where the replacement string could contain variables), stringreverse functions, regularexpression matching, and some (restricted) forms of lettercounting/length functions. The semantic conditions also strictly subsume existing decidable string theories (e.g. straightline fragments, and acyclic logics), and most existing benchmarks (e.g. most of Kaluza's, and all of SLOG's, Stranger's, and SLOTH's benchmarks). Our semantic conditions also yield a conceptually simple decision procedure, as well as an extensible architecture of a string solver in that a user may easily incorporate his/her own string functions into the solver by simply providing code for the preimage computation without worrying about other parts of the solver. Despite these, the semantic conditions are unfortunately too general to provide a fast and complete decision procedure. We provide strong theoretical evidence for this in the form of complexity results. To rectify this problem, we propose two solutions. Our main solution is to allow only partial string functions (i.e., prohibit nondeterminism) in condition (2). This restriction is satisfied in many cases in practice, and yields decision procedures that are effective in both theory and practice. Whenever nondeterministic functions are still needed (e.g. the string function split), our second solution is to provide a syntactic fragment that provides a support of nondeterministic functions, and operations like oneway transducers, replaceall (with constant replacement string), the stringreverse function, concatenation, and regularexpression matching. We show that this fragment can be reduced to an existing solver SLOTH that exploits fast model checking algorithms like IC3.
We provide an efficient implementation of our decision procedure (assuming our first solution above, i.e., deterministic partial string functions) in a new string solver OSTRICH. Our implementation provides builtin support for concatenation, reverse, functional transducers (), and replaceall and provides a framework for extensibility to support further string functions. We demonstrate the efficacy of our new solver against other competitive solvers.
Original language  English 

Pages  130 
Number of pages  30 
DOIs  
Publication status  Epub ahead of print  2 Jan 2019 
Event  Principles of Programming Languages  Cascais/Lisbon, Portugal Duration: 13 Jan 2019 → 19 Jan 2019 https://popl19.sigplan.org/ 
Conference
Conference  Principles of Programming Languages 

Abbreviated title  POPL 
Country/Territory  Portugal 
City  Cascais/Lisbon 
Period  13/01/19 → 19/01/19 
Internet address 
Keywords
 String Constraints, Transducers, ReplaceAll, Reverse, Decision Procedures, StraightLine Programs
Projects
 1 Finished

Verification of Concurrent and HigherOrder Recursive Programs
Eng & Phys Sci Res Council EPSRC
1/05/13 → 30/04/18
Project: Research