**What is decidable about string constraints with the ReplaceAll function.** / Chen, Taolue; Chen, Yan; Hague, Matthew; Lin, Anthony; Wu, Zhilin.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

E-pub ahead of print

**What is decidable about string constraints with the ReplaceAll function.** / Chen, Taolue; Chen, Yan; Hague, Matthew; Lin, Anthony; Wu, Zhilin.

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Chen, T, Chen, Y, Hague, M, Lin, A & Wu, Z 2017, What is decidable about string constraints with the ReplaceAll function. in *Proceedings of the ACM on Programming Languages.* vol. 2, 3, ACM, pp. 1-29, Principles of Programming Languages, Los Angeles, United States, 10/01/18. https://doi.org/10.1145/3158091

Chen, T., Chen, Y., Hague, M., Lin, A., & Wu, Z. (2017). What is decidable about string constraints with the ReplaceAll function. In *Proceedings of the ACM on Programming Languages *(Vol. 2, pp. 1-29). [3] ACM. https://doi.org/10.1145/3158091

Chen T, Chen Y, Hague M, Lin A, Wu Z. What is decidable about string constraints with the ReplaceAll function. In Proceedings of the ACM on Programming Languages. Vol. 2. ACM. 2017. p. 1-29. 3 https://doi.org/10.1145/3158091

@inproceedings{3d43bd9f12e343518d3af83aceccef2d,

title = "What is decidable about string constraints with the ReplaceAll function",

abstract = "The theory of strings with concatenation has been widely argued as the basis of constraint solving for verifying string-manipulating programs. However, this theory is far from adequate for expressing many string constraints that are also needed in practice; for example, the use of regular constraints (pattern matching against a regular expression), and the string-replace function (replacing either the first occurrence or all occurrences of a ``pattern'' string constant/variable/regular expression by a ``replacement'' string constant/variable), among many others. Both regular constraints and the string-replace function are crucial for such applications as analysis of JavaScript (or more generally HTML5 applications) against cross-site scripting (XSS) vulnerabilities, which motivates us to consider a richer class of string constraints. The importance of the string-replace function (especially the replace-all facility) is increasingly recognised, which can be witnessed by the incorporation of the function in the input languages of several string constraint solvers. Recently, it was shown that any theory of strings containing the string-replace function (even the most restricted version where pattern/replacement strings are both constant strings) becomes undecidable if we do not impose some kind of straight-line (aka acyclicity) restriction on the formulas. Despite this, the straight-line restriction is still practically sensible since this condition is typically met by string constraints that are generated by symbolic execution. In this paper, we provide the first systematic study of straight-line string constraints with the string-replace function and the regular constraints as the basic operations. We show that a large class of such constraints (i.e. when only a constant string or a regular expression is permitted in the pattern) is decidable. We note that the string-replace function, even under this restriction, is sufficiently powerful for expressing the concatenation operator and much more (e.g. extensions of regular expressions with string variables). This gives us the most expressive decidable logic containing concatenation, replace, and regular constraints under the same umbrella. Our decision procedure for the straight-line fragment follows an automata-theoretic approach, and is modular in the sense that the string-replace terms are removed one by one to generate more and more regular constraints, which can then be discharged by the state-of-the-art string constraint solvers. We also show that this fragment is, in a way, a maximal decidable subclass of the straight-line fragment with string-replace and regular constraints. To this end, we show undecidability results for the following two extensions: (1) variables are permitted in the pattern parameter of the replace function, (2) length constraints are permitted. ",

keywords = "String Constraints, ReplaceAll, Decision Procedures, Constraint Solving, Straight-Line Programs",

author = "Taolue Chen and Yan Chen and Matthew Hague and Anthony Lin and Zhilin Wu",

year = "2017",

month = dec,

day = "27",

doi = "10.1145/3158091",

language = "English",

volume = "2",

pages = "1--29",

booktitle = "Proceedings of the ACM on Programming Languages",

publisher = "ACM",

note = "Principles of Programming Languages, POPL ; Conference date: 10-01-2018 Through 12-01-2018",

}

TY - GEN

T1 - What is decidable about string constraints with the ReplaceAll function

AU - Chen, Taolue

AU - Chen, Yan

AU - Hague, Matthew

AU - Lin, Anthony

AU - Wu, Zhilin

PY - 2017/12/27

Y1 - 2017/12/27

N2 - The theory of strings with concatenation has been widely argued as the basis of constraint solving for verifying string-manipulating programs. However, this theory is far from adequate for expressing many string constraints that are also needed in practice; for example, the use of regular constraints (pattern matching against a regular expression), and the string-replace function (replacing either the first occurrence or all occurrences of a ``pattern'' string constant/variable/regular expression by a ``replacement'' string constant/variable), among many others. Both regular constraints and the string-replace function are crucial for such applications as analysis of JavaScript (or more generally HTML5 applications) against cross-site scripting (XSS) vulnerabilities, which motivates us to consider a richer class of string constraints. The importance of the string-replace function (especially the replace-all facility) is increasingly recognised, which can be witnessed by the incorporation of the function in the input languages of several string constraint solvers. Recently, it was shown that any theory of strings containing the string-replace function (even the most restricted version where pattern/replacement strings are both constant strings) becomes undecidable if we do not impose some kind of straight-line (aka acyclicity) restriction on the formulas. Despite this, the straight-line restriction is still practically sensible since this condition is typically met by string constraints that are generated by symbolic execution. In this paper, we provide the first systematic study of straight-line string constraints with the string-replace function and the regular constraints as the basic operations. We show that a large class of such constraints (i.e. when only a constant string or a regular expression is permitted in the pattern) is decidable. We note that the string-replace function, even under this restriction, is sufficiently powerful for expressing the concatenation operator and much more (e.g. extensions of regular expressions with string variables). This gives us the most expressive decidable logic containing concatenation, replace, and regular constraints under the same umbrella. Our decision procedure for the straight-line fragment follows an automata-theoretic approach, and is modular in the sense that the string-replace terms are removed one by one to generate more and more regular constraints, which can then be discharged by the state-of-the-art string constraint solvers. We also show that this fragment is, in a way, a maximal decidable subclass of the straight-line fragment with string-replace and regular constraints. To this end, we show undecidability results for the following two extensions: (1) variables are permitted in the pattern parameter of the replace function, (2) length constraints are permitted.

AB - The theory of strings with concatenation has been widely argued as the basis of constraint solving for verifying string-manipulating programs. However, this theory is far from adequate for expressing many string constraints that are also needed in practice; for example, the use of regular constraints (pattern matching against a regular expression), and the string-replace function (replacing either the first occurrence or all occurrences of a ``pattern'' string constant/variable/regular expression by a ``replacement'' string constant/variable), among many others. Both regular constraints and the string-replace function are crucial for such applications as analysis of JavaScript (or more generally HTML5 applications) against cross-site scripting (XSS) vulnerabilities, which motivates us to consider a richer class of string constraints. The importance of the string-replace function (especially the replace-all facility) is increasingly recognised, which can be witnessed by the incorporation of the function in the input languages of several string constraint solvers. Recently, it was shown that any theory of strings containing the string-replace function (even the most restricted version where pattern/replacement strings are both constant strings) becomes undecidable if we do not impose some kind of straight-line (aka acyclicity) restriction on the formulas. Despite this, the straight-line restriction is still practically sensible since this condition is typically met by string constraints that are generated by symbolic execution. In this paper, we provide the first systematic study of straight-line string constraints with the string-replace function and the regular constraints as the basic operations. We show that a large class of such constraints (i.e. when only a constant string or a regular expression is permitted in the pattern) is decidable. We note that the string-replace function, even under this restriction, is sufficiently powerful for expressing the concatenation operator and much more (e.g. extensions of regular expressions with string variables). This gives us the most expressive decidable logic containing concatenation, replace, and regular constraints under the same umbrella. Our decision procedure for the straight-line fragment follows an automata-theoretic approach, and is modular in the sense that the string-replace terms are removed one by one to generate more and more regular constraints, which can then be discharged by the state-of-the-art string constraint solvers. We also show that this fragment is, in a way, a maximal decidable subclass of the straight-line fragment with string-replace and regular constraints. To this end, we show undecidability results for the following two extensions: (1) variables are permitted in the pattern parameter of the replace function, (2) length constraints are permitted.

KW - String Constraints, ReplaceAll, Decision Procedures, Constraint Solving, Straight-Line Programs

U2 - 10.1145/3158091

DO - 10.1145/3158091

M3 - Conference contribution

VL - 2

SP - 1

EP - 29

BT - Proceedings of the ACM on Programming Languages

PB - ACM

T2 - Principles of Programming Languages

Y2 - 10 January 2018 through 12 January 2018

ER -