Researchers from the Penn State College of Information Sciences and Technology have developed a tool that could have a profound impact on the quality of everyday software, potentially resulting in less service downtime, fewer data breaches and better privacy protection.
Software engineers are often tasked with writing regular expressions, patterns in a formal language that tell a computer which strings to match. These are commonly used on the back end of search engines and in online forms, for example to check that a password contains a required combination of letters, numbers and special characters.
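To make the password case concrete, here is a minimal sketch in Python. The policy it encodes (at least eight characters, with at least one letter, one digit and one special character) and the names `STRONG_PASSWORD` and `is_strong` are assumptions chosen for illustration; they do not come from the researchers' work.

```python
import re

# Assumed policy, for illustration only: at least 8 characters, containing
# at least one ASCII letter, one digit and one other ("special") character.
STRONG_PASSWORD = re.compile(
    r"(?=.*[A-Za-z])"      # lookahead: some letter appears somewhere
    r"(?=.*[0-9])"         # lookahead: some digit appears somewhere
    r"(?=.*[^A-Za-z0-9])"  # lookahead: some character that is neither
    r".{8,}"               # the whole input is 8 or more characters
)


def is_strong(password: str) -> bool:
    """Return True if the entire password satisfies the assumed policy."""
    return STRONG_PASSWORD.fullmatch(password) is not None
```

Even a short pattern like this packs several requirements into one line, which is part of why a slip in a single character class can go unnoticed.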
While widely used for security purposes, regular expressions are complex to verify, according to Xiao Liu, lead researcher on the project and a doctoral student in the college.
To address this challenge, the research team has developed a framework that lets software engineers formally verify regular expressions using natural language, the kind of wording used in everyday conversation, instead of the more formal and complex languages typically used in computer programming.
“For example, if you want to enforce a regular expression to match digits only, you simply type in ‘this regular expression only matches digits,’” Liu explained. “Our framework will automatically translate the natural language requirement into formal specifications and check this property on a target regular expression.”
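The digits-only property Liu describes can be illustrated with a rough sketch in Python. This is not the researchers' framework and it is not formal verification: it simply tries every string up to a small length over a small test alphabet and looks for a counterexample. The function name `find_non_digit_match`, the default alphabet and the length bound are all hypothetical choices made for this example.

```python
import re
from itertools import product

DIGITS = set("0123456789")


def find_non_digit_match(pattern, alphabet="01a-", max_len=4):
    """Bounded search for a counterexample to "only matches digits".

    Hypothetical helper: returns the first string (over `alphabet`, up to
    `max_len` characters) that the pattern fully matches even though it
    contains a non-digit, or None if no such string is found in that range.
    A None result is evidence, not proof, that the property holds.
    """
    compiled = re.compile(pattern)
    for length in range(max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            has_non_digit = any(c not in DIGITS for c in candidate)
            if has_non_digit and compiled.fullmatch(candidate):
                return candidate
    return None
```

Calling `find_non_digit_match(r"[0-9]+")` finds no counterexample, while a pattern that accidentally admits a letter, such as `[0-9a]+`, is caught as soon as the search reaches a string containing that letter. A formal approach would instead establish the property for all possible inputs rather than a bounded sample.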
In their paper, “A Lightweight Framework for Regex Verification,” the researchers write that “because of the gradually increasing adoptions of agile software development, requirements and solutions evolve quickly, which causes the software development [to be] more error prone. In this situation, input validations using regular expressions are constantly being found inefficient due to errors in the constructed expressions, resulting in security vulnerabilities.”
According to the researchers, verifying the correctness of regular expressions is critical in software development because it keeps a piece of software at its current stage of development consistent with its initial requirements. The team’s framework makes this step less complicated and, in turn, more efficient than existing verification methods.
“Our framework will be especially useful for validating and debugging regular expressions in [these] systems,” said Liu.
Liu collaborated with former IST doctoral student Yufei Jiang and associate professor of IST Dinghao Wu on the project. For their work, the researchers won the best paper award at the 19th Institute of Electrical and Electronics Engineers International Symposium on High Assurance Systems Engineering, held Jan. 3-5 in Hangzhou, China.