BKF03 - Regular Expressions

Brian Ciccolo
• Definition
• Uses – within Aspen and beyond
• Matching
• Replacing
What’s a Regular Expression?
In computing, regular expressions, also referred to as regex
or regexp, provide a concise and flexible means for
matching strings of text, such as particular characters,
words, or patterns of characters. A regular expression is
written in a formal language that can be interpreted by a
regular expression processor, a program that either serves
as a parser generator or examines text and identifies parts
that match the provided specification.
Why Use a Regex?
• Validate data entry
Example: Verify the format of a date field is mm/dd/yyyy
• Find/replace on steroids
Example: Reformat phone numbers to (###) ###-####
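As a rough illustration of the validation use case, the date check above might be sketched in Python as follows. The pattern and the function name are assumptions made for this example, not something defined in the session, and the check covers only the shape of the value.

```python
import re

# Hypothetical pattern for the mm/dd/yyyy check. It verifies the shape only,
# not whether the month or day is in a valid range.
DATE_FORMAT = re.compile(r"\d{2}/\d{2}/\d{4}")

def looks_like_date(value: str) -> bool:
    """Return True when the whole value has the form mm/dd/yyyy."""
    return DATE_FORMAT.fullmatch(value) is not None

print(looks_like_date("03/15/2011"))  # True
print(looks_like_date("3/15/11"))     # False
```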
Regex Use in Aspen
• Data validation
o Date, time field input
o Validation rules (new in 3.0 – see session TEC07)
• Find/replace on steroids
o System Log filter
o Field formatting
Regex Examples Using Notepad++
Select this option for our examples
Select the proper Search Mode
Matching – The Basics
• Literals - plain old text
• Classes
[abc]       a, b, or c
[a-z]       Any lowercase letter
[a-zA-Z]    Any lowercase or uppercase letter
[0-9]       Any digit, 0 through 9
[^a-zA-Z]   Not a letter (could be a digit or punctuation)
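The character classes described above ([abc], [a-z], [a-zA-Z], [0-9], [^a-zA-Z]) can be tried out with a short Python sketch. The sample text is made up for the example.

```python
import re

text = "Room B12, floor 3!"

abc = re.findall(r"[abc]", "cabbage")          # a, b, or c
lower = re.findall(r"[a-z]", text)             # any lowercase letter
letters = re.findall(r"[a-zA-Z]", text)        # any letter, either case
digits = re.findall(r"[0-9]", text)            # any digit
non_letters = re.findall(r"[^a-zA-Z]", text)   # anything that is not a letter

print(abc)          # ['c', 'a', 'b', 'b', 'a']
print(digits)       # ['1', '2', '3']
print(non_letters)  # [' ', '1', '2', ',', ' ', ' ', '3', '!']
```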
Matching – Predefined Classes
.    Any character
\d   Any digit: [0-9]
\D   Any non-digit: [^0-9]
\s   A whitespace character (space, tab, newline)
\S   A non-whitespace character: [^\s]
\w   A word character: [a-zA-Z_0-9]
\W   A non-word character (e.g., punctuation): [^\w]
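A small Python sketch of the predefined classes (., \d, \D, \s, \S, \w, \W) follows. The sample string and the helper name are invented for the example. The re.ASCII flag is used so that \d, \s, and \w stay within the ASCII ranges listed above, since Python otherwise also matches Unicode digits and letters.

```python
import re

sample = "ID_7 ok?"

def find(pattern: str) -> list:
    """Return every single-character match of pattern in the sample."""
    return re.findall(pattern, sample, flags=re.ASCII)

print(find(r"\d"))           # ['7']
print(find(r"\s"))           # [' ']
print(find(r"\W"))           # [' ', '?']
print("".join(find(r"\w")))  # ID_7ok
print("".join(find(r"\S")))  # ID_7ok?
print(len(find(r".")))       # 8
```

Note that in Python the dot does not match a newline unless the re.DOTALL flag is set.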
Matching – Quantifiers
?       Matches 0 or 1 time
        (Not supported by Notepad++)
+       Matches 1 or more times
*       Matches 0 or more times
{n,m}   Matches at least n times but no more than m times
        (Not supported by Notepad++)
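The four quantifiers (?, +, *, {n,m}) might be exercised like this in Python. The sample strings are made up, and the \b word boundaries in the last pattern are an addition so that longer runs of digits are not partially matched.

```python
import re

optional = re.findall(r"colou?r", "color colour")          # u appears 0 or 1 time
one_or_more = re.findall(r"\d+", "a1 b22 c333")            # 1 or more digits
zero_or_more = re.findall(r"ab*", "a ab abbb")             # a, then 0 or more b
bounded = re.findall(r"\b\d{2,3}\b", "1 22 333 4444")      # 2 to 3 digits

print(optional)      # ['color', 'colour']
print(one_or_more)   # ['1', '22', '333']
print(zero_or_more)  # ['a', 'ab', 'abbb']
print(bounded)       # ['22', '333']
```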
Matching – Greedy vs. Lazy
• Quantifiers are “greedy” by default –
they match as many characters as possible
• Sometimes you want to match the fewest
characters possible – enter “lazy” quantifiers
Greedy    Lazy Equivalent*
?         ??
+         +?
*         *?
{n,m}     {n,m}?
* Not supported by Notepad++
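To see the difference between greedy and lazy matching, here is a minimal Python sketch. The markup string is an invented sample. Appending ? to a quantifier makes it lazy.

```python
import re

markup = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last > in the text, so there is one long match.
greedy = re.findall(r"<.+>", markup)

# Lazy: .+? stops at the first > it can, so each tag is matched separately.
lazy = re.findall(r"<.+?>", markup)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```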
Replacing – Groups
• “Groups” in the regex can be used in the
replacement value
• Delimited with parentheses in the regex
• Identified with \n where n is the nth
group in the original expression
• \0 represents the entire match
(not supported in Notepad++)
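A short Python sketch of groups in a replacement value follows. The sample strings are assumptions for the example. Python's re.sub accepts the same \n group references. For the entire match, Python uses \g<0> rather than \0.

```python
import re

# Two groups, swapped in the replacement: "Last, First" becomes "First Last".
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Ciccolo, Brian")

# \g<0> stands for the entire match in Python replacement strings.
bracketed = re.sub(r"\w+", r"[\g<0>]", "one two")

print(swapped)    # Brian Ciccolo
print(bracketed)  # [one] [two]
```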
Reformatting Dates
• Change mm/dd/yyyy to yyyy-mm-dd
• Regex: (\d+)/(\d+)/(\d+)
• Replacement: \3-\1-\2
Step 2 – pad the single digits!
• Regex: -(\d)([-"])
• Replacement: -0\1\2
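The two date steps can be replayed in Python as a sketch. The quoted CSV-style row is an assumed sample. One detail worth knowing: when both the month and the day are single digits, the step 2 regex consumes the dash between them, so a single pass pads only the month. Running the replacement a second time, or using a lookahead as in the last line (a variation, not the regex from the slide), handles both.

```python
import re

row = '"Smith","3/5/2011"'

# Step 1: mm/dd/yyyy to yyyy-mm-dd
step1 = re.sub(r"(\d+)/(\d+)/(\d+)", r"\3-\1-\2", row)
print(step1)  # "Smith","2011-3-5"

# Step 2: pad the single digits
pad = re.compile(r'-(\d)([-"])')
first_pass = pad.sub(r"-0\1\2", step1)
print(first_pass)   # "Smith","2011-03-5"   (day still unpadded)
second_pass = pad.sub(r"-0\1\2", first_pass)
print(second_pass)  # "Smith","2011-03-05"

# Variation: a lookahead leaves the following dash or quote unconsumed,
# so one pass pads both the month and the day.
one_pass = re.sub(r'-(\d)(?=[-"])', r"-0\1", step1)
print(one_pass)     # "Smith","2011-03-05"
```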
Reformatting Phone Numbers (v1)
• Wrap the area code in parentheses
• Regex: "(\d\d\d)-
• Replacement: "(\1)
Ends with a space!
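The same replacement, sketched in Python against an assumed quoted sample value. The replacement string ends with a space, which separates the area code from the rest of the number.

```python
import re

row = '"617-555-1234"'

# The leading quote anchors the match to the start of the field.
wrapped = re.sub(r'"(\d\d\d)-', r'"(\1) ', row)

print(wrapped)  # "(617) 555-1234"
```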
Reformatting Phone Numbers (v2)
• Strip punctuation (numbers only)
• Regex: \((\d+)\) (\d+)-(\d+)
• Replacement: \1\2\3
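A quick Python check of the stripping regex, using a made-up phone number. The literal parentheses are escaped as \( and \), while the unescaped ones define the groups.

```python
import re

formatted = "(617) 555-1234"

digits_only = re.sub(r"\((\d+)\) (\d+)-(\d+)", r"\1\2\3", formatted)

print(digits_only)  # 6175551234
```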
Reformatting Social Security Numbers
• Format SSN as ###-##-####
• Do it in Aspen!
• Define a record in the Regular Expression
Library table
• Set the regex on the Person ID field in the
Data Dictionary
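Outside Aspen, the same formatting idea could be sketched in Python as below. This is not the Aspen configuration from the session. The pattern, the function name, and the assumption that the stored value is nine bare digits are all hypothetical.

```python
import re

# Assumed input: exactly nine digits with no punctuation.
SSN_DIGITS = re.compile(r"(\d{3})(\d{2})(\d{4})")

def format_ssn(raw: str) -> str:
    """Return raw as ###-##-#### when it is nine digits; otherwise unchanged."""
    match = SSN_DIGITS.fullmatch(raw)
    if match is None:
        return raw
    return match.expand(r"\1-\2-\3")

print(format_ssn("123456789"))  # 123-45-6789
print(format_ssn("12345"))      # 12345
```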
Define a Regular Expression
Regex and format properties
Update the Data Dictionary
Link to the regex
Verify the Results
• Wikipedia Entry
• Regular Expressions Cheat Sheet (V2)
• Java regex support
• Notepad++ text editor and regex support
Thank you.
[email protected]
