What are regular expressions?
- Regular expressions are a powerful language for “filtering” and “matching” text patterns.
- A regular expression is a sequence of characters that defines a text pattern.
What are text patterns?
Any sequence of digits, letters, or other characters can be a text pattern, for example:
- “12345 Test-PLZ”
- “0176 / 35206462”
- “Yours sincerely”
What do I need regex for?
- Match pattern
If the pattern is present, perform action X:
“12345 Test-PLZ”
- Find and replace patterns
If the pattern is present, replace it with pattern X:
“0176 / 35206462”
- Extract pattern
If the pattern is present, store it somewhere else (a database, a variable):
“Yours sincerely”
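The three uses above can be sketched in Python with the standard `re` module. This is only an illustration: the patterns, the replacement character, and the variable names are assumptions chosen to fit the sample strings, not a fixed recipe.

```python
import re

# Sample inputs taken from the examples above; the surrounding letter text is made up.
postal_line = "12345 Test-PLZ"
phone = "0176 / 35206462"
letter = "Thank you for your order.\nYours sincerely"

# 1. Match: check whether the line starts with five digits (assumed postal code format).
has_postal_code = re.match(r"\d{5}\b", postal_line) is not None

# 2. Find and replace: swap the " / " separator for a hyphen (assumed target format).
normalised_phone = re.sub(r"\s*/\s*", "-", phone)

# 3. Extract: save the matched closing phrase in a variable.
found = re.search(r"Yours sincerely", letter)
closing = found.group(0) if found else None

print(has_postal_code, normalised_phone, closing)
```

In a real script, the matching step would typically guard the action in an `if` statement, and the extracted value would be written to wherever it is needed.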
Specific application
Popular places where you can come across regex and where you can apply this knowledge:
- Web applications
- Unix scripts
The following assumes that you have such a use case and that you already know how regular expressions are used in the environment of your choice.
It is best to test your expression “raw” first. You can use an online tool like regextester for this.
Single-element Regular Expressions
The simplest form of regex is a search pattern that matches only a single element. If you are not looking for one specific character, such a single-element regular expression can easily be defined with a character class. The following expression accepts any one of the digits “1”, “2”, “3”, “4”, “5”, “6” or “7” as a match:
[1234567]
Since the digits follow one another directly in this case, the following shorter notation is also possible:
[1-7]
If the regular expression should exclude the digit “4”, you can still use the range notation with the hyphen by combining two ranges:
[1-35-7]
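To check these three character classes “raw”, a short Python sketch can run each one against the same test string. The test string and variable names are assumptions made for illustration; `re.findall` returns every single character that the class accepts.

```python
import re

# Hypothetical test input: all ten digits in order.
digits = "0123456789"

# Explicit list and range notation should accept the same digits.
explicit = re.findall(r"[1234567]", digits)
ranged = re.findall(r"[1-7]", digits)

# Two ranges side by side: 1 to 3 and 5 to 7, so "4" is left out.
without_four = re.findall(r"[1-35-7]", digits)

print(explicit == ranged)   # True
print(without_four)         # ['1', '2', '3', '5', '6', '7']
```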
Multi-element regular expressions
In a multi-element regular expression, you can also use character classes to allow a selection of different matches. For example, if the expression should cover two elements, each of which can take different values, simply string together two corresponding character classes:
[1-7][a-c]
The first element, a digit between “1” and “7”, is followed by one of the letters “a”, “b” or “c”. Character classes are case-sensitive, so lower case is mandatory here. Before you deal with modifiers, you can include capital letters with the following small change to the expression:
[1-7][a-cA-C]
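The difference between the two expressions can be seen in a small Python sketch. The sample strings are made up for illustration, and `fullmatch` is used so that each sample has to consist of exactly the two elements.

```python
import re

lower_only = re.compile(r"[1-7][a-c]")
either_case = re.compile(r"[1-7][a-cA-C]")

# Hypothetical inputs: valid, upper-case, digit out of range, letter out of range.
samples = ["3b", "3B", "8a", "5d", "7C"]

lower_hits = [s for s in samples if lower_only.fullmatch(s)]
either_hits = [s for s in samples if either_case.fullmatch(s)]

print(lower_hits)    # ['3b']
print(either_hits)   # ['3b', '3B', '7C']
```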