# Regular Expressions Quick Reference

Several DeltaWalker features, specifically the file and folder comparison filters and the Find/Replace dialog, use regular expressions to search for and match text. Regular expressions can describe a wide variety of patterns and specify precisely which text should match. They are widely supported and well documented in publicly available references, which makes them a natural choice.

This section gives a brief introduction to the regular expression syntax. For additional pointers, please see the references listed in the See Also section below.

## Literals

Every character matches itself except the special characters listed below. A special character matches itself only when it is escaped, that is, when a backslash (`\`) is placed directly before it:

`\ . [ ] ^ $ ? * + { } | ( )`
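The effect of escaping can be sketched in code. The example below assumes the Java `java.util.regex` engine, whose syntax matches the constructs in this reference; whether DeltaWalker uses exactly this engine is an assumption, and the class and method names are purely illustrative. Note that Java string literals require each backslash to be doubled, whereas a pattern typed into a dialog would contain a single backslash.

```java
import java.util.regex.Pattern;

class LiteralMatching {
    // Returns true if the regex matches somewhere inside the text.
    static boolean contains(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // An unescaped dot is special: it matches any character.
        System.out.println(contains("1.5", "1x5"));   // true
        // An escaped dot matches only a literal dot.
        System.out.println(contains("1\\.5", "1x5")); // false
        System.out.println(contains("1\\.5", "1.5")); // true
        // Pattern.quote escapes every special character of a string at once.
        String literal = Pattern.quote("(a+b)*");
        System.out.println(contains(literal, "x = (a+b)*c")); // true
    }
}
```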

## Literal escapes

The following table lists the special uses of the backslash (`\`) character, combined with other literals, to match specific characters:

Construct | Matches |
---|---|
`\t` | The tab character |
`\n` | The newline (line-feed) character |
`\r` | The carriage-return character |
`\f` | The form-feed character |
`\a` | The bell (alert) character |
`\e` | The escape character |
`\0n` | The character with octal value 0n (`0 <= n <= 7`) |
`\0nn` | The character with octal value 0nn (`0 <= n <= 7`) |
`\0mnn` | The character with octal value 0mnn (`0 <= m <= 3, 0 <= n <= 7`) |
`\xhh` | The character with hexadecimal value 0xhh |
`\uhhhh` | The character with hexadecimal value 0xhhhh |
`\cx` | The control character corresponding to x |
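A few of these escapes can be tried out as follows. This is a minimal sketch that assumes the Java `java.util.regex` engine (an assumption about the underlying implementation); `EscapeDemo` and `matches` are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Returns true if the regex matches the entire text.
    static boolean matches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        // Each regex below reaches the engine with a single backslash;
        // the doubling is only required by Java string literals.
        System.out.println(matches("a\\tb", "a\tb"));  // true: tab between a and b
        System.out.println(matches("\\x41", "A"));     // true: hexadecimal 41
        System.out.println(matches("\\0101", "A"));    // true: octal 101
        System.out.println(matches("\\u0041", "A"));   // true: four hex digits
        // Control-A is the character with code 1.
        System.out.println(matches("\\cA", String.valueOf((char) 1))); // true
        System.out.println(matches("\\x41", "B"));     // false
    }
}
```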

## Character classes

The dot (`.`) character matches any single character. It is the simplest and most widely used character class. Character classes are sub-expressions with a compact syntax that match one character out of a set:

Construct | Matches |
---|---|
`[abc]` | a, b, or c (simple class) |
`[^abc]` | Any character except a, b, or c (negation) |
`[a-zA-Z]` | a through z or A through Z, inclusive (range) |
`[a-d[m-p]]` | a through d, or m through p: `[a-dm-p]` (union) |
`[a-z&&[def]]` | d, e, or f (intersection) |
`[a-z&&[^bc]]` | a through z, except for b and c: `[ad-z]` (subtraction) |
`[a-z&&[^m-p]]` | a through z, and not m through p: `[a-lq-z]` (subtraction) |
`\d` | A digit: `[0-9]` |
`\D` | A non-digit: `[^0-9]` |
`\s` | A whitespace character: `[ \t\n\x0B\f\r]` |
`\S` | A non-whitespace character: `[^\s]` |
`\w` | A word character: `[a-zA-Z_0-9]` |
`\W` | A non-word character: `[^\w]` |
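The union, intersection, and subtraction forms are easier to follow with concrete inputs. The sketch below assumes the Java `java.util.regex` engine, which supports this nested class syntax; `CharClassDemo` and `matches` are illustrative names, not part of DeltaWalker.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Returns true if the regex matches the entire text.
    static boolean matches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        // Union: a through d, or m through p.
        System.out.println(matches("[a-d[m-p]]", "n"));    // true
        System.out.println(matches("[a-d[m-p]]", "f"));    // false
        // Intersection: only d, e, or f.
        System.out.println(matches("[a-z&&[def]]", "e"));  // true
        System.out.println(matches("[a-z&&[def]]", "a"));  // false
        // Subtraction: a through z without b and c.
        System.out.println(matches("[a-z&&[^bc]]", "b"));  // false
        System.out.println(matches("[a-z&&[^bc]]", "d"));  // true
        // Predefined classes.
        System.out.println(matches("\\d\\d", "42"));       // true
        System.out.println(matches("\\w\\w\\w", "a_1"));   // true
        System.out.println(matches("\\s", "\t"));          // true
        System.out.println(matches("\\S", " "));           // false
    }
}
```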

## Boundary matchers

One special meaning of the `^` character has already appeared in the syntax for negated character classes. Its second, equally common meaning is to mark the beginning of a line: it does not match an actual character but the position where a line starts. Other expressions that signal boundaries are:

Construct | Matches |
---|---|
`$` | The end of a line |
`\b` | A word boundary |
`\B` | A non-word boundary |
`\A` | The beginning of the input |
`\G` | The end of the previous match |
`\Z` | The end of the input but for the final terminator, if any |
`\z` | The end of the input |
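The difference between line boundaries and input boundaries shows up most clearly on multi-line text. The following sketch assumes the Java `java.util.regex` engine, where `^` and `$` match at every line only when the `MULTILINE` flag is set; whether DeltaWalker enables an equivalent mode is an assumption. `BoundaryDemo` and `findAll` are hypothetical helper names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Collects the text of every match of the pattern in the input.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Word boundaries keep "cat" from matching inside "concat".
        System.out.println(findAll(Pattern.compile("\\bcat\\b"), "cat concat cat")); // [cat, cat]

        String lines = "one two\nthree four\n";
        // With MULTILINE, ^ and $ match at the start and end of every line.
        System.out.println(findAll(Pattern.compile("^\\w+", Pattern.MULTILINE), lines)); // [one, three]
        System.out.println(findAll(Pattern.compile("\\w+$", Pattern.MULTILINE), lines)); // [two, four]
        // \A always means the start of the whole input.
        System.out.println(findAll(Pattern.compile("\\A\\w+", Pattern.MULTILINE), lines)); // [one]
        // \Z ignores the final line terminator, \z does not.
        System.out.println(findAll(Pattern.compile("four\\Z"), lines)); // [four]
        System.out.println(findAll(Pattern.compile("four\\z"), lines)); // []
    }
}
```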

## Quantifiers

Quantifiers specify how many times an expression must match. A quantifier is written directly after the expression it applies to. The following table lists the most commonly used quantified forms:

Construct | Matches |
---|---|
`X?` | X, once or not at all |
`X*` | X, zero or more times |
`X+` | X, one or more times |
`X{n}` | X, exactly n times |
`X{n,}` | X, at least n times |
`X{n,m}` | X, at least n but not more than m times |
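Each row of the table can be checked against short inputs. The example is a sketch under the assumption of the Java `java.util.regex` engine; `QuantifierDemo` and `matches` are illustrative names only.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Returns true if the regex matches the entire text.
    static boolean matches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        // ? : the "u" may appear once or not at all.
        System.out.println(matches("colou?r", "color"));  // true
        System.out.println(matches("colou?r", "colour")); // true
        // * : zero or more "b" characters after "a".
        System.out.println(matches("ab*", "a"));          // true
        // + : at least one "b" is required.
        System.out.println(matches("ab+", "a"));          // false
        System.out.println(matches("ab+", "abbb"));       // true
        // {n} : exactly three digits.
        System.out.println(matches("\\d{3}", "123"));     // true
        System.out.println(matches("\\d{3}", "12"));      // false
        // {n,} : two or more.
        System.out.println(matches("a{2,}", "aaaaa"));    // true
        // {n,m} : between two and three.
        System.out.println(matches("a{2,3}", "aaa"));     // true
        System.out.println(matches("a{2,3}", "aaaa"));    // false
    }
}
```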

### Logical alternation

When different expressions may match at a given position, the `|` character separates the alternatives. For example, matching either X or Y is written as:

`X|Y`
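Alternation can be illustrated with two short words. This sketch assumes the Java `java.util.regex` engine, and the names `AlternationDemo` and `matches` are hypothetical. It also shows that each alternative spans the whole expression on its side of the separator.

```java
import java.util.regex.Pattern;

class AlternationDemo {
    // Returns true if the regex matches the entire text.
    static boolean matches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        // Either alternative is accepted.
        System.out.println(matches("cat|dog", "cat")); // true
        System.out.println(matches("cat|dog", "dog")); // true
        System.out.println(matches("cat|dog", "cow")); // false
        // The alternatives are "ab" and "cd" as wholes,
        // not "a", then "b or c", then "d".
        System.out.println(matches("ab|cd", "cd"));    // true
        System.out.println(matches("ab|cd", "acd"));   // false
    }
}
```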

### Groups

Parentheses group the elements of the regular expression into distinct sub-expressions so that quantifiers and logical alternation can be applied to them.
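The effect of grouping is visible when the same pattern is written with and without parentheses. The sketch below assumes the Java `java.util.regex` engine; `GroupDemo` and `matches` are illustrative names, not part of DeltaWalker.

```java
import java.util.regex.Pattern;

class GroupDemo {
    // Returns true if the regex matches the entire text.
    static boolean matches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        // Without a group, + applies only to "b".
        System.out.println(matches("ab+", "ababab"));       // false
        // With a group, + applies to the whole "ab" sequence.
        System.out.println(matches("(ab)+", "ababab"));     // true
        // A group limits alternation to part of the expression.
        System.out.println(matches("gr(a|e)y", "grey"));    // true
        System.out.println(matches("gr(a|e)y", "gray"));    // true
        System.out.println(matches("gra|ey", "grey"));      // false
        // Quantifier and alternation combined on groups.
        System.out.println(matches("(cat|dog)s?", "dogs")); // true
    }
}
```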