---
name: PCRE
filename: pcre.txt
contributors:
    - ["Sachin Divekar", "http://github.com/ssd532"]

---

A regular expression (regex or regexp for short) is a special text string that describes a search pattern. For example, to extract the protocol from a URL string we can use `/^[a-z]+:/`, which matches `http:` in `http://github.com/`.
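
As a rough illustration, the URL example can be tried in Python. This sketch assumes Python's built-in `re` module, which is not PCRE itself but accepts this particular pattern with the same meaning. The variable names are only for illustration.

```python
import re

url = "http://github.com/"

# The slashes in /^[a-z]+:/ are delimiters used by some tools;
# they are not part of the pattern itself.
match = re.search(r"^[a-z]+:", url)

# One or more lowercase letters at the start of the string, then a colon.
protocol = match.group(0) if match else None
print(protocol)  # http:
```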

PCRE (Perl Compatible Regular Expressions) is a C library implementing regex. It was written in 1997, when Perl was the de facto choice for complex text-processing tasks. The pattern syntax used in PCRE closely resembles Perl's. PCRE syntax is used in many large projects, including PHP, Apache, and R.


There are two different sets of metacharacters:

* Those that are recognized anywhere in the pattern except within square brackets

```
  \      general escape character with several uses
  ^      assert start of string (or line, in multiline mode)
  $      assert end of string (or line, in multiline mode)
  .      match any character except newline (by default)
  [      start character class definition
  |      start of alternative branch
  (      start subpattern
  )      end subpattern
  ?      extends the meaning of (
         also 0 or 1 quantifier
         also quantifier minimizer
  *      0 or more quantifier
  +      1 or more quantifier
         also "possessive quantifier"
  {      start min/max quantifier
```
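
A few of these metacharacters in action, as a minimal sketch. It assumes Python's `re` module rather than the PCRE library; the constructs used here behave the same way in both, but that equivalence does not hold for every PCRE feature. The sample strings are made up for illustration.

```python
import re

# | separates alternatives, ( ) groups them, ? makes the "s" optional.
alt = re.search(r"(cat|dog)s?", "two dogs")
whole, group = alt.group(0), alt.group(1)  # "dogs", "dog"

# ^ and $ anchor the match; {2,4} is a min/max quantifier.
year = re.search(r"^\d{2,4}$", "2004").group(0)  # "2004"

# + is greedy by default; a trailing ? minimizes it.
greedy = re.search(r"<.+>", "<a><b>").group(0)  # "<a><b>"
lazy = re.search(r"<.+?>", "<a><b>").group(0)   # "<a>"

# . does not match a newline by default.
dot = re.search(r"a.b", "a\nb")  # None

# \ removes the special meaning of the following metacharacter.
literal = re.search(r"1\+1", "1+1=2").group(0)  # "1+1"
```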

* Those that are recognized within square brackets. The part of a pattern inside square brackets is called a character class.

```
  \      general escape character
  ^      negate the class, but only if the first character
  -      indicates character range
  [      POSIX character class (only if followed by POSIX syntax)
  ]      terminates the character class
```
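
The class metacharacters can be sketched as follows, again assuming Python's `re` module as a stand-in for PCRE. POSIX classes such as `[[:digit:]]` are left out because `re` does not implement them. The input strings are hypothetical.

```python
import re

# A plain class matches any one of the listed characters.
vowels = re.findall(r"[aeiou]", "regex")  # ['e', 'e']

# - indicates a range of characters.
hex_run = re.search(r"[0-9a-f]+", "x=7fa3z").group(0)  # "7fa3"

# ^ negates the class, but only as the first character...
non_digits = re.findall(r"[^0-9]+", "66.249")  # ['.']

# ...anywhere else it is an ordinary character.
caret = re.findall(r"[a^]", "a^b")  # ['a', '^']

# \ escapes characters that would otherwise be special in a class.
brackets = re.findall(r"[\[\]]", "[18/Sep]")  # ['[', ']']
```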

PCRE provides some generic character types, which are also called character classes.

```
  \d     any decimal digit
  \D     any character that is not a decimal digit
  \h     any horizontal white space character
  \H     any character that is not a horizontal white space character
  \s     any white space character
  \S     any character that is not a white space character
  \v     any vertical white space character
  \V     any character that is not a vertical white space character
  \w     any "word" character
  \W     any "non-word" character
```
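
A short sketch of the character types, assuming Python's `re` module. It covers only `\d`, `\s`, `\w` and their negations; `\h` and `\H` are not available in `re`, and `\v` means something narrower there, so those are omitted. The sample text is invented for the example.

```python
import re

line = "GET /robots.txt\tHTTP/1.0\n"

digits = re.findall(r"\d", line)      # ['1', '0']
words = re.findall(r"\w+", line)      # ['GET', 'robots', 'txt', 'HTTP', '1', '0']
spaces = re.findall(r"\s", line)      # [' ', '\t', '\n']
non_space = re.findall(r"\S+", line)  # ['GET', '/robots.txt', 'HTTP/1.0']

# The uppercase form is the negation of the lowercase one.
non_word = re.findall(r"\W", "a-b c")           # ['-', ' ']
non_digit = re.findall(r"\D+", "18/Sep/2004")   # ['/Sep/']
```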

## Examples

We will test our examples on the following string:

```
66.249.64.13 - - [18/Sep/2004:11:07:48 +1000] "GET /robots.txt HTTP/1.0" 200 468 "-" "Googlebot/2.1"
```

It is a line from a standard Apache access log.

| Regex | Result          | Comment |
| :---- | :-------------- | :------ |
| `GET`   | GET | GET matches the characters GET literally (case sensitive) |
| `\d+\.\d+\.\d+\.\d+` | 66.249.64.13 | `\d+` matches a digit [0-9] one or more times, as specified by the `+` quantifier; `\.` matches `.` literally |
| `(\d+\.){3}\d+` | 66.249.64.13 | `(\d+\.){3}` matches the group `(\d+\.)` exactly three times; the final `\d+` matches the last number |
| `\[.+\]` | [18/Sep/2004:11:07:48 +1000] | `\[` and `\]` match the brackets literally; `.+` matches one or more of any character (except newline) |
| `^\S+` | 66.249.64.13 | `^` means start of the line, `\S+` matches one or more non-whitespace characters |
| `\+[0-9]+` | +1000 | `\+` matches the character `+` literally. The character class `[0-9]` matches a single digit. The same can be achieved using `\+\d+` |
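
The patterns from the table, with every literal dot escaped, can be checked against the log line with a small script. This is a sketch that assumes Python's `re` module, which handles these particular patterns the same way PCRE does; the names `log`, `patterns`, and `results` are illustrative.

```python
import re

log = ('66.249.64.13 - - [18/Sep/2004:11:07:48 +1000] '
       '"GET /robots.txt HTTP/1.0" 200 468 "-" "Googlebot/2.1"')

patterns = [
    r"GET",
    r"\d+\.\d+\.\d+\.\d+",
    r"(\d+\.){3}\d+",
    r"\[.+\]",
    r"^\S+",
    r"\+[0-9]+",
]

# Map each pattern to the text of its first match in the log line.
results = {p: re.search(p, log).group(0) for p in patterns}

for pattern, matched in results.items():
    print(f"{pattern:22} -> {matched}")
```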

## Further Reading
[Regex101](https://regex101.com/) - Regular Expression tester and debugger