What are regular expressions?


A regular expression, or regex, is a way of identifying strings within a text according to a set of rules. A regex is itself written as a string made up of ordinary and special characters.

Rules:
1. The basic building block is a single ordinary character, which matches itself.
2. A bracket expression is a list of characters between “[” and “]”; it matches any single character in that list.
3. If the first character of the list is ^, the bracket expression matches any single character that is not in the list.
Example: the regex [0123456789] matches a single digit, and [^0123456789] matches any single character that is not a digit.
4. Inside a bracket expression, a “range expression” consists of two characters separated by - (hyphen); it matches any single character that falls between the two characters, inclusive.
5. ^ (caret) outside brackets is a metacharacter that matches the beginning of a line (when it is the first character in the regex).
6. $ is a metacharacter that matches the end of a line (when it is the last character in the regex).

Example
^ab – a line starting with ab
ab$ – a line ending with ab
^$ – an empty line
^a[a-z0-9] – a line starting with “a” followed by a lowercase letter from “a” to “z” or a digit from “0” to “9”
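
The example patterns above can be tried out with a few lines of Python, whose built-in re module is one of many available regex engines. This is only a minimal sketch: the helper name matching_lines and the sample lines are made up for illustration.

```python
import re

# Hypothetical sample lines, chosen only to exercise the patterns.
LINES = ["abc", "cab", "", "a1 is here", "A1", "a-b"]

def matching_lines(pattern, lines=LINES):
    """Return the lines in which the pattern matches somewhere."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

print(matching_lines(r"^ab"))         # ['abc']
print(matching_lines(r"ab$"))         # ['cab']
print(matching_lines(r"^$"))          # ['']
print(matching_lines(r"^a[a-z0-9]"))  # ['abc', 'a1 is here']
```

Note that "A1" and "a-b" are rejected by the last pattern: matching is case-sensitive, and "-" lies outside both ranges.
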
7. A backslash (“\”) followed by a special character matches that special character literally.
The special characters are: ., *, [, \ (period, asterisk, left square bracket and backslash)
The special character “.” (period) matches any single character except a newline.
8. ? (question mark) matches the previous item zero or one time
9. * (asterisk) matches the previous item zero or more times
10. + (extended regular expressions only) matches the previous item one or more times
11. {n} matches the previous item exactly n times
12. {n,} matches the previous item n or more times
13. {n,m} matches the previous item at least n times but not more than m times

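In basic regular expressions (the default syntax of tools such as grep and sed), each brace must be preceded by a backslash (“\”), as in \{n,m\}; in extended regular expressions the braces are written without it.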
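
The quantifier and escape rules can be checked in the same spirit. The sketch below assumes Python's re module, whose syntax follows the Perl style, so braces are written there without a backslash. The helper name full_match and the test strings are illustrative assumptions, not part of any standard.

```python
import re

def full_match(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# ? : previous item zero or one time
print(full_match(r"colou?r", "color"), full_match(r"colou?r", "colouur"))  # True False
# * : previous item zero or more times
print(full_match(r"ab*", "a"), full_match(r"ab*", "abbb"))                 # True True
# + : previous item one or more times
print(full_match(r"ab+", "a"), full_match(r"ab+", "ab"))                   # False True
# {n} : exactly n times
print(full_match(r"[0-9]{3}", "123"), full_match(r"[0-9]{3}", "12"))       # True False
# {n,} : n or more times
print(full_match(r"[0-9]{2,}", "1"), full_match(r"[0-9]{2,}", "12345"))    # False True
# {n,m} : at least n, at most m times
print(full_match(r"[0-9]{2,4}", "1234"), full_match(r"[0-9]{2,4}", "12345"))  # True False
# \. matches a literal period; an unescaped . matches any character except newline
print(full_match(r"a\.b", "a.b"), full_match(r"a\.b", "axb"))              # True False
print(full_match(r"a.b", "axb"), full_match(r"a.b", "a\nb"))               # True False
```
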

Regular expressions are used in search engines, in the search-and-replace dialogs of word processors and text editors, and in text processing tools such as sed and AWK. Many programming languages provide regex capabilities, either built in or through libraries.
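
To illustrate the search-and-replace use case, here is a small sketch using re.sub from Python's standard library, roughly comparable in spirit to a substitution in sed. The sample text and the replacement strings are made up for the example.

```python
import re

# Hypothetical log excerpt used only for demonstration.
text = "error 7 on line 12\nno error here\nerror 3 on line 240"

# Replace every run of one or more digits with the letter N.
masked = re.sub(r"[0-9]+", "N", text)
print(masked)

# With re.MULTILINE, ^ matches at the start of each line,
# so only "error" at the beginning of a line is rewritten.
flagged = re.sub(r"^error", "ERROR", text, flags=re.MULTILINE)
print(flagged)
```

In the second call the word "error" in the middle line is left alone because it does not sit at the start of a line.
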

The concept emerged in the 1950s, when the American mathematician Stephen Cole Kleene formalized the description of a regular language. It came into common use with the Unix text processing tools.

Today there are several syntaxes for regular expressions; one is defined by the POSIX (Portable Operating System Interface) standard and another, widely used, is the Perl syntax.
Regex is widely supported in programming languages, text processing programs (particularly lexers), advanced text editors, and other programs. Regex support is part of the standard library of many programming languages, including Java and Python, and is integrated into the syntax of others, including Perl and ECMAScript. Implementations of regex functionality are often called regex engines, and several libraries are available for reuse.
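
As one concrete case of standard-library support, the following sketch uses Python's re module: the engine compiles a pattern once into an object that can then be reused for several searches. The pattern, the constant name NAME_WITH_NUMBER and the sample text are illustrative assumptions.

```python
import re

# Compile once; the compiled pattern can be reused for many searches.
NAME_WITH_NUMBER = re.compile(r"[a-z]+[0-9]+")

# Hypothetical status message used only for demonstration.
text = "nodes web1 and db22 are up, cache is down"

found = NAME_WITH_NUMBER.findall(text)   # all non-overlapping matches
first = NAME_WITH_NUMBER.search(text)    # first match, or None

print(found)                         # ['web1', 'db22']
print(first.group(), first.start())  # web1 6
```
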
