Regular Expressions explained
Sequences
Last, we have sequences, which define sets of characters that can match. Sometimes you don't want to match a word directly but rather something that resembles one. The sequence characters are the square brackets, [ and ].
Any character put inside the sequence brackets is treated as a literal character, even a metacharacter. The only special characters are the -, which denotes a character range, and the ^, which negates the sequence when it is the first character inside the brackets. A sequence is somewhat similar to alternation: in both, only one of the listed items will match. For instance
[a-z]
will match any single lowercase letter of the English alphabet (a to z). Another common sequence is
[a-zA-Z0-9]
which matches any single lowercase or uppercase letter of the English alphabet as well as any digit. Sequences are also combined with quantifiers and assertions to produce more elaborate searches. For instance, a letter sequence combined with a quantifier and word-boundary assertions matches all whole words. This will match
cow
Linus
regular
expression
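
As a rough illustration of which strings these sequences accept and reject, here is a minimal sketch in Python. It assumes Python's re module, which writes the word-boundary assertion as \b; the whole-word pattern and the helper name find_words are assumptions made for this example, not the article's own expression.

```python
import re

# A range inside brackets: exactly one lowercase English letter.
lowercase = re.compile(r"[a-z]")

# Several ranges in one sequence: one letter (either case) or one digit.
alnum = re.compile(r"[a-zA-Z0-9]")

# Assumed whole-word pattern: the + quantifier repeats the sequence,
# and \b asserts a word boundary on each side.
whole_word = re.compile(r"\b[a-zA-Z]+\b")


def find_words(text):
    """Return every run of letters that stands alone as a whole word."""
    return whole_word.findall(text)


print(lowercase.findall("Cow 42"))   # ['o', 'w']
print(alnum.findall("a-Z_9"))        # ['a', 'Z', '9']
print(find_words("cow Linus regular expression"))
# ['cow', 'Linus', 'regular', 'expression']
print(find_words("cow2"))            # []
```

The last call returns nothing because the digit is glued to the letters, so there is no word boundary right after "cow".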
Now, what if you wanted to find anything but words? The expression
[^a-zA-Z0-9]+
would find any sequence of characters that contains neither letters of the English alphabet nor digits.
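
A negated sequence can be tried out the same way. The sketch below again assumes Python's re module; the function name non_word_runs and the sample strings are made up for illustration.

```python
import re

# ^ as the first character inside the brackets negates the sequence;
# + then collects each run of characters that are neither letters nor digits.
non_alnum = re.compile(r"[^a-zA-Z0-9]+")

# Inside brackets, metacharacters such as . and * are plain literals.
literal_meta = re.compile(r"[.*]")


def non_word_runs(text):
    """Return every run of characters that are not letters or digits."""
    return non_alnum.findall(text)


print(non_word_runs("cow, Linus -- 42!"))  # [', ', ' -- ', '!']
print(literal_meta.findall("a.b*c"))       # ['.', '*']
```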
Some implementations of regular expressions allow you to use shorthand versions of commonly used sequences. They are:
\d, a digit [0-9]
\D, a non-digit [^0-9]
\w, a word character (alphanumeric or underscore) [a-zA-Z0-9_]
\W, a non-word character [^a-zA-Z0-9_]
\s, a whitespace character [ \t\n\r\f]
\S, a non-whitespace character [^ \t\n\r\f]
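
To check that a shorthand and its bracketed form behave alike, the two can be compared side by side. This is a sketch for Python's re module only. There, \w also accepts the underscore, and the re.ASCII flag is needed to keep \d, \w and \s from matching non-ASCII characters. The sample text and the helper name compare are assumptions for this example.

```python
import re

text = "Call 555-0199 at 10:30\tor write to_us"


def compare(shorthand, bracketed, sample):
    """Return the matches of a shorthand and of its bracketed equivalent."""
    # re.ASCII limits \d, \w and \s to their ASCII meanings in Python 3.
    short = re.findall(shorthand, sample, flags=re.ASCII)
    long_form = re.findall(bracketed, sample)
    return short, long_form


digits = compare(r"\d+", r"[0-9]+", text)
words = compare(r"\w+", r"[a-zA-Z0-9_]+", text)
spaces = compare(r"\s", r"[ \t\n\r\f]", text)

print(digits[0])  # ['555', '0199', '10', '30']
print(words[0])   # ['Call', '555', '0199', 'at', '10', '30', 'or', 'write', 'to_us']
print(digits[0] == digits[1], words[0] == words[1], spaces[0] == spaces[1])
# True True True
```

Other regular expression implementations may define these shorthands slightly differently, so it is worth running a comparison like this against the one you actually use.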

Comment List

Topic: another great regexp tool
Author: S Church
Time: 01.03.2005 16:16

There's a free-as-in-beer development environment for Windows called HTML-Kit that's just great for writing scripts and web code. The Find and Find / Replace functions have a check box for regexps, with a "Find All" button to highlight every instance matched by a regexp. The only drawback is that it assumes the /is modifiers (case insensitivity and matching across multiple lines).
VisualREGEXP, mentioned in the article, says it has no required supporting files and that the standalone executable is all that's needed. However, most Windows machines don't have the Tcl/Tk component "wish", which the README file claims is necessary for operation. Wish might be available somewhere online as a precompiled binary without having to install all of Tcl/Tk, but I'm not motivated enough to google it at the moment.

Topic: Email match
Author: David Robarts
Time: 15.01.2005 22:45

Some valid email addresses will fail this expression (and some invalid addresses pass).
[a-z0-9_-]+(\.[a-z0-9_-]+)*@[a-z0-9_-]+(\.[a-z0-9_-]+)+
The underscore character is not allowed in the domain part of the email address and some additional characters are allowed in the username part.
This might be better:
[a-z0-9_-]+(\.[a-z0-9_+-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)+

Topic: can't see the graphic
Author: x x
Time: 02.11.2001 01:59

I can't see the graphic towards the bottom that demonstrates the usage of < >.