smoe.org mailing lists
Introduction to Patterns
Patterns are used by various commands and configuration settings:
By the archive-sync command, to match archive names.
By the lists and rekey commands, to match list names.
By the set-pattern, unregister-pattern, unsubscribe-pattern, which,
and who commands, to match e-mail addresses.
By the access_rules, advertise, bounce_probe_pattern, bounce_rules,
delivery_rules, noadvertise, and post_limits settings, to match
e-mail addresses.
By the admin_body and taboo_body settings, to match lines in
the body of a posted message.
By the admin_headers and taboo_headers settings, to match lines in
the headers of a posted message.
By the attachment_filters and attachment_rules settings, to match
message content types.
By the quote_pattern setting, to count the lines in the body of a
posted message that are marked as being written by someone else.
By the signature_separator setting, to match the beginning of
an e-mail signature.
There are four supported types of pattern, described below:
Substring Patterns, like "example"
Glob Patterns, like %example%
Regular Expressions, like /example/
Undelimited Patterns, like example
Several examples of regular expressions are illustrated:
Example 1 - a list of special characters
Example 2 - escaping '.' is required
Example 3 - escaping '@' is required
Example 4 - matching the beginning and end of string
Example 5 - matching anything and everything
Example 6 - escaping '*' is required
Example 7 - case sensitivity
Example 8 - overly safe escaping doesn't hurt
Example 9 - matching (or NOT matching) white space
Example 10 - negated or inverted matches
Majordomo is written in the Perl programming language. Perl regular
expressions are a powerful but complicated tool for pattern matching.
To eliminate some of the complexity, three simpler forms of pattern
matching are provided, in addition to full Perl regular expressions.
A pattern is usually enclosed in "delimiters," with optional "modifiers"
outside the delimiters. The delimiters indicate where the pattern begins
and ends, and the modifiers change how matches are found. For example,
in the pattern:
"example.net"i
the delimiters are quotes, and the 'i' is a modifier. The most common
modifier, the letter 'i', makes the matching case-insensitive, meaning
that small and capital letters are considered identical.
The negation modifier, '!', may be used to invert any of the four
kinds of pattern. For example,
!edu
would match any string of characters that does not contain "edu".
The special pattern
ALL
will match everything.
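The anatomy of a pattern can be sketched in a few lines of Python. This is
only an illustration: the helper name split_pattern and its parsing rules are
assumptions made for this example, not Majordomo's own (Perl) code. Only the
three delimiters and the placement of '!' and the modifiers are taken from
the text above.

```python
import re

# Hypothetical parser: optional '!', a delimiter (" % or /), the body,
# the same delimiter again, then zero or more modifier letters.
_DELIMITED = re.compile(
    r'^(?P<neg>!?)(?P<delim>["%/])(?P<body>.*)(?P=delim)(?P<mods>[a-z]*)$'
)

def split_pattern(text):
    """Return (negated, delimiter, body, modifiers) for a pattern string."""
    m = _DELIMITED.match(text)
    if m is None:
        # No recognised delimiters: treat it as an undelimited pattern.
        negated = text.startswith("!")
        return negated, None, (text[1:] if negated else text), ""
    return m.group("neg") == "!", m.group("delim"), m.group("body"), m.group("mods")

for example in ['"example.net"i', "%user@*example.com%i", "!edu"]:
    print(example, "->", split_pattern(example))
```

The delimiter decides which kind of matching applies, and the modifiers adjust
it; the sketch deliberately does not interpret the body.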
Substring Patterns
------------------
Examples: "example.com"
"user@somewhere.example.com"i
The delimiter is a double quote. There are no special characters; the
pattern matches if the pattern occurs anywhere within the text to be
matched. A trailing 'i' specifies that the matching is case-insensitive.
For instance,
"bsc" would match unsubscribe
"bsc" would not match unsuBsCribe
"bsc"i would match unsuBsCribe
Glob Patterns
-------------
Examples: %user@*example.com%i
%u-???@*example.com%i
The delimiter is a percent sign. These patterns are reminiscent of
file-matching patterns from the DOS and Unix command line interfaces.
Special characters include:
? matches any single character
* matches any number (including zero) of any character.
[] are used to define character classes. For instance,
[abc] will match any one of the letters a, b, or c. This
style of grouping has the same effect as in regular expressions.
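Python's standard fnmatch module implements the same three special characters,
so it can be used to experiment with glob patterns. This is a sketch, not
Majordomo's implementation: the helper name glob_match is hypothetical, and it
assumes that a glob must match the whole string, which is how fnmatch behaves.

```python
from fnmatch import fnmatchcase

def glob_match(pattern, text, ignore_case=False):
    """Match text against a glob pattern whose % delimiters are already removed."""
    if ignore_case:
        # Emulate the 'i' modifier by folding both sides to lower case.
        pattern, text = pattern.lower(), text.lower()
    return fnmatchcase(text, pattern)

print(glob_match("user@*example.com", "USER@mail.Example.COM", ignore_case=True))
print(glob_match("u-???@*example.com", "u-abc@example.com"))
print(glob_match("u-???@*example.com", "u-ab@example.com"))
```

The last call fails because ??? requires exactly three characters between
"u-" and the "@".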
Regular Expressions
-------------------
What follows is a basic discussion of Perl regular expressions.
There is one important difference between Majordomo regular expressions
and Perl regular expressions: in Perl version 5 and above, the
'@' character should be "escaped" with a backslash, \@. Majordomo
will compensate if you forget to add the backslash, but for
the sake of correctness you should always include it when you
are trying to match a literal '@' symbol.
Example 1 - a list of special characters
A regular expression is a concise way of expressing a pattern in
a series of characters. The full power of regular expressions can
make some difficult tasks quite easy, but we will only scratch the
surface here.
The character / is used to mark the beginning and end of a regular
expression. Letters and numbers stand for themselves. Many of the
other characters are symbolic. Some commonly used ones are:
! negates what follows, matching when the expression does NOT match
\@ the `@' found in nearly all addresses; it must be preceded
by a backslash to avoid errors.
. (period) any character
* previous character, zero or more times; note especially...
.* any character, zero or more times
+ previous character, one or more times; so for example...
a+ letter "a", one or more times
\ next character stands for itself; so for example...
\. literally a period, not meaning "any character"
^ beginning of the string; so for example...
^a a string beginning with letter "a"
$ end of the string; so for example...
a$ a string ending with letter "a"
Example 2 - escaping '.' is required
/foo\.example\.com/
Notice that the periods are preceded by a backslash so that they are
interpreted as periods, rather than wildcards. This matches any string
containing:
foo.example.com
such as:
foo.example.com
bar.foo.example.com
user@bar.foo.example.com
users%bar.foo.example.com@example.com
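The cost of forgetting the backslashes can be seen with Python's re module,
whose handling of '.' and '\.' agrees with Perl's for this pattern. The test
string "fooXexample-com" is made up for the illustration.

```python
import re

escaped = re.compile(r"foo\.example\.com")    # periods are literal
unescaped = re.compile(r"foo.example.com")    # periods are wildcards

for text in ["user@bar.foo.example.com", "fooXexample-com"]:
    print(text,
          "escaped:", bool(escaped.search(text)),
          "unescaped:", bool(unescaped.search(text)))
```

Only the unescaped form accepts "fooXexample-com", because each bare period
matches any character.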
Example 3 - escaping '@' is required
/johndoe\@.*foo\.example\.com/
The `@' has special meaning to Perl and should be prefixed with a backslash
to avoid errors. The string ".*" means "any character, zero or more
times". So this matches:
johndoe@foo.example.com
johndoe@terminus.foo.example.com
ajohndoe@terminus.foo.example.com
But it doesn't match:
johndoe@example.com
brent@foo.example.com
Example 4 - matching the beginning and end of string
/^johndoe\@.*foo\.example\.org$/
This is similar to Example 3, but for the example.org domain and with
anchors at both ends. It matches:
johndoe@foo.example.org
johndoe@terminus.foo.example.org
But it doesn't match:
ajohndoe@terminus.foo.example.org
...because the ^ symbol requires the string to begin with "johndoe", and
this one begins with the letter "a". Likewise, the $ symbol requires the
string to end with "foo.example.org", so a string with anything after
"example.org" would not match either.
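The effect of ^ and $ can be checked by adding them to the Example 3 pattern
and comparing the two versions. This sketch uses Python's re module, which
treats these anchors the same way as Perl for single-line strings; the
".invalid" address is made up for the illustration.

```python
import re

unanchored = re.compile(r"johndoe\@.*foo\.example\.com")
anchored = re.compile(r"^johndoe\@.*foo\.example\.com$")

samples = [
    "johndoe@terminus.foo.example.com",    # matches both
    "ajohndoe@terminus.foo.example.com",   # extra text before "johndoe"
    "johndoe@foo.example.com.invalid",     # extra text after "example.com"
]
for text in samples:
    print(text,
          "unanchored:", bool(unanchored.search(text)),
          "anchored:", bool(anchored.search(text)))
```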
Example 5 - matching anything and everything
/.*/
This is the regular expression that matches anything
(any character, zero or more times).
Example 6 - escaping '*' is required
/.\*johndoe/
Here the * is preceded by a \, so it refers literally to an asterisk
character and not the symbolic meaning "zero or more times". The '.' still
has its symbolic meaning of "any one character", so it would match:
a*johndoe
s*johndoe
Because the . by itself implies one character, it would not match:
*johndoe
Example 7 - case sensitivity
Normally all matches are case sensitive; you can make any match case
insensitive by appending an `i' to the end of the expression.
/example\.com/i
This would match example.com, EXAMPLE.com, ExAmPlE.cOm, etc. Removing the `i':
/example\.com/
...would match example.com but not EXAMPLE.com or any other capitalization.
Example 8 - overly safe escaping doesn't hurt
To be on the safe side, put a \ in front of any character in a
regular expression that is not a number or a letter. In order to put
a / into the regular expression, the same rule holds: precede it
with a \. Thus, with \ in front of the / and = characters, this:
/\/CO\=US/
...matches /CO=US and may be a useful regular expression to those of you
who need to deal with X.400 addresses that contain / characters.
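A quick check that the extra backslashes are harmless, again using Python's
re module as a stand-in for Perl. The sample string is made up. re.escape is
a Python convenience that adds backslashes wherever they are needed; it is
not a Majordomo feature and is shown only to illustrate the same idea.

```python
import re

careful = re.compile(r"\/CO\=US")   # every punctuation character escaped
minimal = re.compile(r"/CO=US")     # no escaping at all

sample = "/CO=US/rest-of-address"
print(bool(careful.search(sample)), bool(minimal.search(sample)))

# Escaping a whole literal string mechanically:
literal = "a.b*c@example.com"
pattern = re.compile(re.escape(literal))
print(bool(pattern.fullmatch(literal)), bool(pattern.fullmatch("aXb*c@example.com")))
```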
Example 9 - matching (or NOT matching) white space
Normally, all whitespace within a pattern is matched verbatim, but it is
sometimes desirable to add some additional space within a pattern to make
it more readable. For instance, here is a pattern matching some common
quoting characters in email:
/^(-|:|>|[a-z]+>)/i
This can be a bit difficult to follow, so we can space it out a bit:
/^( - | : | > | [a-z]+> )/xi
The 'x' modifier specifies that whitespace is to be ignored, and makes the
pattern a bit easier to read. If you want to match actual whitespace, use
'\s'.
Note that the 'x' modifier provides additional functionality to Perl code
relating to comments, but because Majordomo requires patterns to lie all on
a single line, this is not significant here.
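Python's re.VERBOSE flag plays the role of Perl's 'x' modifier, so the two
spellings of the quoting pattern can be compared directly. The sketch below
also counts quoted lines in a made-up message body, in the spirit of the
quote_pattern setting; how Majordomo itself does the counting is not shown
here.

```python
import re

compact = re.compile(r"^(-|:|>|[a-z]+>)", re.IGNORECASE)
spaced = re.compile(r"^( - | : | > | [a-z]+> )", re.VERBOSE | re.IGNORECASE)

body = [
    "> earlier text",
    "Pat> more earlier text",
    "My reply.",
    ": old style quote",
]

# Both spellings should classify every line identically.
quoted_compact = sum(1 for line in body if compact.search(line))
quoted_spaced = sum(1 for line in body if spaced.search(line))
print(quoted_compact, quoted_spaced)
```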
Example 10 - negated or inverted matches
Negated matches (like !/^sub/) work wherever negation is meaningful, such
as the taboo expression matcher, which has dedicated logic to handle them,
but they do not work everywhere. Majordomo passes every pattern through a
single function that turns it into a regular expression, and the result may
or may not make sense in the context where you want to use it.
For example
who-regexp listname !/xxx\.com/
will produce a list of subscribers to "listname" that are NOT from the
'xxx.com' domain. Be careful to escape the period, which otherwise will
match any character, not just a period.
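The following Python sketch mimics the effect of a negated pattern on a
made-up subscriber list, and shows why the escaped period matters. The helper
name not_matching and the addresses are hypothetical; this is not how the who
command is implemented.

```python
import re

subscribers = [
    "alice@xxx.com",
    "bob@example.org",
    "carol@mail.xxx.com",
    "dave@xxxAcom.example.net",
]

def not_matching(regex, addresses):
    """Keep only the addresses that the regular expression does NOT match."""
    compiled = re.compile(regex)
    return [a for a in addresses if not compiled.search(a)]

print(not_matching(r"xxx\.com", subscribers))  # period escaped
print(not_matching(r"xxx.com", subscribers))   # period left as a wildcard
```

With the unescaped period, "xxxAcom" also counts as a match, so dave's
address is wrongly dropped from the result.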
Undelimited Patterns
--------------------
In the previous sections, all of the patterns were considered to be
enclosed in quotes, slashes, or percent signs. It is legitimate
to use patterns without enclosing them in those delimiters in some
cases. However, the kind of matching done will depend upon where
the pattern is used.
In the archive-sync command, an exact match.
In the lists and rekey commands, an exact, case-insensitive match.
In the which and who commands, a case-insensitive substring match.
In the attachment_filters setting, an exact, case-insensitive match.
In the attachment_rules setting, an exact, case-insensitive match.
In the post_limits setting, a case-insensitive substring match.
In all of the other cases mentioned in the first section, pattern
delimiters are required. Using a pattern without delimiters will
cause an error.
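The context-dependent rules above amount to a lookup table. The Python sketch
below encodes that table; the names UNDELIMITED_RULES and match_undelimited
are hypothetical, and treating the archive-sync match as case-sensitive is an
assumption, since the text only calls it "exact".

```python
# context -> (kind of match, case-insensitive?)
UNDELIMITED_RULES = {
    "archive-sync": ("exact", False),   # case sensitivity assumed
    "lists": ("exact", True),
    "rekey": ("exact", True),
    "which": ("substring", True),
    "who": ("substring", True),
    "attachment_filters": ("exact", True),
    "attachment_rules": ("exact", True),
    "post_limits": ("substring", True),
}

def match_undelimited(context, pattern, text):
    """Apply an undelimited pattern the way the given context would."""
    if context not in UNDELIMITED_RULES:
        raise ValueError(f"{context}: pattern delimiters are required")
    mode, ignore_case = UNDELIMITED_RULES[context]
    if ignore_case:
        pattern, text = pattern.lower(), text.lower()
    return pattern == text if mode == "exact" else pattern in text

print(match_undelimited("who", "EXAMPLE", "user@example.com"))
print(match_undelimited("lists", "tes", "test"))
```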
See Also:
help admin
help archive
help configset_access_rules
help configset_advertise
help configset_admin_body
help configset_admin_headers
help configset_attachment_filters
help configset_attachment_rules
help configset_bounce_probe_pattern
help configset_bounce_rules
help configset_delivery_rules
help configset_noadvertise
help configset_post_limits
help configset_quote_pattern
help configset_signature_separator
help configset_taboo_body
help configset_taboo_headers
help lists
help overview
help rekey
help set
help unregister
help unsubscribe
help which
help who
This is the "patterns" help document for
Majordomo 2, version 0.1200401130.
For a list of all help documents, send the following command:
help topics
in the body of a message to mj2@smoe.org.
For assistance, please contact the smoe.org administrators.