SonicOS 7 Match Objects

Technical Documentation > SonicOS 7 Match Objects > Match Objects > About Match Objects > About Regular Expressions

About Regular Expressions

You can configure regular expressions in certain types of match objects for use in App Rules policies. The Match Object Settings options provide a way to configure custom regular expressions or to select from predefined regular expressions. The SonicWall implementation supports reassembly-free regular expression matching on network traffic. This means that no buffering of the input stream is required, and patterns are matched across packet boundaries.

SonicOS provides the following predefined regular expressions:

VISA CC	VISA Credit Card Number
US SSN	United States Social Security Number
CANADIAN SIN	Canadian Social Insurance Number
ABA ROUTING NUMBER	American Bankers Association Routing Number
AMEX CC	American Express Credit Card Number
MASTERCARD CC	Mastercard Credit Card Number
DISCOVER CC	Discover Credit Card Number

Policies using regular expressions match the first occurrence of the pattern in network traffic. This enables actions on matches as soon as possible. Because matching is performed on network traffic and not only on human-readable text, the matchable alphabet includes the entire ASCII character set — all 256 characters.

Popular regular expression primitives such as ‘.’, (the any character wildcard), ‘*’, ‘?’, ‘+’, repetition count, alternation, and negation are supported. Though the syntax and semantics are similar to popular regular expression implementations such as Perl, vim, and others, there are some minor differences. For example, beginning (^) and end of line ($) operators are not supported. Also, ‘\z’ refers to the set of non-zero digits, [1-9], not to the end of the string as in PERL. For syntax information, see Regular Expression Syntax.

One notable difference with the Perl regular expression engine is the lack of back-reference and substitution support. These features are actually extraneous to regular expressions and cannot be accomplished in linear time with respect to the data being examined. Hence, to maintain peak performance, they are not supported. Substitution or translation functionality is not supported because network traffic is only inspected, not modified.

Predefined regular expressions for frequently used patterns such as U.S. social security numbers and VISA credit card numbers can be selected while creating the match object. Users can also write their own expressions in the same match object. Such user provided expressions are parsed, and any that do not parse correctly will cause a syntax error to display at the bottom of the Match Object Settings window. After successful parsing, the regular expression is passed to a compiler to create the data structures necessary for scanning network traffic in real time.

Regular expressions are matched efficiently by building a data structure called Deterministic Finite Automaton(DFA). The DFA’s size is dictated by the regular expression provided by the user and is constrained by the memory capacities of the device. A lengthy compilation process for a complex regular expression can consume extensive amounts of memory on the appliance. It may also take up to two minutes to build the DFA, depending on the expressions involved.

To prevent abuse and denial-of-service attacks, along with excessive impact to appliance management responsiveness, the compiler can abort the process and reject regular expressions that cause this data structure to grow too big for the device. An “abuse encountered” error message is displayed at the bottom of the window.

During a lengthy compilation, the appliance management session may become temporarily unresponsive, while network traffic continues to pass through the appliance.

Building the DFA for expressions containing large counters consumes more time and memory. Such expressions are more likely to be rejected than those that use indefinite counters such as the ‘*’ and ‘+’ operators.

Also, at risk of rejection are expressions containing a large number of characters rather than a character range or class. That is, the expression ‘(a|b|c|d|. . .|z)’ to specify the set of all lower-case letters is more likely to be rejected than the equivalent character class ‘\l’. When a range such as ‘[a-z]’ is used, it is converted internally to ‘\l’. However, a range such as

‘[d-y]’ or ‘[0-Z]’ cannot be converted to any character class, is long, and may cause the rejection of the expression containing this fragment.

Whenever an expression is rejected, the user may rewrite it in a more efficient manner to avoid rejection using some of the above tips. For syntax information, see Regular Expression Syntax. For an example discussing how to write a custom regular expression, see Creating a Regular Expression in a Match Object section in Policy > App Rules.

< Previous Section Next Section >

Was This Article Helpful?

Help us to improve our support portal

Yes! Not Really

SonicOS 7 Match Objects

About Regular Expressions

Was This Article Helpful?

Techdocs Article Helpful form

Techdocs Article NOT Helpful form

Follow Us

Subscribe to newsletter

Stay In Touch