SonicOS 7 Match Objects

About Regular Expressions

You can configure regular expressions in certain types of match objects for use in App Rules policies. The Match Object Settings options provide a way to configure custom regular expressions or to select from predefined regular expressions. The SonicWall implementation supports reassembly-free regular expression matching on network traffic. This means that no buffering of the input stream is required, and patterns are matched across packet boundaries.
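As a rough illustration of reassembly-free matching, the sketch below carries only a small set of partial-match positions from one packet to the next, so no packet contents are buffered. It is not the SonicOS implementation: the `StreamMatcher` class, the `ddd-dd-dddd` pattern, and the packet contents are all hypothetical, and the sketch simulates a simple position-based automaton rather than a compiled DFA.

```python
import string

DIGITS = frozenset(string.digits.encode())   # byte values for '0'..'9'
DASH = frozenset(b"-")

# Hypothetical pattern ddd-dd-dddd: one set of allowed byte values per position.
PATTERN = [DIGITS] * 3 + [DASH] + [DIGITS] * 2 + [DASH] + [DIGITS] * 4


class StreamMatcher:
    """Keeps only the set of partial-match positions between packets."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.active = set()   # pattern positions reached by partial matches
        self.offset = 0       # total bytes consumed across all packets

    def feed(self, packet: bytes):
        """Consume one packet; return the stream offset just past the
        first match, or None if no match has completed yet."""
        for byte in packet:
            self.offset += 1
            next_active = set()
            # Position 0 is always a candidate, so a match may start anywhere.
            for pos in self.active | {0}:
                if byte in self.pattern[pos]:
                    if pos + 1 == len(self.pattern):
                        return self.offset
                    next_active.add(pos + 1)
            self.active = next_active
        return None


matcher = StreamMatcher(PATTERN)
first_result = matcher.feed(b"id=123-4")    # pattern starts in this packet
second_result = matcher.feed(b"5-6789;")    # and completes in the next one
```

Here `first_result` is `None` and `second_result` is `14`, the offset in the combined stream just past the match, even though neither packet contains the whole pattern on its own.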

SonicOS provides the following predefined regular expressions:

VISA CC: VISA Credit Card Number
US SSN: United States Social Security Number
CANADIAN SIN: Canadian Social Insurance Number
ABA ROUTING NUMBER: American Bankers Association Routing Number
AMEX CC: American Express Credit Card Number
MASTERCARD CC: Mastercard Credit Card Number
DISCOVER CC: Discover Credit Card Number

Policies using regular expressions match the first occurrence of the pattern in network traffic. This enables actions on matches as soon as possible. Because matching is performed on network traffic and not only on human-readable text, the matchable alphabet includes the entire extended ASCII character set, that is, all 256 possible byte values.
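To make the two points above concrete, the following sketch uses Python's `re` module on a `bytes` payload as a stand-in for a byte-oriented matcher. Python's engine is not the SonicOS engine and its syntax differs, so treat this only as an illustration of matching non-printable byte values and of stopping at the first occurrence. The payload and pattern are made up for the example.

```python
import re

# Hypothetical payload mixing printable text with non-printable bytes.
payload = b"GET /\x00\x01\xfe\xff then \x00\x01\x80\x90"

# Two fixed control bytes followed by two bytes from the upper half
# of the 0-255 range; none of these are printable ASCII.
pattern = re.compile(rb"\x00\x01[\x80-\xff]{2}")

first = pattern.search(payload)   # search() returns the first occurrence only
first_span = first.span()
total_occurrences = len(pattern.findall(payload))
```

The payload contains two occurrences, but `search()` reports only the first one, at byte offsets 5 through 8 (`first_span` is `(5, 9)`).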

Popular regular expression primitives such as ‘.’ (the any-character wildcard), ‘*’, ‘?’, ‘+’, repetition count, alternation, and negation are supported. Though the syntax and semantics are similar to popular regular expression implementations such as Perl, vim, and others, there are some minor differences. For example, the beginning-of-line (^) and end-of-line ($) operators are not supported. Also, ‘\z’ refers to the set of non-zero digits, [1-9], not to the end of the string as in Perl. For syntax information, see Regular Expression Syntax.
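When prototyping an expression offline, it can help to translate the SonicOS-specific shorthand into a more widely available syntax first. The sketch below is a hypothetical helper, not a SonicWall tool: it rewrites only the two shorthand classes named in this article (‘\z’ for [1-9] and ‘\l’ for [a-z]) into Python `re` syntax and copies everything else unchanged. It assumes, without confirmation from the syntax reference, that these shorthands may also appear inside square brackets and that the repetition count is written as `{n}`.

```python
import re

# Only the two shorthand classes named in this article are handled.
SONICOS_CLASSES = {"z": "1-9", "l": "a-z"}


def to_python_regex(pattern: str) -> str:
    """Rewrite SonicOS-style \\z and \\l into Python character classes."""
    out = []
    in_class = False   # True while scanning inside [...]
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in SONICOS_CLASSES:
                body = SONICOS_CLASSES[nxt]
                # Inside brackets, emit the bare range to avoid nesting.
                out.append(body if in_class else "[" + body + "]")
            else:
                out.append(ch + nxt)   # keep any other escape untouched
            i += 2
            continue
        if ch == "[":
            in_class = True
        elif ch == "]":
            in_class = False
        out.append(ch)
        i += 1
    return "".join(out)


translated = to_python_regex(r"\z[0-9]{3}")   # four digits, no leading zero
accepts = re.fullmatch(translated, "1234") is not None
rejects = re.fullmatch(translated, "0234") is None
```

With this input, `translated` is `[1-9][0-9]{3}`, which accepts `1234` and rejects `0234`. Because ‘^’ and ‘$’ are not supported on the appliance, a prototype should avoid them as well.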

One notable difference from the Perl regular expression engine is the lack of back-reference and substitution support. These features are extraneous to regular expressions in the strict sense and cannot be accomplished in linear time with respect to the data being examined. Hence, to maintain peak performance, they are not supported. In addition, substitution or translation functionality is not needed because network traffic is only inspected, not modified.

Predefined regular expressions for frequently used patterns such as U.S. social security numbers and VISA credit card numbers can be selected while creating the match object. Users can also write their own expressions in the same match object. Such user-provided expressions are parsed, and any that do not parse correctly cause a syntax error to display at the bottom of the Match Object Settings window. After successful parsing, the regular expression is passed to a compiler to create the data structures necessary for scanning network traffic in real time.

Regular expressions are matched efficiently by building a data structure called a Deterministic Finite Automaton (DFA). The DFA’s size is dictated by the regular expression provided by the user and is constrained by the memory capacity of the device. A lengthy compilation process for a complex regular expression can consume extensive amounts of memory on the appliance. It may also take up to two minutes to build the DFA, depending on the expressions involved.

To prevent abuse and denial-of-service attacks, along with excessive impact to appliance management responsiveness, the compiler can abort the process and reject regular expressions that cause this data structure to grow too big for the device. An “abuse encountered” error message is displayed at the bottom of the window.

During a lengthy compilation, the appliance management session may become temporarily unresponsive, while network traffic continues to pass through the appliance.

Building the DFA for expressions containing large counters consumes more time and memory. Such expressions are more likely to be rejected than those that use indefinite counters such as the ‘*’ and ‘+’ operators.
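The effect of counters on DFA size can be seen with a textbook subset construction. The sketch below is only an illustration under simplifying assumptions, not the SonicOS compiler: it uses a two-symbol alphabet, counts unminimized DFA states, and models the rejection of oversized automata with a hypothetical `max_states` budget and a hypothetical `StateBudgetExceeded` exception.

```python
from collections import deque


class StateBudgetExceeded(Exception):
    """Hypothetical stand-in for a compiler rejecting an oversized DFA."""


def count_dfa_states(nfa, start, alphabet, max_states=None):
    """Subset construction: count the DFA states reachable from {start}."""
    first = frozenset([start])
    seen = {first}
    queue = deque([first])
    while queue:
        subset = queue.popleft()
        for symbol in alphabet:
            target = frozenset(
                nxt for state in subset for nxt in nfa.get((state, symbol), ())
            )
            if target not in seen:
                if max_states is not None and len(seen) >= max_states:
                    raise StateBudgetExceeded("more than %d states" % max_states)
                seen.add(target)
                queue.append(target)
    return len(seen)


def counted_nfa(n):
    """Unanchored search for 'a' followed by exactly n symbols from {a, b}."""
    nfa = {(0, "a"): {0, 1}, (0, "b"): {0}}   # state 0 loops: match anywhere
    for i in range(1, n + 1):
        nfa[(i, "a")] = {i + 1}
        nfa[(i, "b")] = {i + 1}
    return nfa


def plus_nfa():
    """Unanchored search for 'a' followed by one or more symbols from {a, b}."""
    return {
        (0, "a"): {0, 1}, (0, "b"): {0},
        (1, "a"): {2}, (1, "b"): {2},
        (2, "a"): {2}, (2, "b"): {2},   # the '+' is a single self-loop
    }


small_counter = count_dfa_states(counted_nfa(2), 0, "ab")
large_counter = count_dfa_states(counted_nfa(10), 0, "ab")
indefinite = count_dfa_states(plus_nfa(), 0, "ab")
```

In this model a counter of 2 yields 8 DFA states and a counter of 10 yields 2048, because the automaton must remember which of the last n+1 symbols were ‘a’. The ‘+’ version needs only 4 states. Calling `count_dfa_states(counted_nfa(10), 0, "ab", max_states=1000)` raises `StateBudgetExceeded`, which mirrors how a compiler with a memory budget might reject the larger counter while accepting the indefinite one.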

Also at risk of rejection are expressions that enumerate a large number of individual characters rather than using a character range or class. That is, the expression ‘(a|b|c|d|. . .|z)’ to specify the set of all lower-case letters is more likely to be rejected than the equivalent character class ‘\l’. When a range such as ‘[a-z]’ is used, it is converted internally to ‘\l’. However, a range such as ‘[d-y]’ or ‘[0-Z]’ cannot be converted to any character class, is long, and may cause the rejection of the expression containing this fragment.

Whenever an expression is rejected, the user may rewrite it in a more efficient manner to avoid rejection, using some of the tips above. For syntax information, see Regular Expression Syntax. For an example of how to write a custom regular expression, see the Creating a Regular Expression in a Match Object section in Policy > App Rules.
