Regular Expressions Syntax

Applies to TestComplete 15.42, last modified on September 08, 2022

About

A regular expression is a string that describes a search pattern. It defines one or several substrings to find in a text fragment. Typically, you use regular expressions to search, replace, and validate data in text. They are similar to wildcards, but they let you specify more precise and flexible search patterns.

TestComplete supports native regular expressions (provided by appropriate scripting engines) and non-native regular expressions. This topic describes the non-native regular expressions. You use them to specify the following:

Note: For information on native regular expressions, see Using Regular Expressions in Scripts.

Color coding

In this topic, the following color convention is used:

  • Regular expression patterns are highlighted in purple.

  • Literal text to which a regular expression is applied is highlighted in olive.

  • Literal text that matches a regular expression is highlighted in teal.

Syntax reference

The regular expression syntax adopted in TestComplete may differ from native regular expression syntax provided by scripting engines.

Tokens

Token Description
^ Matches the beginning of a line. For instance, the ^a search pattern lets you find all lines that start with a.
$ Matches the end of a line. For instance, the a$ search pattern lets you find all lines that end with a.
. Matches any single character. For example, a.c matches abc, adc, aec but not aaac or abbc.
* Indicates that the preceding character or group matches 0 or more times. For instance, the abc*d pattern matches abd, abcd, abccd, but not a or abcx. The .* pattern matches a string of any length (including the empty string) that does not contain the newline symbol.

Note that this token is greedy, that is, it matches the longest possible string. For example, when searching on the string abbbb, ab* matches abbbb rather than ab.

+ Indicates that the preceding character or group matches 1 or more times. For instance, the ab+d pattern matches abbd or abbbd, but does not match abcd or abbcd.

Note that this token is greedy, that is, it matches the longest possible string. For example, when searching on the string abbbb, ab+ matches abbbb rather than ab or abb.

? Indicates that the preceding character or group is optional, that is, it either matches once or does not match at all. For example, abc?d will find abd and abcd, but not abccd.

Note that this token is greedy, that is, it matches the longest possible string. For example, when searching on the string abc, ab? matches ab rather than a.

??, +?, *? These tokens are non-greedy versions of ?, + and *. That is, they match the shortest possible string. For example, when searching on the string abbbb, ab*? matches a rather than ab or abbbb.
[ ] Matches any single character specified in brackets. For instance, d[ab]e matches dae or dbe and does not match dwe or dxe. To include the ] character in the set, make it either the first or the last character in the set, or use \]. For example, []abc], [abc]] or [ab\]cd].
[^ ] Matches any single character except for those in brackets. For instance, d[^bc]e matches dae or dxe, but not dbe or dce. d[^b-d]e matches dae, but not dbe, dce or dde.
[a-b] Matches any single character from a to b, inclusive. For instance, d[k-x]e matches dke, dme and dxe, but not dze. To include the - character in the set, make it either the first or the last character in the set, or use \-. For example, [-ab], [abc-] or [a\-z].
[^a-b] Matches any single character not in the range a through b. For instance a[^k-x]z matches abz, aiz and ayz, but not akz.
a|b Matches either the a or b character or a group. For example, A|abc matches Abc and abc, but not A. The htm|(ml) pattern matches htm and html, but not htl or ml.
a!b Matches a not followed by b. For example, colo!ur matches color, but not colour.
( ) Groups characters. For example, (ab)+ matches ab and abab but not acb.

To specify a round bracket that should be treated literally, precede it with a backslash: \( or \).

{ } Indicates a match group. Use braces to capture the text that matches the enclosed expression so that you can refer to it later in the same pattern. For example, the [0-9]+-[0-9]+ expression matches 125-125, but not 125-abcd. It also matches 125-168, because the two numeric parts are independent of each other. To find only strings in which the same number appears on both sides of the hyphen, make the first part a match group and refer to the value it matched by the zero-based index of the group, written after a backslash: {[0-9]+}-\0. Here, TestComplete replaces \0 with the string matched by the first match group, so the expression matches 168-168, but not 125-168.

To specify a brace that should be treated literally, precede it with a backslash: \{ or \}.

\ A backslash indicates that the next character token (such as ?, !, *, - and others) should be treated literally. For example, \$10 matches the string $10. To match a backslash itself, use the double backslash pattern (\\).

When followed by a number (for example, \0), indicates the match group at the specified index (from 0).

In JavaScript, JScript, Python, C++Script and C#Script, you should use double backslashes to interpret the subsequent special character literally: "\\?", "\\.", "\\2" and so on. To match a backslash character, use the "\\\\" pattern.
\a Matches any alphanumeric character. This token is equivalent to [A-Za-z0-9].
\b Matches a whitespace character.
\c Matches any alphabetic character. This token is equivalent to [A-Za-z].
\d Matches any decimal digit. This token is equivalent to [0-9].
\h Matches any hexadecimal digit. This token is equivalent to [0-9A-Fa-f].
\n Matches a new line character.
\q Matches a quoted string. This token is equivalent to ("[^"]*")|('[^']*').
\w Matches a word. This token is equivalent to [a-zA-Z]+ or \c+.
\z Matches an integer number. This token is equivalent to [0-9]+ or \d+.
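
The token behavior described in the table can be tried outside TestComplete. The following is a minimal sketch in Python's re module, which is a native engine with its own syntax and not TestComplete's non-native one. Greedy and non-greedy quantifiers behave the same way there, but the brace match group with \0 has to be written with parentheses and \1, and shorthand tokens such as \a or \h have to be expanded by hand. The names is_repeated_number and SHORTHAND are illustrative only.

```python
import re

# Greedy quantifiers take the longest match; a trailing ? makes them non-greedy.
greedy_star = re.match(r"ab*", "abbbb").group()
lazy_star = re.match(r"ab*?", "abbbb").group()
greedy_optional = re.match(r"ab?", "abc").group()

# TestComplete's {[0-9]+}-\0 corresponds to ([0-9]+)-\1 in Python's re:
# Python groups use parentheses and are numbered from 1, not from 0.
SAME_NUMBER_TWICE = re.compile(r"([0-9]+)-\1")


def is_repeated_number(text):
    """Return True if text is one digit run repeated on both sides of a hyphen."""
    return SAME_NUMBER_TWICE.fullmatch(text) is not None


# Expanded forms of some TestComplete shorthand tokens, copied from the table.
# Python's own \w and \b mean something different, so they are not reused here.
SHORTHAND = {
    r"\a": r"[A-Za-z0-9]",
    r"\c": r"[A-Za-z]",
    r"\d": r"[0-9]",
    r"\h": r"[0-9A-Fa-f]",
    r"\w": r"[a-zA-Z]+",
    r"\z": r"[0-9]+",
}

print(greedy_star, lazy_star, greedy_optional)  # abbbb a ab
print(is_repeated_number("168-168"), is_repeated_number("125-168"))  # True False
```

Keeping the expansions in a lookup table makes it easier to compare a non-native pattern with its native counterpart token by token.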

Sub-expressions

You can divide an expression into constituent parts, or sub-expressions. To specify a sub-expression, use parentheses, for example, (\s\d+,\d+,d+,)(\d+). The parsing engine detects two sub-expressions in this expression:

  • \s\d+,\d+,d+,
  • \d+

The engine assigns an index to the whole expression and to each sub-expression. The index of the expression is 0, the index of the first sub-expression is 1 and so on. For example:

  • 0: (\s\d+,\d+,d+,)(\d+)
  • 1: \s\d+,\d+,d+,
  • 2: \d+

A text fragment that matches a sub-expression is called a submatch. To address a submatch in dialogs, you can use the syntax ${nn} (where nn stands for the index of the desired sub-expression). This way, you can get parts of the regular expression.

In the Replace dialog, you can use sub-expressions to replace only a part of the sought-for expression. For an example, see the dialog description.
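
The indexing of sub-expressions and the idea of replacing only part of a match can be sketched with Python's re module. This is an analogy, not the engine used by the TestComplete dialogs: Python writes a submatch reference as \g<1> in the replacement string where the dialogs use ${1}. The pattern is a variant of the one above that uses \d+ for every number, and the sample text is made up for illustration.

```python
import re

# Two sub-expressions: the leading three numbers, then the last number.
PATTERN = re.compile(r"(\s\d+,\d+,\d+,)(\d+)")

text = " 10,20,30,40"  # hypothetical sample text
match = PATTERN.search(text)

whole = match.group(0)   # index 0: the whole expression
first = match.group(1)   # index 1: the first sub-expression
second = match.group(2)  # index 2: the second sub-expression

# Keep submatch 1 unchanged and replace only the part matched by submatch 2.
# \g<1> is used instead of \1 so that the digits that follow are not read
# as part of the group number.
replaced = PATTERN.sub(r"\g<1>99", text)

print(repr(whole), repr(first), repr(second))
print(repr(replaced))
```

Here the replacement keeps the first three numbers and changes only the last one, which is the kind of partial replacement that sub-expressions make possible.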

Modifiers

Mode modifiers specify how the engine interprets regular expressions. They toggle the engine’s behavior modes. The following modifiers are available:

Modifier Key Default State Description
i Enabled Makes the pattern match case-insensitive.
m Disabled Treats a string as multiple lines. In this mode, the caret ^ and dollar $ match before and after newlines in the subject string.
s Enabled Treats a string as a single line. In this mode, the dot matches the newline symbol.
g Enabled Controls greedy mode. Non-standard modifier.

A greedy repetition operator takes as many matching characters as possible; a non-greedy one takes as few as possible. For example, b+ and b* applied to the string abbbbc return bbbb, whereas b+? returns b and b*? returns an empty string.

Switching to non-greedy mode makes + work as +?, * as *? and so on.

x Disabled Permits whitespaces and comments in the pattern. Non-standard modifier.

In this mode, whitespace characters (\s) that are neither escaped with a backslash nor inside a character class are ignored. You can use this to break up your regular expression into more readable parts. The # character is also treated as a metacharacter that introduces a comment. For example:

  ( # This pattern matches
    (this) # the occurrence of 'this'
      | # or
    (that) # the occurrence of 'that'
  )

If you want to place a whitespace or # character in the pattern, prefix it with a backslash (\) or encode it using hexadecimal notation (\xNN).
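
A comparable extended mode exists in Python's re module as the re.VERBOSE flag. The sketch below uses it to show the same layout and the two ways of keeping a literal space or # in the pattern. It illustrates the concept only and is not TestComplete's engine; the "issue #42" sample text is hypothetical.

```python
import re

# Unescaped whitespace and everything after # are ignored in verbose mode.
THIS_OR_THAT = re.compile(r"""
  (            # This pattern matches
    (this)     # the occurrence of 'this'
      |        # or
    (that)     # the occurrence of 'that'
  )
""", re.VERBOSE)

# A literal space and a literal # must be escaped with a backslash ...
ESCAPED = re.compile(r"issue\ \#\d+", re.VERBOSE)

# ... or written in hex notation (\x20 is a space, \x23 is #).
HEX_ENCODED = re.compile(r"issue\x20\x23\d+", re.VERBOSE)

# Without escaping, the space is dropped and '#\d+' becomes a comment.
UNESCAPED = re.compile(r"issue #\d+", re.VERBOSE)

print(THIS_OR_THAT.search("take that").group())          # that
print(ESCAPED.fullmatch("issue #42") is not None)        # True
print(HEX_ENCODED.fullmatch("issue #42") is not None)    # True
print(UNESCAPED.fullmatch("issue #42") is not None)      # False
```

The last pattern shows why escaping matters: in extended mode an unescaped # silently turns the rest of the line into a comment.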

You can specify a modifier in the expression using the (?key) or (?-key) syntax, where key stands for the modifier key and the minus sign disables the corresponding modifier. If you specify an unsupported modifier key, an error occurs. Modifiers can be applied to the whole expression or only to a sub-expression. For example:

(?i)Las Vegas matches Las vegas and Las Vegas
(?i)Las (?-i)Vegas matches Las Vegas but not Las vegas
(?i)(Las )?Vegas matches Las vegas and las vegas
((?i)Las )?Vegas matches las Vegas, but not las vegas
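
The scoping of modifiers shown above can be reproduced with Python's re module, with some differences that are worth stating as assumptions of this sketch: Python is case-sensitive by default, a whole-pattern (?i) must appear at the start of the pattern, and a modifier is limited to part of a pattern with the scoped forms (?i:...) and (?-i:...), available in Python 3.6 and later. The helper name matches is illustrative.

```python
import re


def matches(pattern, text):
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# (?i) at the start makes the whole pattern case-insensitive.
whole_pattern = matches(r"(?i)Las Vegas", "las vegas")

# (?-i:...) switches case-insensitivity off for one part only.
second_word_exact = matches(r"(?i)Las (?-i:Vegas)", "las Vegas")
second_word_wrong_case = matches(r"(?i)Las (?-i:Vegas)", "Las vegas")

# (?i:...) switches case-insensitivity on for one part only.
first_word_any_case = matches(r"(?i:Las )?Vegas", "las Vegas")
rest_still_exact = matches(r"(?i:Las )?Vegas", "las vegas")

print(whole_pattern)                                # True
print(second_word_exact, second_word_wrong_case)    # True False
print(first_word_any_case, rest_still_exact)        # True False
```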

Example

The examples below demonstrate how you can use regular expressions to find a specific control or its child and to perform a click on the specified item.

JavaScript, JScript

function main()
{
  var notepad, window, dialog, control;
  WshShell.Run("notepad.exe", SW_NORMAL);
  notepad = Sys.Process("notepad");
  
  // Get Notepad's main window by using a regular expression
  window = notepad.Find("WndCaption", "regexp:.* Notepad", 5);
  
  window.MainMenu.Click("Format|Font...");
  dialog = notepad.Find("WndCaption", "Font");
  
  // Specify words that must be in the specified property
  // of the desired object by using a regular expression
  control = dialog.FindChild("wItemList", "regexp:.*Western.*Arabic.*", 5);
  
  // Specify the item to be clicked using a regular expression
  control.ClickItem("regexp:.*European$");
}

Python

def main():
  WshShell.Run("notepad.exe", SW_NORMAL)
  notepad = Sys.Process("notepad")

  # Get Notepad's main window by using a regular expression
  window = notepad.Find("WndCaption", "regexp:.* Notepad")

  window.MainMenu.Click("Format|Font...")
  dialog = notepad.Find("WndCaption", "Font")

  # Specify words that must be in the specified property
  # of the desired object by using a regular expression
  control = dialog.FindChild("wItemList", "regexp:.*Western.*Arabic.*", 2)

  # Specify the item to be clicked using a regular expression
  control.ClickItem("regexp:.*European$")
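
The patterns used in this example can be sanity-checked without running TestComplete. The following is a rough sketch under stated assumptions: it uses Python's native re module rather than TestComplete's engine, it assumes that a pattern may match anywhere in the value, and it compares search values without the regexp: prefix literally (wildcards are not handled). The function name value_matches and the sample captions are hypothetical.

```python
import re

PREFIX = "regexp:"


def value_matches(search_value, actual):
    """Check a property value against a search value as used in the example.

    Assumption: a 'regexp:' prefix marks a pattern, and the pattern may match
    anywhere in the value. Other search values are compared literally here.
    """
    if search_value.startswith(PREFIX):
        return re.search(search_value[len(PREFIX):], actual) is not None
    return search_value == actual


# Hypothetical captions and list items, made up for this check.
print(value_matches("regexp:.* Notepad", "Untitled - Notepad"))  # True
print(value_matches("Font", "Font"))                             # True
print(value_matches("regexp:.*European$", "Central European"))   # True
print(value_matches("regexp:.*European$", "Western"))            # False
```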

VBScript

Sub main()
 Dim notepad, window, dialog, control
 Call WshShell.Run("notepad.exe", SW_NORMAL)
 Set notepad = Sys.Process("notepad")
   
 ' Get Notepad's main window by using a regular expression
 Set window = notepad.Find("WndCaption", "regexp:.* Notepad", 5)
   
 window.MainMenu.Click("Format|Font...")
 Set dialog = notepad.Find("WndCaption", "Font")
   
 ' Specify words that must be in the specified property
 ' of the desired object by using a regular expression
 Set control = dialog.FindChild("wItemList", "regexp:.*Western.*Arabic.*", 5)
   
 ' Specify the item to be clicked using a regular expression
 control.ClickItem("regexp:.*European$")
End Sub

DelphiScript

procedure main();
var notepad, window, dialog, control;
begin
  WshShell.Run('notepad.exe', SW_NORMAL);
  notepad := Sys.Process('notepad');
   
  // Get Notepad's main window by using a regular expression
  window := notepad.Find('WndCaption', 'regexp:.* Notepad', 5);
   
  window.MainMenu.Click('Format|Font...');
  dialog := notepad.Find('WndCaption', 'Font');
   
  // Specify words that must be in the specified property
  // of the desired object by using a regular expression
  control := dialog.FindChild('wItemList', 'regexp:.*Western.*Arabic.*', 5);
   
  // Specify the item to be clicked using a regular expression
  control.ClickItem('regexp:.*European$');
end;

C++Script, C#Script

function main()
{
  var notepad, window, dialog, control;
  WshShell["Run"]("notepad.exe", SW_NORMAL);
  notepad = Sys["Process"]("notepad");
   
  // Get Notepad's main window by using a regular expression
  window = notepad["Find"]("WndCaption", "regexp:.* Notepad", 5);
   
  window["MainMenu"]["Click"]("Format|Font...");
  dialog = notepad["Find"]("WndCaption", "Font");
   
  // Specify words that must be in the specified property
  // of the desired object by using a regular expression
  control = dialog["FindChild"]("wItemList", "regexp:.*Western.*Arabic.*", 5);
   
  // Specify the item to be clicked using a regular expression
  control["ClickItem"]("regexp:.*European$");
}

See Also

Using Regular Expressions in Scripts
