Using Optical Character Recognition - Overview

Applies to TestComplete 15.69, last modified on November 13, 2024
In version 12.60, this feature was upgraded to a new plugin powered by Google Cloud Vision API. To learn more, see Optical Character Recognition.
The deprecated OCR plugin was removed from TestComplete in version 12.60. If you need to use the deprecated plugin with that version of TestComplete, please contact our Customer Care team. The plugin was restored in TestComplete version 14.0. To use it with version 14.0 or later, you need to install and enable the plugin manually.

This topic contains information about the TestComplete optical character recognition feature.

Basic Concepts

Optical Character Recognition (OCR) is the process of translating images of typewritten text into computer-readable text. TestComplete can capture an image of an application screen and use OCR to “read” the text in it and convert it to usable ASCII or Unicode text. This text can be used to create solid, reliable tests. Another use of character recognition in testing is searching for specific text within the captured image of an application screen.

The TestComplete OCR feature can be scripted using the OCR object. This object is available if the OCR plugin is installed.

The OCR object contains only one method: CreateObject. It accepts a captured screen image that contains the text to be recognized and returns the OCRObject object that is used to perform the text recognition.

To start the character recognition process, call either the OCRObject.GetText method or the OCRObject.FindRectByText method. GetText returns all OCR-readable text from the image. FindRectByText takes a string parameter and tries to locate that text in the image. If it finds the text, it returns True, and the FoundLeft, FoundTop, FoundHeight, FoundWidth, FoundX and FoundY properties of the OCRObject contain the parameters of the region where the text was found. To both methods, you can pass an OCROptions object that stores the recognition settings you have configured.

Below is a simple, one-line TestComplete script that grabs an image of the active window and posts all readable text found in it to the test log.

JavaScript, JScript

Log.Message(OCR.CreateObject(Sys.Desktop.ActiveWindow()).GetText());

Python

Log.Message(OCR.CreateObject(Sys.Desktop.ActiveWindow()).GetText())

VBScript

Log.Message OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText

DelphiScript

Log.Message(OCR.CreateObject(Sys.Desktop.ActiveWindow).GetText);

C++Script, C#Script

Log["Message"](OCR["CreateObject"](Sys["Desktop"]["ActiveWindow"]())["GetText"]());

You can find more sample scripts in the OCRObject.GetText and OCRObject.FindRectByText topics.
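
The FindRectByText flow described above can be sketched in Python as follows. This is only an illustration: the helper name find_text_region and the dictionary it returns are hypothetical, and the code relies solely on the FindRectByText method and the Found* properties named in this topic. The helper takes an already created OCRObject as a parameter, so in a TestComplete script you would pass it the result of OCR.CreateObject.

```python
def find_text_region(ocr_obj, text, options=None):
    """Search for `text` by calling OCRObject.FindRectByText.

    Returns a dict with the Found* properties if the text is located,
    or None otherwise. `find_text_region` is a hypothetical helper,
    not part of the TestComplete API.
    """
    # Pass the OCROptions object only if the caller supplied one
    if options is None:
        found = ocr_obj.FindRectByText(text)
    else:
        found = ocr_obj.FindRectByText(text, options)

    if not found:
        return None

    # Copy the region parameters that the OCRObject exposes after a match
    names = ("FoundLeft", "FoundTop", "FoundWidth",
             "FoundHeight", "FoundX", "FoundY")
    return {name: getattr(ocr_obj, name) for name in names}
```

Returning None when nothing is found lets the calling test decide whether a missing string is an error or an expected condition.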

TestComplete can recognize 52 lower-case and upper-case Latin characters, 10 digits, and 31 special characters in almost any font, size or style. Recognition of non-Latin characters, for instance, Chinese, Japanese or Korean characters or Cyrillic characters, is not supported. Also, TestComplete does not recognize text written with special-symbol fonts like Wingdings, Webdings or Marlett.

Font smoothing plays an important role in text recognition. It enhances the text’s appearance by drawing letters in multiple colors rather than one color. However, this decreases the accuracy of optical text recognition, so we recommend that you set the Standard smoothing method or disable font smoothing. By default, Windows uses the ClearType smoothing method, and because of this, TestComplete is unable to recognize text displayed with the default font settings. See Using Optical Character Recognition - Tips for instructions on how to change the system’s font settings.

To successfully read onscreen text, TestComplete has to create a common ground between the installed Windows fonts and the captured image of the application text. Recognizing any installed font on a Windows PC with dozens or even hundreds of installed fonts would waste valuable processing time. TestComplete creates and uses font collections to limit the readable fonts to the ones needed for the tested application. The default font collection is Arial, Courier New, Fixedsys, Garamond, MS Sans Serif, Segoe UI, System, Tahoma and Times Roman each in five sizes (8, 10, 12, 14, 16) and four styles (regular, bold, italic and underlined). You can also create a custom font collection with any combination of installed fonts, sizes, styles and colors.

To create a custom font collection and change other recognition settings, use methods and properties of the OCROptions object. To obtain this object, call the CreateOptions method of the OCRObject object.

If a custom font collection is used during character recognition, all fonts of the default font collection are ignored. However, you can still obtain the default settings, including the default font collection, via the OCRObject.DefaultOptions property. This allows you to quickly switch between custom and default settings.

If any of the fonts defined in the custom or default collection does not exist on the current computer, character recognition fails. So, if you want to run a TestComplete test that employs OCR on another computer, you have to make sure that this machine has all the fonts for that test.

You can use the default recognition settings, which saves you from creating the OCROptions object. However, using a custom font collection improves the recognition accuracy and speed.

General Recognition Procedure

The general recognition procedure includes the following steps:

  1. Obtain the scripting object of the window that contains the text to be recognized.

  2. Call the OCR.CreateObject method to create the OCRObject object.

  3. (Optional) Change the recognition settings. To do this, call the OCRObject.CreateOptions method. It will create a new OCROptions object. Then you can use this object to create a custom font collection and specify other recognition settings.

    To access the fonts collection, use the OCROptions.Fonts property.

    Each element of the fonts collection is a Font object that corresponds to a font in which the text may be written. For each font, you specify the name and the sizes and styles to be used for recognition. To do this, use the Font.Sizes and Font.Styles properties, which provide access to the collections of sizes and styles respectively. For instance, each element of the styles collection corresponds to a font style: regular, bold, italic or underlined. If the collection contains an element corresponding to a style, the OCR engine fills the character tables with data for that style. The regular font style does not require adding elements to the styles collection.

    There is a limitation of the OCR engine: font sizes for character recognition must be in the 6-30 range. Font sizes less than 6 or greater than 30 will cause errors.

  4. Call the OCRObject.GetText or OCRObject.FindRectByText method to perform the recognition.

The following code demonstrates the steps described above:

JavaScript, JScript

function Test()
{
  var p, w, OCRObj, OCROptions, Font, s;

  // Obtain the window containing the text to be recognized
  p = Sys.Process("MyApplication");
  w = p.Window("MyWndClass", "MyWndCaption", 1).Window("MyControlClass", "*", 1);

  // Create a new OCRObject object
  OCRObj = OCR.CreateObject(w);

  // Create a new OCROptions object to modify recognition settings
  OCROptions = OCRObj.CreateOptions();

  // The following code modifies the recognition settings

  // Add a new font to the Fonts collection
  Font = OCROptions.Fonts.Add();
  Font.Name = "Tahoma"; // Specify the font name
  Font.Styles.Add(1); // Add the Bold style
  Font.Styles.Add(2); // Add the Italic style
  Font.Sizes.Add(8); // Add the 8 pt size
  Font.Sizes.Add(12); // Add the 12 pt size

  // Add one more font to the Fonts collection
  Font = OCROptions.Fonts.Add();
  Font.Name = "Verdana"; // Specify the font name
  Font.Styles.Add(1); // Add the Bold style
  Font.Styles.Add(2); // Add the Italic style
  Font.Sizes.Add(8); // Add the 8 pt size
  Font.Sizes.Add(12); // Add the 12 pt size

  // Recognize text
  s = OCRObj.GetText(OCROptions);

  // Post results to the log
  Log.Message(s);
}

Python

def Test():

  # Obtain the window containing the text to be recognized 
  p = Sys.Process("MyApplication")
  w = p.Window("MyWndClass", "MyWndCaption", 1).Window("MyControlClass", "*", 1)

  # Create a new OCRObject object
  OCRObj = OCR.CreateObject(w)

  # Create a new OCROptions object to modify recognition settings
  OCROptions = OCRObj.CreateOptions()

  # The following code modifies the recognition settings

  # Add a new font to the Fonts collection
  Font = OCROptions.Fonts.Add()
  Font.Name = "Tahoma" # Specify the font name
  Font.Styles.Add(1) # Add the Bold style
  Font.Styles.Add(2) # Add the Italic style
  Font.Sizes.Add(8) # Add the 8 pt size
  Font.Sizes.Add(12) # Add the 12 pt size

  # Add one more font to the Fonts collection
  Font = OCROptions.Fonts.Add()
  Font.Name = "Verdana" # Specify the font name
  Font.Styles.Add(1) # Add the Bold style
  Font.Styles.Add(2) # Add the Italic style
  Font.Sizes.Add(8) # Add the 8 pt size
  Font.Sizes.Add(12) # Add the 12 pt size

  # Recognize text
  s = OCRObj.GetText(OCROptions)

  # Post results to the log
  Log.Message(s)

VBScript

Sub Test
  ' Obtain the window containing the text to be recognized
  Set p = Sys.Process("MyApplication")
  Set w = p.Window("MyWndClass", "MyWndCaption", 1).Window("MyControlClass", "*", 1)


  ' Create a new OCRObject object
  Set OCRObj = OCR.CreateObject(w)

  ' Create a new OCROptions object to modify recognition settings
  Set OCROptions = OCRObj.CreateOptions

  ' The following code modifies the recognition settings

  ' Add a new font to the Fonts collection
  Set Font = OCROptions.Fonts.Add
  Font.Name = "Tahoma" ' Specify the font name
  Font.Styles.Add 1 ' Add the Bold style
  Font.Styles.Add 2 ' Add the Italic style
  Font.Sizes.Add 8 ' Add the 8 pt size
  Font.Sizes.Add 12 ' Add the 12 pt size

  ' Add one more font to the Fonts collection
  Set Font = OCROptions.Fonts.Add
  Font.Name = "Verdana" ' Specify the font name
  Font.Styles.Add 1 ' Add the Bold style
  Font.Styles.Add 2 ' Add the Italic style
  Font.Sizes.Add 8 ' Add the 8 pt size
  Font.Sizes.Add 12 ' Add the 12 pt size

  ' Recognize text
  s = OCRObj.GetText(OCROptions)

  ' Post results to the log
  Log.Message s
End Sub

DelphiScript

procedure Test;
var p, w, OCRObj, OCROptions, Font, s;
begin
  // Obtain the window containing the text to be recognized
  p := Sys.Process('MyApplication');
  w := p.Window('MyWndClass', 'MyWndCaption', 1).Window('MyControlClass', '*', 1);

  // Create a new OCRObject object
  OCRObj := OCR.CreateObject(w);

  // Create a new OCROptions object to modify recognition settings
  OCROptions := OCRObj.CreateOptions;

  // The following code modifies the recognition settings

  // Add a new font to the Fonts collection
  Font := OCROptions.Fonts.Add();
  Font.Name := 'Tahoma'; // Specify the font name
  Font.Styles.Add(1); // Add the Bold style
  Font.Styles.Add(2); // Add the Italic style
  Font.Sizes.Add(8); // Add the 8 pt size
  Font.Sizes.Add(12); // Add the 12 pt size

  // Add one more font to the Fonts collection
  Font := OCROptions.Fonts.Add();
  Font.Name := 'Verdana'; // Specify the font name
  Font.Styles.Add(1); // Add the Bold style
  Font.Styles.Add(2); // Add the Italic style
  Font.Sizes.Add(8); // Add the 8 pt size
  Font.Sizes.Add(12); // Add the 12 pt size

  // Recognize text
  s := OCRObj.GetText(OCROptions);

  // Post results to the log
  Log.Message(s);
end;

C++Script, C#Script

function Test()
{
  var p, w, OCRObj, OCROptions, Font, s;

  // Obtain the window containing the text to be recognized
  p = Sys["Process"]("MyApplication");
  w = p["Window"]("MyWndClass", "MyWndCaption", 1)["Window"]("MyControlClass", "*", 1);

  // Create a new OCRObject object
  OCRObj = OCR["CreateObject"](w);

  // Create a new OCROptions object to modify recognition settings
  OCROptions = OCRObj["CreateOptions"]();

  // The following code modifies the recognition settings

  // Add a new font to the Fonts collection
  Font = OCROptions["Fonts"]["Add"]();
  Font["Name"] = "Tahoma"; // Specify the font name
  Font["Styles"]["Add"](1); // Add the Bold style
  Font["Styles"]["Add"](2); // Add the Italic style
  Font["Sizes"]["Add"](8); // Add the 8 pt size
  Font["Sizes"]["Add"](12); // Add the 12 pt size

  // Add one more font to the Fonts collection
  Font = OCROptions["Fonts"]["Add"]();
  Font["Name"] = "Verdana"; // Specify the font name
  Font["Styles"]["Add"](1); // Add the Bold style
  Font["Styles"]["Add"](2); // Add the Italic style
  Font["Sizes"]["Add"](8); // Add the 8 pt size
  Font["Sizes"]["Add"](12); // Add the 12 pt size

  // Recognize text
  s = OCRObj["GetText"](OCROptions);

  // Post results to the log
  Log["Message"](s);
}
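
Because font sizes outside the 6-30 range cause errors (see step 3), it can help to filter the sizes before adding them to Font.Sizes. The following Python sketch shows one way to do this. The helper names and constants are hypothetical and are not part of the TestComplete API; the code uses only the Fonts.Add, Font.Name, Font.Styles.Add and Font.Sizes.Add members shown in the samples above.

```python
MIN_FONT_SIZE = 6   # lower bound stated in this topic
MAX_FONT_SIZE = 30  # upper bound stated in this topic


def valid_font_sizes(sizes):
    """Return the sizes inside the 6-30 range, in order, without duplicates."""
    result = []
    for size in sizes:
        if MIN_FONT_SIZE <= size <= MAX_FONT_SIZE and size not in result:
            result.append(size)
    return result


def add_font(options, name, sizes, styles=()):
    """Add a font to OCROptions.Fonts using only supported sizes.

    `options` is expected to be an OCROptions object. `styles` holds the
    numeric style values used in the samples above (1 = bold, 2 = italic).
    Both helpers are hypothetical, not part of the TestComplete API.
    """
    font = options.Fonts.Add()
    font.Name = name
    for style in styles:
        font.Styles.Add(style)
    for size in valid_font_sizes(sizes):
        font.Sizes.Add(size)
    return font
```

With such a helper, each font block in the samples above could be reduced to a single call, for example add_font(OCROptions, "Tahoma", [8, 12], (1, 2)).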

Note: The first call to the OCRObject.GetText or OCRObject.FindRectByText method may take some time because TestComplete needs to create master character tables for all the fonts used in the recognition session. If the font settings are not changed, further calls to these methods run faster. If you change the font settings used in the recognition session, all of the master character tables are re-created.

How Recognition Works

The character recognition process in TestComplete consists of several steps. The first step, which is used to prepare an application’s screen image, is called fragmentation. Fragmentation helps to simplify the internal representation of the image, identify the recognizable elements, and separate the text fragments that are offset relative to each other. It locates the rectangular regions within the given color or grayscale screen image and tries to find several non-intersecting rectangular fragments, each with its own predominant color (typically, it is the background color).

Note: Sometimes fragmentation does not lead to the expected results. In this instance, you can disable it for the given recognition session.

Then TestComplete transforms the “fragmented” or entire screen image to a binary representation. Every pixel becomes either completely black or completely white, with no shades of gray. This transition of a color or grayscale image to a duotone image is called binarization. During binarization, each pixel of a color image (RGB) is first transformed to a grayscale pixel according to the formula K = 0.299R + 0.587G + 0.114B. Then the obtained value K is compared with the predefined Grayscale Binarization Threshold (0-255, default is 130) and the pixel is assumed to be either black or white. The simple black and white image of each character is the common ground used to compare the contents of the font collection to the contents of the application’s screen.
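
The binarization step can be illustrated with a short Python sketch that applies the grayscale formula and the default threshold of 130 given above. It is not the TestComplete implementation. In particular, the topic does not say which side of the threshold becomes black, so the direction used here (darker pixels become black) is an assumption, and the function names are hypothetical.

```python
DEFAULT_THRESHOLD = 130  # default Grayscale Binarization Threshold from this topic


def to_gray(r, g, b):
    """Convert an RGB pixel (0-255 per channel) to the grayscale value K."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(pixels, threshold=DEFAULT_THRESHOLD):
    """Map rows of RGB pixels to 0 (black) or 1 (white).

    Assumption: a pixel whose K is below the threshold becomes black and
    any other pixel becomes white. The topic does not state the direction
    or how a value equal to the threshold is treated.
    """
    return [[0 if to_gray(*px) < threshold else 1 for px in row]
            for row in pixels]
```

For example, a mid-gray pixel (128, 128, 128) yields K of about 128, which is below the default threshold, so under this assumption it becomes black, while (140, 140, 140) becomes white.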

TestComplete compares the duotone image of the character being recognized with each of the available master characters and assumes that the image represents the master character that differs least from it. So, if an acceptable match is found, the character is “read”. If the difference between the character image and the nearest master character is larger than a predefined value, the Recognition Rejection Threshold, TestComplete assumes that the character is not recognized. The value of the Recognition Rejection Threshold can be either set manually or calculated automatically according to the formula M = 0.05 + 0.02 × FontNamesNum, where FontNamesNum is the number of fonts in the user-defined collection. If the default font collection is used or the number of fonts in the user-defined collection is larger than 7, FontNamesNum is assumed to be 7.
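
The automatic threshold calculation can be written out as a small Python sketch. The function names are hypothetical, and is_recognized assumes that the difference value is expressed on the same scale as M, which this topic does not define.

```python
MAX_COUNTED_FONTS = 7  # cap on FontNamesNum stated in this topic


def rejection_threshold(custom_font_count=None):
    """Compute M = 0.05 + 0.02 * FontNamesNum.

    Pass None when the default font collection is used. FontNamesNum is
    then taken as 7, as it is for custom collections with more than 7 fonts.
    """
    if custom_font_count is None or custom_font_count > MAX_COUNTED_FONTS:
        font_names_num = MAX_COUNTED_FONTS
    else:
        font_names_num = custom_font_count
    return 0.05 + 0.02 * font_names_num


def is_recognized(difference, custom_font_count=None):
    """Return True if the difference from the nearest master character
    does not exceed the automatically calculated threshold.

    Assumption: `difference` uses the same scale as M.
    """
    return difference <= rejection_threshold(custom_font_count)
```

For a custom collection with two fonts, such as the Tahoma and Verdana collection in the samples above, the formula gives M = 0.05 + 0.02 × 2 = 0.09, compared with 0.19 for the default collection. A smaller collection therefore results in a stricter automatic threshold.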

When TestComplete searches for occurrences of the specified text, that is, when the OCRObject.FindRectByText method is called, it can search for an exact match of the given string or for a string that is similar to the given string. This behavior is controlled by the OCROptions.ExactSearch setting. The required degree of string similarity is set by the OCROptions.SearchAccuracy option.
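
The difference between exact and similarity-based search can be illustrated with Python's standard difflib module. This sketch does not reproduce the TestComplete matching algorithm. The matches function is hypothetical, and the 0-to-1 accuracy scale used here is an assumption made for the illustration, not the documented range of OCROptions.SearchAccuracy.

```python
from difflib import SequenceMatcher


def matches(candidate, target, exact_search=True, search_accuracy=0.8):
    """Decide whether recognized text `candidate` counts as a match for `target`.

    With exact_search=True only identical strings match. Otherwise the
    strings match when their difflib similarity ratio reaches
    `search_accuracy` (an assumed 0-to-1 scale).
    """
    if exact_search:
        return candidate == target
    ratio = SequenceMatcher(None, candidate, target).ratio()
    return ratio >= search_accuracy
```

A typical case where a similarity-based search helps is a single misread character, for example "Cance1" recognized instead of "Cancel": the exact comparison fails, while the similarity comparison in this sketch succeeds.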

Improving Recognition Accuracy

In general, optical character recognition tools do not guarantee 100% recognition accuracy in all cases, and the TestComplete OCR engine is no exception. Font style, size, color and some other factors may lead to erroneous text recognition. For more information on these factors and on how to improve the recognition accuracy, see Recognition Tips.

One Alternative to OCR

In most cases, the use of the OCR engine is part of a bigger test that compares the recognized text against a baseline value or simulates user actions over a control that contains this text. That is, you use the OCR engine to find an object or its text and then perform other test actions.

The TestComplete OCR engine tries to “read” the text displayed on the screen. The recognition accuracy depends on the font that is used to display the text, the text size, its style and attributes. TestComplete also includes the Text Recognition plugin, which works on a different principle: it intercepts certain calls to Windows API functions and recognizes objects within tested applications by their text.

The Text Recognition plugin works faster than the OCR engine and, if used, provides 100% recognition accuracy. So, before using the OCR methods, we recommend that you try working with your text objects by using the Text Recognition plugin. For more information, see Optical Character Recognition vs. Text Recognition Technology.

Using OCR in Mobile Applications

The OCR engine can work with Android and iOS applications in the same way it works with desktop applications. However, the OCR plugin does not work reliably with mobile devices, so we recommend using other approaches for working with mobile applications.

See Also

Optical Character Recognition
