Integrating Regex in Robotic Process Automation (RPA)

What is Regex?

A regular expression (regex or regexp, sometimes called a rational expression) is a sequence of characters that forms a search pattern.

To a layperson it can look like sorcery: something not fully understood, yet worth using anyway, because it is a very powerful tool when applied to the right problem.

Typical uses of regex:
  • Input Validation
  • String Parsing
  • Syntax Highlighting
  • Data Scraping
  • Search
  • String Manipulation
  • Data Mapping

Currency formats vary across invoices. Suppose you need to identify amounts written in the following three formats:

  • $3,000,000.00
  • $3,000,000
  • 3,000,000

This looks like a big challenge to anyone who doesn’t know regex. Once you know it, though, regex is the natural first choice for a task like this.

Here is the pattern you would end up with:

\$?\b(\d{1,3}(,?\d{3})*(\.\d{2})?)\b

With this pattern in place, the task becomes a cakewalk!
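
To see the pattern at work, here is a minimal sketch in Python. Python is used purely for illustration, the helper name find_amounts and the sample sentence are made up for this example, and the pattern itself is copied unchanged from above.

```python
import re

# The currency pattern from the article, unchanged.
CURRENCY = re.compile(r"\$?\b(\d{1,3}(,?\d{3})*(\.\d{2})?)\b")


def find_amounts(text):
    """Return every substring of text matched by the currency pattern."""
    # group(0) is the whole match, including an optional leading "$".
    return [m.group(0) for m in CURRENCY.finditer(text)]


# Hypothetical sample text containing the three formats listed above.
sample = "Invoice A: $3,000,000.00; invoice B: $3,000,000; invoice C: 3,000,000"
amounts = find_amounts(sample)
print(amounts)  # ['$3,000,000.00', '$3,000,000', '3,000,000']
```

Note that the comma is optional in the pattern (,?), so an amount written as 3000000 would be picked up as well.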

Regex in Automation

The main aim of this post is to show how this “sorcery” called regex can be integrated into an automation tool such as UiPath. Nearly all popular programming languages today, such as Java, C, C++, C#, VB.NET, PHP, and Python, support regex natively or through libraries. This makes regex easy to use in any automation tool built on such a language.

Regex in UiPath

UiPath supports both VB.NET and C#, though at the time of writing only VB.NET can be used inside a workflow. This is a limitation of Microsoft Windows Workflow Foundation, on which UiPath Studio is based. Custom activities, however, can be coded in either language, and both support regex.

There are three ways of using regex in UiPath:

  1. Using predefined activities. Three of them make use of regex: Matches, Is Match, and Replace, found under the Programming activities of the Core.Activities package.
  2. Creating custom activities.
  3. Using .NET methods (either static methods of the Regex class or methods of a Regex object) directly in the workflow.

Predefined Activities

The Matches activity searches an input string for all occurrences of a regular expression (supplied as a string) and returns the successful matches as an IEnumerable of System.Text.RegularExpressions.Match. This differs from the .NET method Regex.Matches(input, pattern), which returns a System.Text.RegularExpressions.MatchCollection.

USE: [image: Regex2]

OUTPUT: [image: Regex3]
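
The "find every occurrence" behaviour is easy to try outside UiPath. The sketch below uses Python's re.finditer as a stand-in for the Matches activity. It is an analogy rather than the activity itself, and the helper all_matches and the sample input are assumptions made for this example.

```python
import re


def all_matches(text, pattern, flags=0):
    """Collect every non-overlapping match object for pattern in text."""
    return list(re.finditer(pattern, text, flags))


# Hypothetical input: three purchase-order numbers in one line of text.
sample = "PO-1042 shipped, PO-1077 pending, PO-9 cancelled"
matches = all_matches(sample, r"PO-\d+")

# Each match object exposes the matched text and its position in the input.
values = [m.group(0) for m in matches]
print(values)  # ['PO-1042', 'PO-1077', 'PO-9']
for m in matches:
    print(m.start(), m.group(0))
```

As with the activity, the result is a collection you can loop over, and each item carries both the matched value and where it was found.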

The Is Match activity has the same properties as the Matches activity, except for its return value: a Boolean indicating whether the regular expression finds a match in the input string under the specified matching options.

The Replace activity has all the properties of the two activities above plus one for the replacement string. It returns a string in which every matching substring has been replaced by the replacement string.
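
A rough Python equivalent of these two behaviours is sketched below. The function names is_match and replace_all and the sample card number are made up for illustration; they are not UiPath or .NET names.

```python
import re


def is_match(text, pattern, flags=0):
    """True if pattern is found anywhere in text, otherwise False."""
    return re.search(pattern, text, flags) is not None


def replace_all(text, pattern, replacement, flags=0):
    """Return text with every match of pattern replaced by replacement."""
    return re.sub(pattern, replacement, text, flags=flags)


# Hypothetical input line containing a test card number.
line = "Card 4111-1111-1111-1111 charged"

has_card = is_match(line, r"\d{4}(-\d{4}){3}")

# Keep only the last four digits. Python refers to a captured group as \1
# in the replacement string; .NET replacement strings use $1 instead.
masked = replace_all(line, r"\d{4}-\d{4}-\d{4}-(\d{4})", r"****-****-****-\1")

print(has_card)  # True
print(masked)    # Card ****-****-****-1111 charged
```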


In all the above activities, one of the common properties is RegexOption.


This property uses the RegexOptions enumeration of the System.Text.RegularExpressions namespace, which provides the values used to set regular expression options. The enumeration carries the FlagsAttribute, so its members can be combined bitwise, and that is exactly what the RegexOption property holds: a bitwise combination of enumeration values that specify matching options. More information about the enumeration and its members is available here: https://msdn.microsoft.com/en-us/library/hh454386.aspx
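
The idea of combining options bitwise is not specific to .NET. As a sketch, Python's re module has comparable flags that are combined with the | operator, much as RegexOptions members are combined in VB.NET or C#. The flag names below are Python's, and the pattern and sample text are invented for this example.

```python
import re

# Two options combined into one value with a bitwise OR.
options = re.IGNORECASE | re.MULTILINE

# With MULTILINE, ^ and $ match at every line boundary;
# with IGNORECASE, "total" also matches "Total" and "TOTAL".
pattern = re.compile(r"^total:\s*(\d+)$", options)

text = "Total: 10\nTOTAL: 25\nsubtotal: 5"
totals = [int(m.group(1)) for m in pattern.finditer(text)]
print(totals)  # [10, 25]

# The same pattern without any options finds nothing in this text.
without_flags = re.findall(r"^total:\s*(\d+)$", text)
print(without_flags)  # []
```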

Custom Activities

As mentioned earlier, you can create your own UiPath activities that integrate regex through the .NET methods of the Regex class, such as Escape(string), IsMatch(string), Matches(string), and Match(string). More of these methods are listed here: https://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex_methods(v=vs.110).aspx

Creating a Regex Custom Activity

Here is an example of integrating regex in UiPath by means of a custom activity, “Match regex”. It uses the .NET method System.Text.RegularExpressions.Regex.Match(string, string), which returns the first occurrence of the given regular expression pattern in the specified input string.
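
The core logic of such an activity, returning only the first occurrence, can be sketched in a few lines of Python. This is not the custom activity's actual code, which would be written in VB.NET or C#. The function name match_regex and the sample input are hypothetical.

```python
import re


def match_regex(text, pattern):
    """Return the first occurrence of pattern in text, or None if absent."""
    m = re.search(pattern, text)  # search stops at the first match
    return m.group(0) if m else None


# Hypothetical input with three invoice references; only the first is returned.
first = match_regex("Ref INV-001, INV-002 and INV-003", r"INV-\d{3}")
print(first)  # INV-001
```

Returning None when nothing matches is a design choice for this sketch. In .NET, Regex.Match always returns a Match object, and its Success property tells you whether anything was found.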

 


.NET Methods

You can call the static .NET methods of the Regex class, or the member methods of a Regex object, directly in the workflow instead of using activities. Only three predefined activities (discussed in detail above) make use of regex, whereas many more regex .NET methods are available.

However, using a predefined activity or creating a custom activity makes the logic easier to reuse.
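
The distinction between class-level methods and methods on a regex object exists in other languages too. As an illustrative sketch, Python offers module-level functions (comparable to static methods of the Regex class) and compiled pattern objects (comparable to a Regex instance). The example also uses re.escape, which plays a role similar to Regex.Escape. The sample text is an assumption.

```python
import re

text = "Price: 3.50 (was 3x50)"

# Escape the literal so that "." means a dot and not "any character".
literal = re.escape("3.50")

# Module-level call, comparable to a static method on the Regex class.
static_hits = re.findall(literal, text)

# Compiled pattern object, comparable to calling a method on a Regex instance.
compiled = re.compile(literal)
object_hits = compiled.findall(text)

# Without escaping, "." matches any character, so "3x50" is picked up too.
unescaped_hits = re.findall("3.50", text)

print(static_hits)     # ['3.50']
print(object_hits)     # ['3.50']
print(unescaped_hits)  # ['3.50', '3x50']
```

A compiled object is convenient when the same pattern is applied many times, for example inside a loop over invoice lines.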

In conclusion, regex can be integrated with RPA platforms such as UiPath, Blue Prism, and Automation Anywhere. The primary requirement is that the language underlying the platform supports regex. Beyond that, if coding is allowed within the tool, regex can be used whether or not a dedicated regex activity (as in UiPath) is present.

Siddharth Ghosh

Currently working as an RPA developer, Siddharth strives for versatility, having some experience with the UI technologies like AngularJS and Bootstrap, and a decent knowledge of Regex. He's an ardent aficionado of filmmaking as an art form, short-story writing, and tennis.