You can use text functions to manipulate text strings in Excel. However, you can't use them with regular expressions. As of writing this article VBA is your only option. In this guide, we’re going to show you how to use regular expressions in Excel.
What is a regular expression?
A regular expression (also known as regex or regexp shortly) is a special text string for specifying a search pattern. They are like wildcards. Instead of specifying the number of characters, you can create patterns to find a specific group of characters, like searching between "b" to "o", using OR logic, excluding some characters, or repeating values.
Regular expressions are commonly used for text parsing and replacing operations for all programming languages. To use regular expressions in Excel, we will be using VBA.
Pattern | Description | Samples |
^jack | begins with "jack" | jack-of-all-trades, jack's house |
jack$ | ends with "jack" | hijack |
^jack$ | is exactly "jack" | jack |
colo[u]{0,}r | can include "u" at least 0 times | colour, color (not colur) |
col[o|u]r | includes either "o" or "u" | color, colur (not colour) |
col[^u]r | accepts any character except "u" | color (not colur or colour) |
How to use regular expressions
Let's start using regular expressions in Excel by opening VBA. Press Alt + F11 keys to open VBA (Visual Basic for Applications) window. Add a module to enter your code.
Next step is to add regular expression reference to VBA. Click Tools > References in the VBA toolbar. Find and check Microsoft VBScript Regular Expressions 5.5 item in the References window. Click OK to add the reference.
Using the VBScript reference, we can create a regular expression object, which is defined as RegExp in VBA. A RegExp object has 4 properties and 3 methods:
Properties
Name | Type | Description |
Global | Boolean | Set True to find all cases that match with the pattern. Set False to find the first match. |
IgnoreCase | Boolean | Set True to not make case-sensitive search. Set False to make case-sensitive search. |
Multiline | Boolean | Set True if your string has multiple lines and you want to perform the search in all lines. |
Pattern | String | The regular expression pattern you want to search. |
Methods
Name | Arguments | Description |
Execute | sourceString As String | Returns an array that contains all occurrences of the pattern matched in the string. |
Replace | sourceString As String replaceVar As Variant | Returns a string which all occurrences of the pattern in the string are replaced with the replaceVar string. |
Test | sourceString As String | Returns True if there is a match. Otherwise, False. |
Code Samples
A function that returns TRUE/FALSE if the pattern is found in a string
Public Function RegExFind(str As String, pat As String) As Boolean
'Define the regular expression object
Dim RegEx As New RegExp
'Set up regular expression properties
With RegEx
.Global = False 'All occurences are not necessary since a single occurence is enough
.IgnoreCase = True 'No case-sensitivty
.MultiLine = True 'Check all lines
.Pattern = pat 'pattern
End With
RegExFind = RegEx.Test(str)
End Function
After writing the code, you can use this function as a regular Excel function.
A function that replaces the pattern with a given string
Public Function RegExReplace(str As String, pat As String, replaceStr As String) As String
'Define the regular expression object
Dim RegEx As New RegExp
'Set up regular expression properties
With RegEx
.Global = False 'All occurences are not necessary since a single occurence is enough
.IgnoreCase = True 'No case-sensitivty
.MultiLine = True 'Check all lines
.Pattern = pat 'pattern
End With
RegExReplace = RegEx.Replace(str, replaceStr) 'Return the modified string with replacement value
End Function
The following sample shows how to replace strings that start with "col", continue with 0 or 1 occurrences of "o" and single "u", and finally ends with an "r" character with "Color" string.