REGEXREPLACE( ) function

Replaces all instances of strings matching a regular expression with a new string.

Syntax

REGEXREPLACE(string, pattern, new_string)

Parameters

string

Character. The field, expression, or literal value to test for a matching pattern.

pattern

Character. The pattern (regular expression) to search for. The pattern parameter can contain literal characters, metacharacters, or a combination of the two. Literal characters include all alphanumeric characters, some punctuation characters, and blanks. For a list of metacharacters, see REGEXFIND( ) function. The pattern parameter must be enclosed in quotation marks. The search is case-sensitive, which means that uppercase and lowercase alpha characters must be explicitly specified.

new_string

Character. The string to use to replace all values matching pattern. The replacement string can contain literal characters, groups of characters from the original string (using the $int element), or a combination of the two. The new_string parameter must be enclosed in quotation marks.

Output

Character.

Remarks

The REGEXREPLACE( ) function allows you to construct a regular expression to find matching patterns in data, and to replace the matching values with a new string. For example:

REGEXREPLACE(character_field, "\s+", " ")

standardizes spacing in character data by replacing one or more spaces between text characters with a single space.

The search portion of the REGEXREPLACE( ) function is identical to the REGEXFIND( ) function. For detailed information about the search capability common to both functions, see REGEXFIND( ) function.

You can use the $int element to replace characters with themselves, which allows you to preserve the meaningful parts of data, while standardizing or omitting surrounding or intermixed data. Several examples using telephone numbers and names appear below. To use the $int element you must first create groups by using parentheses in the pattern parameter. For more information, see REGEXFIND( ) function.

You can avoid sequential character matching, and replace substrings regardless of their position in relation to one another, by nesting REGEXREPLACE( ) functions. For example:

Each instance of REGEXREPLACE( ) makes a pass through the target string, starting with the innermost instance.

When you use the REGEXREPLACE( ) function to create a computed field, the computed field length is identical to the original field length. If the replacement string length exceeds the target string length, overall string length increases, which results in truncation if the computed field cannot accommodate the increased string length. Characters that trail the target string are truncated first, followed by trailing replacement string characters. The examples below illustrate how truncation can result:

string

pattern

new_string

Field length

Result

x123x

123

A

5

xAx

x123x

123

ABC

5

xABCx

x123x

123

ABCD

5

xABCD

x123x

123

ABCDE

5

xABCD

x123x

123

ABCDE

6

xABCDE

x123x

123

ABCDE

7

xABCDEx

To avoid truncation, use the SUBSTR( ) function to increase field length. For example:

REGEXREPLACE("x123x","123","ABCDE") = xABCD

but,

REGEXREPLACE(SUBSTR("x123x",1,10),"123","ABCDE") = xABCDEx

Numerous resources explaining regular expressions are available on the Internet. ACL uses the ECMAScript implementation of regular expressions. Most implementations of regular expressions use a common core syntax.

Examples

Example Return value

REGEXREPLACE("AB CD   EF", "\s+", " ")

AB CD EF

REGEXREPLACE(character_field, "\s+", " ")

Returns the character field data with the spacing between words standardized on a single space.

REGEXREPLACE(character_field, "\s+", BLANKS(1))

Returns the character field data with the spacing between words standardized on a single space.

Using the BLANKS( ) function in the new_string parameter, rather than a literal space, makes spaces easier to read and less likely to be overlooked.

REGEXREPLACE(SUBSTR("1234567890",1,14), "(\d{3})[\s-]*(\d{3})[\s-]*(\d{4})", "($1) $2-$3")

(123) 456-7890

The formatting of the telephone number ‘1234567890’ is standardized.

REGEXREPLACE(Telephone_Number, ".*(\d{3})[\s-\.\)]*(\d{3})[\s-\.]*(\d{4})", "($1) $2-$3")

Returns the numbers in the Telephone_Number field with their formatting standardized.

REGEXREPLACE("Tel num: 123-456-7890 (office)", "(.*)(\d{3}[\s-\)\.]*\d{3}[\s-\.]*\d{4})(.*)", "$2")

123-456-7890

The telephone number ‘123-456-7890’ is extracted from the surrounding text.

REGEXREPLACE(Comment, "(.*)(\d{3})[\s-\)\.]*(\d{3})[\s-\.]*(\d{4})(.*)", "($2) $3-$4")

Extracts telephone numbers from surrounding text in the Comment field and standardizes their formatting.

REGEXREPLACE(REGEXREPLACE(REGEXREPLACE("1ABC-123aa","\d","9"),"[a-z]","x"), "[A-Z]", "X")

9XXX-999xx

REGEXREPLACE(REGEXREPLACE(REGEXREPLACE(Invoice_Number,"\d","9"),"[a-z]","x"), "[A-Z]", "X")

Returns the format of all identifiers in the Invoice_Number field.

REGEXREPLACE("Smith, John David", "^(\w+),(\s\w+)(\s\w+)?(\s\w+)?", "$2$3$4 $1")

John David Smith

REGEXREPLACE(Full_Name, "^(\w+),(\s\w+)(\s\w+)?(\s\w+)?", "$2$3$4 $1")

Returns the names in the Full_Name field in their regular order: First (Middle) (Middle) Last

Note: Name data can present various complications, such as apostrophes in names. Accounting for variations in name data typically requires more complex regular expressions than the one provided in this example.

Related reference
REGEXFIND( ) function
SUBSTR( ) function


(C) 2015 ACL Services Ltd. All Rights Reserved.