Cleaning and filtering data at the same time

The two previous tutorials showed you how to use functions to filter data and to clean data. Now we'll look at how you can nest functions to perform both tasks simultaneously.

Nested functions

You can nest one function inside another function to achieve results that you couldn't achieve with either function alone.

Basic structure

Here is the basic structure of a nested function with one level of nesting:

FUNCTION_2( FUNCTION_1(function_1 input) , function_2 input)

You can see that FUNCTION_1( ) is completely contained inside FUNCTION_2( ).

Order of evaluation

Nested functions are evaluated starting with the innermost function and working outward to the outermost function. So, in the generic example above:

  1. FUNCTION_1(function_1 input) is evaluated first.
  2. The output of FUNCTION_1( ) becomes one of the inputs for FUNCTION_2( ).
  3. FUNCTION_2( ) is evaluated second.

Think about inputs and outputs

Generally speaking, you can nest any Analytics function inside another function, and if required build multiple levels of nesting.

However, keep in mind that the output of a function must meet the input requirements of the function containing it. For example, if a function requires a date input, the function that it contains must output a value with a datetime data type.

Key point

Nesting functions is a powerful and flexible capability that can allow you to achieve a great range of useful results. You can perform multiple transformations of source data simultaneously in preparation for inputting the data to a command.

Standardize case, and filter by multiple values

In a previous tutorial we used the MATCH( ) function to filter by multiple values. MATCH( ) is case sensitive, so if the case of the input values varies, the filter produces inaccurate results.

You can produce accurate results by nesting the UPPER( ) function inside the MATCH( ) function.

Example

You want to use the Vendor_City field to filter records in a table, but the city names have been entered inconsistently. Some have an initial capital ("Austin"), and some are entirely uppercase ("AUSTIN").

You can nest the UPPER( ) function inside the MATCH( ) function to:

  1. transform all values in the Vendor_City field to uppercase
  2. filter the records by city

    Note that you have to adjust your filter terms to be all uppercase so that they match the uppercase values output by the UPPER( ) function.

    MATCH( UPPER(Vendor_City) , "AUSTIN", "CHICAGO")

The table below illustrates the difference between using the MATCH( ) function alone, and using the nested function.

With MATCH( ) alone, the filter is too restrictive and excludes records that should be included.

Returned by: Returned by:
MATCH(Vendor_City, "Austin", "Chicago") MATCH( UPPER(Vendor_City), "AUSTIN", "CHICAGO")
Austin Austin
Chicago Chicago
  AUSTIN
  CHICAGO

Tip

Instead of using a nested function, you could add variations to the filter terms:  MATCH(Vendor_City, "Austin", "AUSTIN", "Chicago", "CHICAGO"). However, with additional filter terms this approach quickly becomes labor-intensive, and would fail to capture values with typos, such as "AUstin". Nesting UPPER( ) is the better approach.

Note

To apply the MATCH( ) or UPPER( ) functions to the Vendor_City field, you create a computed field that uses the function. Computed fields are discussed in a subsequent tutorial.

Standardize case, remove leading spaces, and filter by multiple values

You aren't limited to just one level of nesting. You can create multiple levels of nesting, based on your requirements.

Keep in mind:

  • Nested functions are evaluate from the innermost function to the outermost function.
  • The output of a function must meet the input requirements for the function containing it.

Example

In a second situation, the data in the Vendor_City field is even less consistent. Not only is case inconsistent, but some values are preceded by one or more blank spaces and some are not.

You can nest the UPPER( ) function inside the ALLTRIM( ) function, and the ALLTRIM( ) function inside the MATCH( ) function to:

  1. transform all values in the Vendor_City field to uppercase
  2. remove all leading blank spaces
  3. filter the records by city

    MATCH( ALLTRIM( UPPER(Vendor_City) ), "AUSTIN", "CHICAGO")

    Tip

    It's easy to lose track of opening and closing parentheses when building nested functions. Missing or unmatched parentheses are a common cause of function errors.

    The number of opening parentheses ( must always equal the number of closing parentheses ). In the example above, there are three opening parentheses and three closing parentheses.

The table below illustrates the difference between using the MATCH( ) function alone, and using the nested function.

With MATCH( ) alone, the filter is too restrictive and excludes records that should be included.

Returned by: Returned by:
MATCH(Vendor_City, "Austin", "Chicago") MATCH(ALLTRIM( UPPER(Vendor_City) ), "AUSTIN", "CHICAGO")
Austin Austin
Chicago Chicago
  AUSTIN
  CHICAGO
  [ ] Austin
  [ ] [ ] [ ] Chicago
  [ ] [ ] AUSTIN
  [ ] CHICAGO
  [ ] = blank space

Where to next?

If you have completed all the tutorials in How to use functions, you are ready to move on to Advanced use of functions.

The advanced tutorials teach you how to use functions with core Analytics features.