Combining data

Analytics allows you to analyze data in only one table at a time. For this reason, you may have to combine data from two or more tables into one table before performing your analysis.

Analytics provides the following methods for combining data:

Combine records Combine fields
  • Append
  • Extract/Append
  • Merge
  • Join
  • Relate

The nature of the source data, or your analysis goal, dictates which method of combining data you should use. The five methods are described briefly below.

Append

When you append tables, you combine records from two or more tables into a new table that contains all the records from the appended tables. You have the option of including all the fields from the appended tables, or only the common fields.

Example

Scenario

You want to perform analysis on an entire year's worth of data but the data is spread among twelve monthly transaction tables.

Approach

You append the data from the twelve monthly tables into a single annual table containing all the data, and then perform the analysis.

Detailed information

For detailed information, see Appending tables.

Extract/Append

When you extract and append data, you extract records from one table and append them to the end of another table. Extracting is the same as copying, and appending is the same as adding.

You also have the option of extracting a subset of the fields in a record rather than the entire record.

The table you append to (the target table) is increased in size. A new table is not created.

Example

Scenario

You want to perform analysis on an entire set of employee records but records for new employees are not yet contained in the Employee master table.

Approach

You extract the records for new employees and append them to the end of the Employee master table, and then perform the analysis.

Detailed information

For detailed information, see Extracting and appending data.

Merge

When you merge tables, you interfile records from two sorted tables into a new, third table, which is also sorted. Interfiling means to combine records in accordance with their existing sort order.

Example

Scenario

You want to perform analysis on an entire set of employee records but the records are split between two divisional Employee tables.

Both tables are sorted by last name and you want to avoid the overhead of resorting the records once they are combined.

Approach

You merge the records from the two tables into a new third table. Merging preserves the sorting by last name.

Detailed information

For detailed information, see Merging tables.

Join

When you join tables, you use a common key field to incorporate records, or a selection of fields, from two tables into a new, third table. A common key field is an identifying field, such as Employee ID, that appears in both of the tables being joined.

Example

Scenario

You want to identify any vendors who are also employees as one way of analyzing data for possible improper payments.

Approach

You join the Vendor master table with the Employee table, using the common key field of Address.

The joined output table contains any vendors and employees with the same address.

Detailed information

For detailed information, see Joining tables.

Relate

When you relate tables, you virtually join up to 18 tables. You use a common key field to relate each table pair.

Relating, or virtually joining, means to create a temporary programmatic association between tables that allows you to access the data in the tables as if it existed in a single physical table. However, no physical table is created, and you can unrelate the source tables at any point.

A common key field is an identifying field, such as Employee ID, that appears in each table pair being related. Typically, you use a different common key field for each table pair.

Example

Scenario

You want to create a sales report that contains details about the customers, and the products sold, for the month of March, but the data is spread across three tables.

Approach

You relate the Customer master table with the Orders table, and the Orders table with the Product master table, to create a temporary association of tables that contains all the information you need for the report:

  • customer name and location from the Customer master table
  • order details from the Orders table
  • product details from the Product master table

Detailed information

For detailed information, see Relating tables.

Which data combining method should I use?

There can be more than one consideration when it comes to selecting a data combining method. You can use the guidelines below as a starting point.

Use... If...
Append
  • You want to combine multiple tables using the least amount of labor.
  • The records in the source tables are similar or exactly identical in structure.
Extract/Append
  • The records or fields in the two source tables are exactly identical in structure.
Merge
  • The records in the two source tables are exactly identical in structure.
  • Both source tables are sorted, using an identical sort order.

Tip

Merging can be tricky to perform correctly. You can get the same result by appending, or by extracting and appending, and then sorting.

If the two source tables are already sorted, merging is more efficient and can execute more quickly.

Join
  • The records in the two source tables have different record structures.
  • You want to include or exclude records based on matched or unmatched values in a common key field.
  • You are doing investigative analysis that requires a physical, joined table.
Relate
  • You want to relate, or virtually join, up to 18 tables with different record structures.
  • You want to include or exclude records based on matched or unmatched values in common key fields.
  • You do not need to output the combined data to a new table.
  • You are doing informational work, like reporting, that requires only a temporary association between tables.

Tip

If required, after relating tables you can perform a separate operation and extract any combination of fields from the related tables to a new physical table.

Data structure

When you combine data, the method you choose is often dependent on how the source data is structured. Data structure, or record structure, refers to the data elements contained in a record, their data type, the length of fields, and the number and order of columns.

For detailed information about data structure, see Data structure and data format requirements.

You may need to experiment

In some situations, it may not be immediately obvious which method to use for combining data. You may need to experiment with a small subset of the data to determine which method is best suited to the task you want to perform.

Using a small subset allows you to avoid longer processing times associated with larger tables, and may also make it easier to see patterns.

Using more than one method of combining data to achieve your goal

You may be able to address more complex data combining situations, involving multiple tables, by first using one method of combining data, and then using a second method with the output results of the first method.

Example

  1. You first compile an annual transaction table by combining monthly transaction tables.
  2. You then use a common key field such as Customer ID to join the annual transaction table with a master table containing data such as Customer Name.