Summarizing data

Concept Information

SUMMARIZE command

Summarizing groups the records in a table based on identical values in one or more key fields, and counts the number of records in each group. You also have the option of performing various statistical calculations for each group.

If you summarize by more than one key field (nested summarizing), groups are based on identical combinations of values across the key fields.

Key fields can be character, numeric, or datetime.

Summarizing by one key field

Summarizing by one key field is the simplest form of summarizing.

For example, you could summarize a transactions table on the customer number field to find the total number of transactions for each customer.

In the example below, there are ten values in the Customer Number field in the input table. Some values are unique, and some values are identical. After summarizing, the values are grouped into four unique groups. The Count tells you how many records, or transactions, are in each customer number group.

Input table Output results
Key field: Customer Number Summarized group Count
795401 230575 1
518008 518008 5
518008 795401 3
925007 925007 1
518008    
795401    
518008    
230575    
795401    
518008    

Summarizing by multiple key fields

Summarizing by multiple key fields, or nested summarizing, lets you perform more detailed analysis of data.

For example, you could summarize a transactions table on the customer number and the transaction date fields to find the total number of transactions for each customer for each date that the customer had transactions.

In the example below, there are ten values in the Customer Number field in the input table, with accompanying dates in the Invoice Date field. Some combinations of customer number and date are unique, and some are identical. After summarizing, the customer number-date combinations are grouped into seven unique groups. The Count tells you how many records, or transactions, are in each group.

Input table Output results
Key field 1: Customer Number Key field 2: Invoice Date Nested summarized group Count
795401 08/20/2016 230575 06/13/2016 1
518008 10/15/2016 518008 04/30/2016 1
518008 07/17/2016 518008 07/17/2016 3
925007 05/21/2016 518008 10/15/2016 1
518008 04/30/2016 795401 06/30/2016 1
795401 08/20/2016 795401 08/20/2016 2
518008 07/17/2016 925007 05/21/2016 1
230575 06/13/2016      
795401 06/30/2016      
518008 07/17/2016      

Nested summarizing in detail

If you summarize by more than one key field, you create nested summarized groups in the output results.

Nesting hierarchy

The order in which you select the key fields dictates the nesting hierarchy. The records are summarized by the first field you select, and within each of these primary groupings the records are then summarized by the second field you select, and so on.

Note

Reversing the order in which you select two summarize key fields may give quite different results.

Field order in the output results

The order in which you select the key fields is also the order in which the columns appear in the output results. If you are appending results to an existing Analytics table, the column selection and order must be identical to the column selection and order in the existing table.

Summarizing and sorting

Summarizing can process either sorted or unsorted data. The Presort option allows you to include initial sorting of the data with summarizing.

If you use Presort

If you use Presort, the output results are sorted and contain a single, unique group for each set of identical values, or identical combination of values, in the key field or fields.

Tip

If the input table is already sorted, you can save processing time by deselecting the Presort option.

If you do not use Presort

If you do not use Presort, the output results use the sort order of the input table.

If the key field or fields contain non-sequential identical values, the output results contain more than one group for each set of identical values, or identical combination of values.

Note

Depending on the context, more than one group for each set of identical values, or identical combination of values, can defeat the purpose of summarizing.

Subtotaling numeric fields

When summarizing, you can optionally subtotal one or more numeric fields. In the examples above, you could subtotal the transaction amount field to calculate:

  • The total transaction amount for each customer

  • The total transaction amount for each customer for each date that the customer had transactions

The statistical options

You have the option of performing statistical calculations on any subtotal field you specify. The statistical calculations are broken down by group in the output results.

In the examples above, if you subtotal the transaction amount field, you could also use one of the statistical options to calculate:

  • the average, minimum, and maximum transaction amounts for each customer
  • the average, minimum, and maximum transaction amounts for each customer for each date that the customer had transactions

Subtotal and statistical options in detail

The table below provides details about the subtotal and statistical options and calculations.

The Other Fields option

The Other Fields option allows you to select additional character, numeric, or datetime fields to include in the output. This option can provide useful information if the fields you select contain the same value for all records in each summarized group.

For example, if you summarize a table on customer number, an appropriate “other field” could be customer name. The customer name should be identical for all records with the same customer number.

If you specify an “other field” that contains values that are different for a summarized group, only the value for the first record in the group is displayed, which is not meaningful.

For example, if you summarize a vendor table by state, and select city as an “other field”, only the first city listed for each state appears in the output. In this instance, the better approach is to summarize using both state and city as key fields, in that order.

Summarize results

The example below shows the results of summarizing an accounts receivable table on customer number and transaction type. The transaction amount is subtotaled, with some associated statistics. The results are output to screen.

The example uses a subset of customer numbers from the Ar table in ACL DATA\Sample Data Files\Sample Project.ACL.

Summarizing in detail

Summarizing performs the following operations:

Operation Location in Summarize results above
Groups the records based on identical values, or identical combinations of values, in one or more character, numeric, or datetime key fields Cust Number field

Trans Type field

Optionally subtotals the values of one or more numeric fields for each group Total Trans Amount field
Optionally performs statistical calculations on each subtotaled numeric field Average, Minimum, and Maximum Trans Amount fields

Note

Additional statistical calculations not shown

Optionally calculates the percentage of source table records belonging to each group Percent of Count field
Counts (subtotals) the number of records for each group Count field
Optionally displays additional character, numeric, or datetime fields with complementary information Name field

Provides totals for all numeric fields included in the output results

Note

The Totals row is only provided when you output the results to screen.

Totals row

Steps

You can summarize data by grouping the records in a table based on identical values, or identical combinations of values, in one or more character, numeric, or datetime fields.

You can optionally subtotal associated numeric fields. You can also perform statistical calculations on any subtotal field you specify. The results of the statistical calculations are broken down by group in the summarized output table.