Performing classical variables sampling

Concept Information

CVSSAMPLE command

You can create a new table that contains a representative sample of the monetary data in the active table.

Classical variables sampling is appropriate if you are interested in either:

  • the total audited value of a file
  • the total amount of monetary misstatement in a file

Because Analytics is a read-only application, after you have drawn the sample of records you need to export the sample table to an application that allows data entry, such as Excel, so that you can add audit values.

Follow the correct process for classical variables sampling

Before drawing a sample of records, you must stratify the table containing the records, and calculate a statistically valid sample size for each stratum.

For more information, see Preparing a classical variables sample.

Draw the sample of records

Note

Do not include the thousands separator, or the percentage sign, when you specify values. These characters prevent the command from running, or cause errors.

  1. Open the table containing the book values that you intend to sample.
  2. Select Sampling > Classical Variables Sampling (CVS) > Sample.

    Note

    The menu option is disabled if a table is not open.

    The CVS Sample dialog box opens. If you are using the output results of the CVS Prepare stage as input for the sampling stage, most of the fields are prefilled with the required values.

    If a number of the prefilled values are missing, you can:

    • rerun the CVSPREPARE command from the log to regenerate the values
    • use the CVSSAMPLE command generated during the CVS Prepare stage, if you saved it

      Note

      If you use the saved CVSSAMPLE command, you need to add the name of the output table to the command, and a seed value (optional).

      For more information, see CVSSAMPLE command.

  3. If you are not using prefilled values, or you want to adjust one or more values, enter or update the required values:
    • Book Value
    • Number of Strata
    • Top certainty stratum cutoff
    • Bottom certainty stratum cutoff
    • Strata Boundaries
    • Sample Size
    • Population (count, value)

    Note

    The input values are explained in detail below.

    Caution

    Normally you should not change any of the prefilled values. Changing prefilled values can negate the statistical validity of the sampling process.

  4. Optional. Select Seed, and enter a number.

    The Seed value is explained below.

  5. If you used a conditional expression during the CVS Prepare stage, make sure that the If text box contains an identical expression.

    Note

    An If condition specified during the CVS Prepare stage is automatically propagated to the CVS Sample stage.

    If you use a conditional expression, an identical expression must be used during both stages to ensure that the sampling results are statistically valid.

  6. In the To text box, specify the name of the Analytics table that will contain the output results.

    Tip

    Use the book value table name and add the suffix _sample.

  7. Click OK.

    The random sample of records is drawn from the book value table and saved to the output table you specified.

    A summary of the sample process and output results is displayed on screen.

    Included in the display is a prefilled version of the CVSEVALUATE command.

    Note

    The output results are explained in detail below.

Save the CVSEVALUATE command for later use (optional)

As a convenience, you can save the CVSEVALUATE command to use once you have performed the analysis of the sampled records.

  1. At the bottom of the CVS Sample display area, click the CVSEVALUATE link to load the command into the command line.
  2. Copy the entire command from the command line and save it in an Analytics script.

    After you have analyzed the sampled data, and added any audit values, you can use either the CVS Evaluate dialog box, or the CVSEVALUATE command, to project the results of the analysis to the entire population.

    Note

    If you use the CVSEVALUATE command, you need to update the name of the audit value field, and possibly the evaluation type.

    For more information, see CVSEVALUATE command.

CVS Sample dialog box inputs and results

The tables below provide detailed information about the input values in the CVS Sample dialog box, and the output results.

Main tab – input values

Input values – CVS Sample dialog box

Description
Book Value The field that contains the book values you are auditing.
Number of Strata

The number of strata (subgroups) to use for numerically stratifying the data set.

If you specify a certainty stratum, it is not included in the Number of Strata.

Top certainty stratum cutoff

A top certainty stratum cutoff value.

Amounts in the Book Value field greater than or equal to the cutoff value are automatically selected and included in the sample.

Bottom certainty stratum cutoff

A bottom certainty stratum cutoff value.

Amounts in the Book Value field less than or equal to the cutoff value are automatically selected and included in the sample.

Strata Boundaries

The boundary values to use for stratifying the data set.

Sample Size

The number of records to sample from each stratum.

Population (count, values)

The number of records in each stratum, and the total value for each stratum.

Seed

Optional. Can be any number.

This number is used to initialize the random number generator in Analytics.

Tip

Leave the seed blank if you want Analytics to randomly select a seed value.

Output results

Output results – CVS Sample Description
Seed

The seed value used to initialize the Analytics random number generator.

The seed value was either specified by you, or randomly selected by Analytics.

Book Value Field The book value field that you specified as an input.
Selection Method

RANDOM – the sample selection method used by Analytics.

Stratum Number

A sequentially incremented number assigned to each stratum.

The certainty strata are also assigned numbers, although they are not displayed on this screen:

  • -1 bottom certainty stratum
  • 0 top certainty stratum
Strata Boundaries

The upper boundaries of each stratum, and the bottom and top certainty stratum cutoff values.

Book values are assigned to a stratum if they are:

  • less than or equal to the upper boundary
  • greater than the boundary immediately below

Book values are assigned to the bottom certainty stratum if they are less than or equal to the cutoff value.

Book values are assigned to the top certainty stratum if they are greater than or equal to the cutoff value.

Population Items

The count of the records in the source table, broken down by stratum, and including the certainty strata.

Population Value The total book value of the source table, broken down by stratum, and including the certainty strata.
Sample Items

The total number of records drawn in the sample, broken down by stratum.

All records in the certainty strata are automatically drawn and included in the output table.

Sample Value The total value of the records drawn in the sample, broken down by stratum, and including the value of the certainty strata records.
Associated CVSEVALUATE Command

The command for performing the CVS Evaluate stage, prefilled with values from the CVS Prepare and CVS Sample stages.

System-generated fields

Analytics automatically generates four fields and adds them to the sample output table. For each record included in the sample, the fields contain the following descriptive information:

  • Stratum the number of the stratum to which the record is allocated
  • Original record number the original record number in the source data table
  • Selection order on a per-stratum basis, the order in which the record was randomly selected
  • Sample record number the record number in the sample output table

Add the Audit_Value field and export the sample table

Because Analytics is a read-only application, you need to export the table of sampled records to an application that allows data entry, such as Excel. In the external application you can add any audit values that you identify in the process of analyzing the sampled data.

Once the audit values have been added, you import the table back to Analytics.

Before you export the table, you need to add a field that replicates the book value field. You can then edit this replicated field as required in the external application.

Add the Audit_Value field

  1. Close the book value table.
  2. Open the newly created table containing the sampled records.

    For example: <book_value_table>_sample

  3. Copy and paste this command into the command line:
    DEFINE FIELD AUDIT_VALUE COMPUTED book_value_field

    If the command line is not visible, select Window > Command Line.

  4. Replace book_value_field with the actual name of the field in your table that contains the book values.

    For example:

    DEFINE FIELD AUDIT_VALUE COMPUTED invoice_amount

    Note

    Make sure to use the physical name of the field, not the display name, if the two names are different.

  5. Press Enter to create the new field.

    Tip

    If you want to confirm that the AUDIT_VALUE field was created, enter display in the command line and press Enter to see a list of the fields in the table.

Export the sample table to Excel

Export the sample table to Excel, or to a delimited file for use in another suitable external application.

When you perform the export:

  • select the Fields option, and select all fields to export
  • specify a name for the exported table that will be easy to identify later, such as the sample table name with the added suffix _audited

    For example: <book_value_table>_sample_audited

For more information, see Exporting data.