Performing classical variables sampling
Concept Information
You can create a new table that contains a representative sample of the monetary data in the active table.
Classical variables sampling is appropriate if you are interested in either:
- the total audited value of a file
- the total amount of monetary misstatement in a file
Because Analytics is a read-only application, after you have drawn the sample of records you need to export the sample table to an application that allows data entry, such as Excel, so that you can add audit values.
Follow the correct process for classical variables sampling
Before drawing a sample of records, you must stratify the table containing the records, and calculate a statistically valid sample size for each stratum.
For more information, see Preparing a classical variables sample.
Draw the sample of records
Note
Do not include the thousands separator, or the percentage sign, when you specify values. These characters prevent the command from running, or cause errors.
- Open the table containing the book values that you intend to sample.
- Select Sampling > Classical Variables Sampling (CVS) > Sample.
Note
The menu option is disabled if a table is not open.
The CVS Sample dialog box opens. If you are using the output results of the CVS Prepare stage as input for the sampling stage, most of the fields are prefilled with the required values.
If a number of the prefilled values are missing, you can:
- rerun the CVSPREPARE command from the log to regenerate the values
- use the CVSSAMPLE command generated during the CVS Prepare stage, if you saved it
Note
If you use the saved CVSSAMPLE command, you need to add the name of the output table to the command, and a seed value (optional).
For more information, see CVSSAMPLE command.
- If you are not using prefilled values, or you want to adjust one or more values, enter or update the required values:
- Book Value
- Number of Strata
- Top certainty stratum cutoff
- Bottom certainty stratum cutoff
- Strata Boundaries
- Sample Size
- Population (count, value)
Note
The input values are explained in detail below.
Caution
Normally you should not change any of the prefilled values. Changing prefilled values can negate the statistical validity of the sampling process.
- Optional. Select Seed, and enter a number.
The Seed value is explained below.
- If you used a conditional expression during the CVS Prepare stage, make sure that the If text box contains an identical expression.
Note
An If condition specified during the CVS Prepare stage is automatically propagated to the CVS Sample stage.
If you use a conditional expression, an identical expression must be used during both stages to ensure that the sampling results are statistically valid.
- In the To text box, specify the name of the Analytics table that will contain the output results.
Tip
Use the book value table name and add the suffix _sample.
Note
Analytics table names are limited to 64 alphanumeric characters, not including the .FIL extension. The name can include the underscore character ( _ ), but no other special characters, or any spaces. The name cannot start with a number.
- Click OK.
The random sample of records is drawn from the book value table and saved to the output table you specified.
A summary of the sample process and output results is displayed on screen.
Included in the display is a prefilled version of the CVSEVALUATE command.
Note
The output results are explained in detail below.
Save the CVSEVALUATE command for later use (optional)
As a convenience, you can save the CVSEVALUATE command to use once you have performed the analysis of the sampled records.
- At the bottom of the CVS Sample display area, click the CVSEVALUATE link to load the command into the command line.
- Copy the entire command from the command line and save it in an Analytics script.
After you have analyzed the sampled data, and added any audit values, you can use either the CVS Evaluate dialog box, or the CVSEVALUATE command, to project the results of the analysis to the entire population.
Note
If you use the CVSEVALUATE command, you need to update the name of the audit value field, and possibly the evaluation type.
For more information, see CVSEVALUATE command.
CVS Sample dialog box inputs and results
The tables below provide detailed information about the input values in the CVS Sample dialog box, and the output results.
Main tab – input values
Input values – CVS Sample dialog box |
Description |
---|---|
Book Value | The field that contains the book values you are auditing. |
Number of Strata |
The number of strata (subgroups) to use for numerically stratifying the data set. If you specify a certainty stratum, it is not included in the Number of Strata. |
Top certainty stratum cutoff |
A top certainty stratum cutoff value. Amounts in the Book Value field greater than or equal to the cutoff value are automatically selected and included in the sample. |
Bottom certainty stratum cutoff |
A bottom certainty stratum cutoff value. Amounts in the Book Value field less than or equal to the cutoff value are automatically selected and included in the sample. |
Strata Boundaries |
The boundary values to use for stratifying the data set. |
Sample Size |
The number of records to sample from each stratum. |
Population (count, values) |
The number of records in each stratum, and the total value for each stratum. |
Seed |
Optional. Can be any number. This number is used to initialize the random number generator in Analytics. Tip Leave the seed blank if you want Analytics to randomly select a seed value. |
Output results
Output results – CVS Sample | Description |
---|---|
Seed |
The seed value used to initialize the Analytics random number generator. The seed value was either specified by you, or randomly selected by Analytics. |
Book Value Field | The book value field that you specified as an input. |
Selection Method |
RANDOM – the sample selection method used by Analytics. |
Stratum Number |
A sequentially incremented number assigned to each stratum. The certainty strata are also assigned numbers, although they are not displayed on this screen:
|
Strata Boundaries |
The upper boundaries of each stratum, and the bottom and top certainty stratum cutoff values. Book values are assigned to a stratum if they are:
Book values are assigned to the bottom certainty stratum if they are less than or equal to the cutoff value. Book values are assigned to the top certainty stratum if they are greater than or equal to the cutoff value. |
Population Items |
The count of the records in the source table, broken down by stratum, and including the certainty strata. |
Population Value | The total book value of the source table, broken down by stratum, and including the certainty strata. |
Sample Items |
The total number of records drawn in the sample, broken down by stratum. All records in the certainty strata are automatically drawn and included in the output table. |
Sample Value | The total value of the records drawn in the sample, broken down by stratum, and including the value of the certainty strata records. |
Associated CVSEVALUATE Command |
The command for performing the CVS Evaluate stage, prefilled with values from the CVS Prepare and CVS Sample stages. |
System-generated fields
Analytics automatically generates four fields and adds them to the sample output table. For each record included in the sample, the fields contain the following descriptive information:
- Stratum the number of the stratum to which the record is allocated
- Original record number the original record number in the source data table
- Selection order on a per-stratum basis, the order in which the record was randomly selected
- Sample record number the record number in the sample output table
Add the Audit_Value field and export the sample table
Because Analytics is a read-only application, you need to export the table of sampled records to an application that allows data entry, such as Excel. In the external application you can add any audit values that you identify in the process of analyzing the sampled data.
Once the audit values have been added, you import the table back to Analytics.
Before you export the table, you need to add a field that replicates the book value field. You can then edit this replicated field as required in the external application.
Add the Audit_Value field
- Close the book value table.
- Open the newly created table containing the sampled records.
For example: <book_value_table>_sample
- Copy and paste this command into the command line:
DEFINE FIELD AUDIT_VALUE COMPUTED book_value_field
If the command line is not visible, select Window > Command Line.
- Replace book_value_field with the actual name of the field in your table that contains the book values.
For example:
DEFINE FIELD AUDIT_VALUE COMPUTED invoice_amount
Note
Make sure to use the physical name of the field, not the display name, if the two names are different.
- Press Enter to create the new field.
Tip
If you want to confirm that the AUDIT_VALUE field was created, enter display in the command line and press Enter to see a list of the fields in the table.
Export the sample table to Excel
Export the sample table to Excel, or to a delimited file for use in another suitable external application.
When you perform the export:
- select the Fields option, and select all fields to export
-
specify a name for the exported table that will be easy to identify later, such as the sample table name with the added suffix _audited
For example: <book_value_table>_sample_audited
For more information, see Exporting data.