CVSSAMPLE command

Draws a sample of records using the classical variables sampling method.

Syntax

CVSSAMPLE ON book_value_field NUMSTRATA number <SEED seed_value> CUTOFF value <BCUTOFF value> STRATA boundary_value <,...n> SAMPLESIZE number <,...n> POPULATION stratum_count,stratum_value <,...n> <IF test> TO table_name

Parameters

Note

If you are using the output results of the CVSPREPARE command as input for the CVSSAMPLE command, a number of the parameter values are already specified and stored in variables. For more information, see CVSPREPARE command.

Do not include thousands separators or percentage signs when you specify values.

Name Description
ON book_value_field

The numeric book value field to use as the basis for the sample.

NUMSTRATA number

The number of strata to use for stratifying the book_value_field.

SEED seed_value

optional

The seed value to use to initialize the random number generator in Analytics.

If you omit SEED, Analytics randomly selects the seed value.

CUTOFF value

A top certainty stratum cutoff value.

Amounts in the book_value_field greater than or equal to the cutoff value are automatically selected and included in the sample.

BCUTOFF value

optional

A bottom certainty stratum cutoff value.

Amounts in the book_value_field less than or equal to the cutoff value are automatically selected and included in the sample.

STRATA boundary_value <,...n>

The upper boundary values to use for stratifying the book_value_field.

SAMPLESIZE number <,...n>

The number of records to sample from each stratum.

POPULATION stratum_count,stratum_value <,...n>

The number of records in each stratum, and the total value for each stratum.

IF test

optional

A conditional expression that must be true in order to process each record. The command is executed on only those records that satisfy the condition.

Caution

If you specify a conditional expression, an identical conditional expression must be used during both the calculation of the sample size, and the drawing of the sample.

If you use a condition at one stage and not the other, or if the two conditions are not identical, the sampling results will probably not be statistically valid.

TO table_name

The location to send the results of the command to:

  • table_name saves the results to an Analytics table

    Specify table_name as a quoted string with a .FIL file extension. For example: TO "Output.FIL"

    By default, the table data file (.FIL) is saved to the folder containing the Analytics project.

    Use either an absolute or relative file path to save the data file to a different, existing folder:

    • TO "C:\Output.FIL"
    • TO "Results\Output.FIL"

    Note

    Table names are limited to 64 alphanumeric characters, not including the .FIL extension. The name can include the underscore character ( _ ), but no other special characters and no spaces. The name cannot start with a number.
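
To illustrate how the CUTOFF, BCUTOFF, and STRATA values described above partition the book values, here is a minimal Python sketch. It is not Analytics code, and the function name assign_stratum is hypothetical. It assumes that each STRATA value is an exclusive upper boundary, which is inferred from the CUTOFF rule (amounts greater than or equal to the cutoff go to the top certainty stratum) and is not stated in this topic.

```python
from bisect import bisect_right


def assign_stratum(amount, boundaries, cutoff, bcutoff=None):
    """Return "TOP", "BOTTOM", or a 1-based stratum number for one book value.

    Assumption (not confirmed by the documentation): every STRATA value is an
    exclusive upper boundary, so an amount equal to a boundary falls into the
    next stratum.
    """
    if amount >= cutoff:
        return "TOP"  # top certainty stratum: always selected
    if bcutoff is not None and amount <= bcutoff:
        return "BOTTOM"  # bottom certainty stratum: always selected
    # Count the boundaries that are less than or equal to the amount.
    index = bisect_right(boundaries, amount)
    if index >= len(boundaries):
        raise ValueError("amount lies above the last boundary but below the cutoff")
    return index + 1


boundaries = [4376.88, 9248.74, 16904.52, 23864.32, 35000.00]
for value in (100.00, 9248.74, 34999.99, 35000.00):
    print(value, assign_stratum(value, boundaries, cutoff=35000.00))
```

Records labeled "TOP" or "BOTTOM" are included in the sample automatically. Only the numbered strata are sampled randomly.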

Analytics output variables

Name Contains
S_TOPEV

The top certainty stratum cutoff value specified by the user or, if none was specified, the upper boundary of the top stratum previously calculated by the CVSPREPARE command.

Also stores the number of records in the top certainty stratum, and their total monetary value.

SBOTTOMEV

The bottom certainty stratum cutoff value specified by the user or, if none was specified, the lower boundary of the bottom stratum previously calculated by the CVSPREPARE command.

Also stores the number of records in the bottom certainty stratum, and their total monetary value.

SBOUNDARYEV

All strata upper boundaries prefilled by the command, or specified by the user. Does not include top or bottom certainty strata.

SPOPULATION

The number of records in each stratum, and the total monetary value for each stratum. Does not include top or bottom certainty strata.

Examples

Draw a classical variables sample

You are going to use classical variables sampling to estimate the total amount of monetary misstatement in an account containing invoices.

After stratifying the population, and calculating a statistically valid sample size for each stratum, you are ready to draw the sample.

The example below draws a stratified sample of records based on the invoice_amount field, and outputs the sampled records to the Invoices_sample table:

CVSSAMPLE ON invoice_amount NUMSTRATA 5 SEED 12345 CUTOFF 35000.00 STRATA 4376.88,9248.74,16904.52,23864.32,35000.00 SAMPLESIZE 37,36,49,36,39 POPULATION 1279,3382131.93,898,5693215.11,763,9987014.57,627,12657163.59,479,13346354.63 TO "Invoices_sample"
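
Before running a command like the one above, it can help to check that the parameter lists are consistent. The following Python sketch is not part of Analytics, and the function name check_cvssample_parameters is hypothetical. The checks mirror the structure of the example (one boundary, one sample size, and one count/value pair per stratum) and are assumptions rather than documented validation rules.

```python
def check_cvssample_parameters(num_strata, strata, sample_size, population):
    """Check list lengths and return totals for a set of CVSSAMPLE values."""
    if len(population) % 2 != 0:
        raise ValueError("POPULATION must contain count,value pairs")
    # Split the flat POPULATION list into (stratum_count, stratum_value) pairs.
    pairs = list(zip(population[0::2], population[1::2]))
    if not (len(strata) == len(sample_size) == len(pairs) == num_strata):
        raise ValueError("each list needs one entry per stratum")
    if any(upper <= lower for lower, upper in zip(strata, strata[1:])):
        raise ValueError("STRATA boundaries must be strictly increasing")
    for (count, _), size in zip(pairs, sample_size):
        if size > count:
            raise ValueError("sample size exceeds the stratum record count")
    return {
        "total_sampled": sum(sample_size),
        "total_records": sum(count for count, _ in pairs),
        "total_value": round(sum(value for _, value in pairs), 2),
    }


totals = check_cvssample_parameters(
    num_strata=5,
    strata=[4376.88, 9248.74, 16904.52, 23864.32, 35000.00],
    sample_size=[37, 36, 49, 36, 39],
    population=[1279, 3382131.93, 898, 5693215.11, 763, 9987014.57,
                627, 12657163.59, 479, 13346354.63],
)
print(totals)
```

For the example values, the five strata contribute 197 sampled records out of 4046. Records in the top certainty stratum are selected in addition to these.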

Remarks

For more information about how this command works, see Performing classical variables sampling.

System-generated fields

Analytics automatically generates four fields and adds them to the sample output table. For each record included in the sample, the fields contain the following descriptive information:

  • STRATUM: the number of the stratum to which the record is allocated
  • ORIGIN_RECORD_NUMBER: the original record number in the source data table
  • SELECTION_ORDER: on a per-stratum basis, the order in which the record was randomly selected
  • SAMPLE_RECORD_NUMBER: the record number in the sample output table
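
The four fields listed above can be illustrated with a small seeded, stratified draw in Python. This is a sketch under stated assumptions, not the algorithm Analytics uses. The function name draw_stratified_sample is hypothetical, strata boundaries are treated as exclusive upper bounds, output rows are ordered by stratum, and certainty-stratum records are left out for brevity.

```python
import random
from bisect import bisect_right


def draw_stratified_sample(amounts, boundaries, sample_sizes, cutoff, seed):
    """Return one dict per sampled record with the four descriptive fields."""
    if boundaries[-1] < cutoff:
        raise ValueError("sketch assumes the last boundary reaches the cutoff")
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    # Group the 1-based source record numbers by stratum.
    members = {s: [] for s in range(1, len(boundaries) + 1)}
    for record_number, amount in enumerate(amounts, start=1):
        if amount >= cutoff:
            continue  # top certainty records are omitted from this sketch
        members[bisect_right(boundaries, amount) + 1].append(record_number)
    rows = []
    for stratum, size in zip(sorted(members), sample_sizes):
        picked = rng.sample(members[stratum], size)  # draw without replacement
        for order, record_number in enumerate(picked, start=1):
            rows.append({
                "STRATUM": stratum,
                "ORIGIN_RECORD_NUMBER": record_number,
                "SELECTION_ORDER": order,  # restarts at 1 in each stratum
                "SAMPLE_RECORD_NUMBER": len(rows) + 1,  # position in output
            })
    return rows


amounts = [120.0, 80.5, 300.0, 950.0, 40.0, 610.0, 720.0, 1500.0]
for row in draw_stratified_sample(amounts, [200.0, 1000.0], [2, 2], 1000.0, 12345):
    print(row)
```

Running the sketch twice with the same seed returns the same rows, which is the reason for specifying SEED when a sample needs to be reproducible.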