SAMPLE command

Draws a sample of records using either the record sampling or monetary unit sampling method.

Syntax

Note

The syntax does not include filtering (IF statements) or scope parameters because applying these options compromises the validity of a sample.

Fixed interval selection method

SAMPLE <ON> RECORD INTERVAL interval_value <FIXED initial_value> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> 

Cell selection method

SAMPLE <ON> RECORD CELL INTERVAL interval_value <RANDOM seed_value> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> <MERSENNE_TWISTER> 

Random selection method

SAMPLE <ON> RECORD NUMBER sample_size <RANDOM seed_value> <ORDER> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> <MERSENNE_TWISTER>

Parameters

Note

Do not include thousands separators when you specify values.

Name Description
ON RECORD Use record sampling.
INTERVAL interval_value FIXED initial_value | CELL INTERVAL interval_value | NUMBER sample_size

INTERVAL interval_value FIXED initial_value

Use the fixed interval selection method.

An initial record is selected and all subsequent selections are a fixed interval or distance apart – for example, every 20th record after the initial selection.

  • INTERVAL interval_value specify the interval value that was generated by calculating the sample size
  • FIXED initial_value specify the initial record number selected

    If you specify an initial_value of zero ('0'), or omit FIXED, Analytics randomly selects the initial record.

CELL INTERVAL interval_value

Use the cell selection method.

The data set is divided into multiple equal-sized cells or groups, and one record is randomly selected from each cell.

The interval_value dictates the size of each cell. Specify the interval value that was generated by calculating the sample size.

NUMBER sample_size

Use the random selection method.

All records are randomly selected from the entire data set.

Specify the sample size that was generated by calculating the sample size.

RANDOM seed_value

optional

Note

Cell and random selection methods only.

The seed value to use to initialize the random number generator in Analytics.

If you specify a value of zero ('0'), or omit RANDOM, Analytics randomly selects the seed value.

ORDER

optional

Note

Random selection method only.

You can only use ORDER when specify FIELDS.

Adds the ORDER field to the output results.

This field displays the order in which each record is randomly selected.

RECORD | FIELDS field_name <...n> | FIELDS ALL
  • RECORD the entire record is included in the output table

    Fields are included in the order that they appear in the source table layout.

  • FIELDS field_name individual fields, rather than the entire record, are included in the output table

    Specify the field(s) or expressions to include. If you specify multiple fields, they must be separated by spaces.

    Fields are included in the order that you list them.

  • FIELDS ALL all fields are included in the output table

    Fields are included in the order that they appear in the source table layout.

EXCLUDE field_name

optional

Only valid when sampling using FIELDS ALL.

The field or fields to exclude from the command. EXCLUDE allows you to fine-tune FIELDS ALL, by excluding the specified fields.

EXCLUDE must immediately follow FIELDS ALL. For example:

FIELDS ALL EXCLUDE field_1 field_2 
TO table_name

The location to send the results of the command to:

  • table_name saves the results to an Analytics table

    Specify table_name as a quoted string with a .FIL file extension. For example: TO "Output.FIL"

    By default, the table data file (.FIL) is saved to the folder containing the Analytics project.

    Use either an absolute or relative file path to save the data file to a different, existing folder:

    • TO "C:\Output.FIL"
    • TO "Results\Output.FIL"

    Note

    Table names are limited to 64 alphanumeric characters, not including the .FIL extension. The name can include the underscore character ( _ ), but no other special characters, or any spaces. The name cannot start with a number.

LOCAL

optional

Saves the output file in the same location as the Analytics project.

Note

Applicable only when running the command against a server table with an output file that is an Analytics table.

The LOCAL parameter must immediately follow the TO parameter.

OPEN

optional

Opens the table created by the command after the command executes. Only valid if the command creates an output table.

APPEND

optional

Appends the command output to the end of an existing file instead of overwriting it.

Note

You must ensure that the structure of the command output and the existing file are identical:

  • the same fields
  • the same field order
  • matching fields are the same length
  • matching fields are the same data type

Analytics appends output to an existing file regardless of its structure. If the structure of the output and the existing file do not match, jumbled, missing, or inaccurate data can result.

MERSENNE_TWISTER

optional

Note

Cell and random selection methods only.

The random number generator in Analytics uses the Mersenne-Twister algorithm.

If you omit MERSENNE_TWISTER, the default Analytics algorithm is used.

Note

You should only use the default Analytics algorithm if you require backward compatibility with Analytics scripts or sampling results created prior to Analytics version 12.

Examples

Draw a record sample

You are going to use record sampling to estimate the rate of deviation from a prescribed control in an account containing invoices.

After calculating a statistically valid sample size, you are ready to draw the sample. You are going to use the random selection method.

The example below:

  • Samples the open Analytics table
  • Uses the random selection method with a seed value of 123456
  • Specifies a sample size of 95 records
  • Includes only specified fields in the output table
  • Specifies that the random number generator in Analytics uses the Mersenne-Twister algorithm
SAMPLE ON RECORD RANDOM 123456 NUMBER 95 FIELDS RefNum CustNum Amount Date Type TO "Ar_record_sample" OPEN MERSENNE_TWISTER

Remarks

For more information about how this command works, see Performing record sampling.

Syntax

Note

The syntax does not include filtering (IF statements) or scope parameters because applying these options compromises the validity of a sample.

Fixed interval selection method

SAMPLE <ON> mus_numeric_field INTERVAL interval_value <FIXED initial_value> <CUTOFF top_stratum_cutoff_value> <SUBSAMPLE> <NOREPLACEMENT> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> 

Cell selection method

SAMPLE <ON> mus_numeric_field CELL INTERVAL interval_value <CUTOFF top_stratum_cutoff_value> <RANDOM seed_value> <SUBSAMPLE> <NOREPLACEMENT> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> <MERSENNE_TWISTER> 

Random selection method

SAMPLE <ON> mus_numeric_field NUMBER sample_size POPULATION absolute_value <RANDOM seed_value> <SUBSAMPLE> <NOREPLACEMENT> <ORDER> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> <MERSENNE_TWISTER>

Parameters

Note

Do not include thousands separators when you specify values.

Name Description
ON mus_numeric_field

Use monetary unit sampling (MUS).

mus_numeric_field is the numeric field or expression to use as the basis for the sampling.

INTERVAL interval_value FIXED initial_value | CELL INTERVAL interval_value | NUMBER sample_size POPULATION absolute_value

INTERVAL interval_value FIXED initial_value

Use the fixed interval selection method.

An initial monetary unit is selected and all subsequent selections are a fixed interval or distance apart – for example, every 5000th monetary unit after the initial selection.

  • INTERVAL interval_value specify the interval value that was generated by calculating the sample size
  • FIXED initial_value specify the initial monetary unit selected

    If you specify an initial_value of zero ('0'), or omit FIXED, Analytics randomly selects the initial monetary unit.

CELL INTERVAL interval_value

Use the cell selection method.

The data set is divided into multiple equal-sized cells or groups, and one monetary unit is randomly selected from each cell.

The interval_value dictates the size of each cell. Specify the interval value that was generated by calculating the sample size.

NUMBER sample_size POPULATION absolute_value

Use the random selection method.

All monetary units are randomly selected from the entire data set.

  • NUMBER sample_size specify the sample size that was generated by calculating the sample size
  • POPULATION absolute_value specify the total absolute value of mus_numeric_field, which is the population from which the sample will be selected
CUTOFF top_stratum_cutoff_value

optional

Note

Fixed interval and cell selection methods only.

A top stratum cutoff value.

Amounts in the mus_numeric_field greater than or equal to the cutoff value are automatically selected and included in the sample.

If you omit CUTOFF, a default cutoff value equal to the interval_value is used.

RANDOM seed_value

optional

Note

Cell and random selection methods only.

The seed value to use to initialize the random number generator in Analytics.

If you specify a value of zero ('0'), or omit RANDOM, Analytics randomly selects the seed value.

SUBSAMPLE

optional

Note

You can only use SUBSAMPLE when specify FIELDS.

Adds the SUBSAMPLE field to the output results.

If each amount in a sample field represents a total of several separate transactions, and you want to perform audit procedures on only one transaction from each sampled total amount, you can use the values in the SUBSAMPLE field to randomly select the individual transactions.

For more information, see Performing monetary unit sampling.

NOREPLACEMENT

optional

The same record is not selected more than once. As a result, the sample may contain fewer records than calculated by the SIZE command.

If NOREPLACEMENT is omitted, or if you specify REPLACEMENT, records can be selected more than once.

ORDER

optional

Note

Random selection method only.

You can only use ORDER when specify FIELDS.

Adds the ORDER field to the output results.

This field displays the order in which each record is randomly selected.

RECORD | FIELDS field_name <...n> | FIELDS ALL
  • RECORD the entire record is included in the output table

    Fields are included in the order that they appear in the source table layout.

  • FIELDS field_name individual fields, rather than the entire record, are included in the output table

    Specify the field(s) or expressions to include. If you specify multiple fields, they must be separated by spaces.

    Fields are included in the order that you list them.

  • FIELDS ALL all fields are included in the output table

    Fields are included in the order that they appear in the source table layout.

EXCLUDE field_name

optional

Only valid when sampling using FIELDS ALL.

The field or fields to exclude from the command. EXCLUDE allows you to fine-tune FIELDS ALL, by excluding the specified fields.

EXCLUDE must immediately follow FIELDS ALL. For example:

FIELDS ALL EXCLUDE field_1 field_2 
TO table_name

The location to send the results of the command to:

  • table_name saves the results to an Analytics table

    Specify table_name as a quoted string with a .FIL file extension. For example: TO "Output.FIL"

    By default, the table data file (.FIL) is saved to the folder containing the Analytics project.

    Use either an absolute or relative file path to save the data file to a different, existing folder:

    • TO "C:\Output.FIL"
    • TO "Results\Output.FIL"

    Note

    Table names are limited to 64 alphanumeric characters, not including the .FIL extension. The name can include the underscore character ( _ ), but no other special characters, or any spaces. The name cannot start with a number.

LOCAL

optional

Saves the output file in the same location as the Analytics project.

Note

Applicable only when running the command against a server table with an output file that is an Analytics table.

The LOCAL parameter must immediately follow the TO parameter.

OPEN

optional

Opens the table created by the command after the command executes. Only valid if the command creates an output table.

APPEND

optional

Appends the command output to the end of an existing file instead of overwriting it.

Note

You must ensure that the structure of the command output and the existing file are identical:

  • the same fields
  • the same field order
  • matching fields are the same length
  • matching fields are the same data type

Analytics appends output to an existing file regardless of its structure. If the structure of the output and the existing file do not match, jumbled, missing, or inaccurate data can result.

MERSENNE_TWISTER

optional

Note

Cell and random selection methods only.

The random number generator in Analytics uses the Mersenne-Twister algorithm.

If you omit MERSENNE_TWISTER, the default Analytics algorithm is used.

Note

You should only use the default Analytics algorithm if you require backward compatibility with Analytics scripts or sampling results created prior to Analytics version 12.

Examples

Draw a monetary unit sample

You are going to use monetary unit sampling to estimate the total amount of monetary misstatement in an account containing invoices.

After calculating a statistically valid sample size, you are ready to draw the sample. You are going to use the fixed interval selection method.

The example below:

  • Samples the open Analytics table based on a transaction amount field
  • Uses the fixed interval selection method with an interval value of $6,283.33
  • Specifies that the first record selected contains the 100,000th monetary unit (the number of cents in $1,000)
  • Uses a top stratum cutoff of $5,000
  • Includes the entire record in the output table
SAMPLE ON Amount INTERVAL 6283.33 FIXED 1000.00 CUTOFF 5000.00 RECORD TO "Ar_monetary_unit_sample" OPEN

Remarks

For more information about how this command works, see Performing monetary unit sampling.