SAMPLE command
Concept Information
Draws a sample of records using either the record sampling or monetary unit sampling method.
Syntax
Note
The syntax does not include filtering (IF statements) or scope parameters because applying these options compromises the validity of a sample.
Fixed interval selection method
SAMPLE <ON> RECORD INTERVAL interval_value <FIXED initial_value> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND>
Cell selection method
SAMPLE <ON> RECORD CELL INTERVAL interval_value <RANDOM seed_value> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> <MERSENNE_TWISTER>
Random selection method
SAMPLE <ON> RECORD NUMBER sample_size <RANDOM seed_value> <ORDER> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> <MERSENNE_TWISTER>
Parameters
Note
Do not include thousands separators when you specify values.
Name | Description |
---|---|
ON RECORD | Use record sampling. |
INTERVAL interval_value FIXED initial_value | CELL INTERVAL interval_value | NUMBER sample_size |
INTERVAL interval_value FIXED initial_value Use the fixed interval selection method. An initial record is selected and all subsequent selections are a fixed interval or distance apart – for example, every 20th record after the initial selection.
CELL INTERVAL interval_value Use the cell selection method. The data set is divided into multiple equal-sized cells or groups, and one record is randomly selected from each cell. The interval_value dictates the size of each cell. Specify the interval value that was generated by calculating the sample size. NUMBER sample_size Use the random selection method. All records are randomly selected from the entire data set. Specify the sample size that was generated by calculating the sample size. |
RANDOM seed_value optional |
Note Cell and random selection methods only. The seed value to use to initialize the random number generator in Analytics. If you specify a value of zero ('0'), or omit RANDOM, Analytics randomly selects the seed value. |
ORDER optional |
Note Random selection method only. You can only use ORDER when specify FIELDS. Adds the ORDER field to the output results. This field displays the order in which each record is randomly selected. |
RECORD | FIELDS field_name <...n> | FIELDS ALL |
|
EXCLUDE field_name optional |
Only valid when sampling using FIELDS ALL. The field or fields to exclude from the command. EXCLUDE allows you to fine-tune FIELDS ALL, by excluding the specified fields. EXCLUDE must immediately follow FIELDS ALL. For example: FIELDS ALL EXCLUDE field_1 field_2 |
TO table_name |
The location to send the results of the command to:
|
LOCAL optional |
Saves the output file in the same location as the Analytics project. Note Applicable only when running the command against a server table with an output file that is an Analytics table. The LOCAL parameter must immediately follow the TO parameter. |
OPEN optional |
Opens the table created by the command after the command executes. Only valid if the command creates an output table. |
APPEND optional |
Appends the command output to the end of an existing file instead of overwriting it. Note You must ensure that the structure of the command output and the existing file are identical:
Analytics appends output to an existing file regardless of its structure. If the structure of the output and the existing file do not match, jumbled, missing, or inaccurate data can result. |
MERSENNE_TWISTER optional |
Note Cell and random selection methods only. The random number generator in Analytics uses the Mersenne-Twister algorithm. If you omit MERSENNE_TWISTER, the default Analytics algorithm is used. Note You should only use the default Analytics algorithm if you require backward compatibility with Analytics scripts or sampling results created prior to Analytics version 12. |
Examples
Draw a record sample
You are going to use record sampling to estimate the rate of deviation from a prescribed control in an account containing invoices.
After calculating a statistically valid sample size, you are ready to draw the sample. You are going to use the random selection method.
The example below:
- Samples the open Analytics table
- Uses the random selection method with a seed value of 123456
- Specifies a sample size of 95 records
- Includes only specified fields in the output table
- Specifies that the random number generator in Analytics uses the Mersenne-Twister algorithm
SAMPLE ON RECORD RANDOM 123456 NUMBER 95 FIELDS RefNum CustNum Amount Date Type TO "Ar_record_sample" OPEN MERSENNE_TWISTER
Remarks
For more information about how this command works, see Performing record sampling.
Syntax
Note
The syntax does not include filtering (IF statements) or scope parameters because applying these options compromises the validity of a sample.
Fixed interval selection method
SAMPLE <ON> mus_numeric_field INTERVAL interval_value <FIXED initial_value> <CUTOFF top_stratum_cutoff_value> <SUBSAMPLE> <NOREPLACEMENT> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND>
Cell selection method
SAMPLE <ON> mus_numeric_field CELL INTERVAL interval_value <CUTOFF top_stratum_cutoff_value> <RANDOM seed_value> <SUBSAMPLE> <NOREPLACEMENT> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> <MERSENNE_TWISTER>
Random selection method
SAMPLE <ON> mus_numeric_field NUMBER sample_size POPULATION absolute_value <RANDOM seed_value> <SUBSAMPLE> <NOREPLACEMENT> <ORDER> {RECORD|FIELDS field_name <...n>|FIELDS ALL <EXCLUDE field_name <...n>>} TO table_name <LOCAL> <OPEN> <APPEND> <MERSENNE_TWISTER>
Parameters
Note
Do not include thousands separators when you specify values.
Name | Description |
---|---|
ON mus_numeric_field |
Use monetary unit sampling (MUS). mus_numeric_field is the numeric field or expression to use as the basis for the sampling. |
INTERVAL interval_value FIXED initial_value | CELL INTERVAL interval_value | NUMBER sample_size POPULATION absolute_value |
INTERVAL interval_value FIXED initial_value Use the fixed interval selection method. An initial monetary unit is selected and all subsequent selections are a fixed interval or distance apart – for example, every 5000th monetary unit after the initial selection.
CELL INTERVAL interval_value Use the cell selection method. The data set is divided into multiple equal-sized cells or groups, and one monetary unit is randomly selected from each cell. The interval_value dictates the size of each cell. Specify the interval value that was generated by calculating the sample size. NUMBER sample_size POPULATION absolute_value Use the random selection method. All monetary units are randomly selected from the entire data set.
|
CUTOFF top_stratum_cutoff_value optional |
Note Fixed interval and cell selection methods only. A top stratum cutoff value. Amounts in the mus_numeric_field greater than or equal to the cutoff value are automatically selected and included in the sample. If you omit CUTOFF, a default cutoff value equal to the interval_value is used. |
RANDOM seed_value optional |
Note Cell and random selection methods only. The seed value to use to initialize the random number generator in Analytics. If you specify a value of zero ('0'), or omit RANDOM, Analytics randomly selects the seed value. |
SUBSAMPLE optional |
Note You can only use SUBSAMPLE when specify FIELDS. Adds the SUBSAMPLE field to the output results. If each amount in a sample field represents a total of several separate transactions, and you want to perform audit procedures on only one transaction from each sampled total amount, you can use the values in the SUBSAMPLE field to randomly select the individual transactions. For more information, see Performing monetary unit sampling. |
NOREPLACEMENT optional |
The same record is not selected more than once. As a result, the sample may contain fewer records than calculated by the SIZE command. If NOREPLACEMENT is omitted, or if you specify REPLACEMENT, records can be selected more than once. |
ORDER optional |
Note Random selection method only. You can only use ORDER when specify FIELDS. Adds the ORDER field to the output results. This field displays the order in which each record is randomly selected. |
RECORD | FIELDS field_name <...n> | FIELDS ALL |
|
EXCLUDE field_name optional |
Only valid when sampling using FIELDS ALL. The field or fields to exclude from the command. EXCLUDE allows you to fine-tune FIELDS ALL, by excluding the specified fields. EXCLUDE must immediately follow FIELDS ALL. For example: FIELDS ALL EXCLUDE field_1 field_2 |
TO table_name |
The location to send the results of the command to:
|
LOCAL optional |
Saves the output file in the same location as the Analytics project. Note Applicable only when running the command against a server table with an output file that is an Analytics table. The LOCAL parameter must immediately follow the TO parameter. |
OPEN optional |
Opens the table created by the command after the command executes. Only valid if the command creates an output table. |
APPEND optional |
Appends the command output to the end of an existing file instead of overwriting it. Note You must ensure that the structure of the command output and the existing file are identical:
Analytics appends output to an existing file regardless of its structure. If the structure of the output and the existing file do not match, jumbled, missing, or inaccurate data can result. |
MERSENNE_TWISTER optional |
Note Cell and random selection methods only. The random number generator in Analytics uses the Mersenne-Twister algorithm. If you omit MERSENNE_TWISTER, the default Analytics algorithm is used. Note You should only use the default Analytics algorithm if you require backward compatibility with Analytics scripts or sampling results created prior to Analytics version 12. |
Examples
Draw a monetary unit sample
You are going to use monetary unit sampling to estimate the total amount of monetary misstatement in an account containing invoices.
After calculating a statistically valid sample size, you are ready to draw the sample. You are going to use the fixed interval selection method.
The example below:
- Samples the open Analytics table based on a transaction amount field
- Uses the fixed interval selection method with an interval value of $6,283.33
- Specifies that the first record selected contains the 100,000th monetary unit (the number of cents in $1,000)
- Uses a top stratum cutoff of $5,000
- Includes the entire record in the output table
SAMPLE ON Amount INTERVAL 6283.33 FIXED 1000.00 CUTOFF 5000.00 RECORD TO "Ar_monetary_unit_sample" OPEN
Remarks
For more information about how this command works, see Performing monetary unit sampling.