Record sampling tutorial

This tutorial introduces you to the end-to-end process of record sampling in Analytics.

Estimated time 20 minutes

Summary You will draw a sample of records from a vouchers table, and identify control deviations in the sample. Based on the sample results, you will make a statistical estimate of the deviation rate for the entire table.

You then use the statistical estimate to judge whether the voucher control process is operating effectively.

Main tasks To perform record sampling correctly, you need to do four main tasks:

The tutorial leaves out optional aspects of record sampling, and focuses on a single path, so you can quickly get a basic understanding of how record sampling in Analytics works.

Tip

For simple definitions of sampling terms, see A word about terminology.

Record sampling scenario

Testing the voucher control process

The scenario

You are examining a vouchers table with over 5000 records. You want to pull hard copies of a sample of the vouchers to confirm that they match the system entries, and that the voucher control process is working effectively.

You will check the hard copies to confirm:

  • vouchers have been approved
  • the same person did not create and approve a voucher

How do you proceed?

How many hard copies should you pull? How do you decide which ones to pull? How do any control deviations you find in the sample relate to the entire population of vouchers?

You can use Analytics record sampling to get answers to these questions.

Analytics table used in the scenario

This scenario uses the Vouchers table in the ACL_Rockwood.acl sample data file included with Analytics.

Note

Do not include the thousands separator, or the percentage sign, when you specify values. These characters prevent the command from running, or cause errors.

1   Calculate a valid sample size

Note

In a production environment, the values you specify to calculate a valid sample size are dependent on your professional judgment.

  1. In ACL_Rockwood.acl, open the Vouchers table, located in the Acquisitions_Payment folder.
  2. Select Sampling > Record/Monetary Unit Sampling > Calculate Size.
  3. Select Record.
  4. Specify the input values exactly as they appear in the screen below and click Calculate to calculate the sample size.

    After reviewing the results, you can click OK to finalize the sample size calculation, or you can experiment with specifying different values (see below).

What the input values mean

Confidence

You want a 95% degree of confidence that the sample you are going to draw is representative of the entire population.

Put another way: if you drew the sample 100 times, it would be representative 95 times, and unrepresentative only 5 times.

Population

The total number of records in the Vouchers table.

Upper Error Limit (%)

A maximum of 2% of the vouchers can lack proper approval and you still consider the control effective.

Expected Error Rate (%) You expect that 1% of the vouchers lack proper approval.

What the results mean

Sample Size You should pull 593 hard copy vouchers.
Interval

If you use one of the interval methods of sample selection, the records selected are either:

  • every 9th record
  • a randomly selected record from each block of 9 records

Note

8.93 is rounded up to 9.

Number of Tolerable Errors

Note

The record sampling tutorial does not use this number, which provides an alternative method for evaluating control deviations.

What the number means:

  • As you examine hard copy vouchers in the sample, if more than 6 vouchers violate the control you can consider the control ineffective.

Learn more: experiment by specifying different values

Change a value in any of the following fields in the Size dialog box, click Calculate, and notice how the results change. Change only one value at a time so it is easier to see how the change affects the results.

  • Confidence
  • Upper Error Limit (%)
  • Expected Error Rate (%)

More stringent requirements increase the sample size. More lenient requirements decrease the sample size.

Reset the values to match the screen above and click OK. Pin the Size tab with the results of the sample size calculation.

2   Draw the sample of records

  1. Return to the Vouchers table.
  2. Select Sampling > Record/Monetary Unit Sampling > Sample.
  3. Select Record.
  4. Specify the input values exactly as they appear in the screen below and click OK to draw the sample of records.

What the input values mean

Cell

You are using the cell selection method for drawing the sample of records.

With the cell selection method, each selected record is randomly chosen from within identically sized cells or blocks of records.

Interval

The size of each cell is 9 records. 8.93 is rounded up to 9.

Seed

A seed value of 456654 is used to initialize the random number generator in Analytics. You can specify any seed value you want.

The random number generator specifies which record number is selected from each cell.

Algorithm The random number generator uses the Mersenne-Twister algorithm to generate random numbers.
To

The sample of records drawn from the Vouchers table is output to a new table called Vouchers_sample.

Note

Because Analytics rounds up the Interval to 9, the actual number of records drawn is slightly less than the calculated sample size of 593.

3   Perform your analysis on the sample

For the purposes of the tutorial, assume that you do the following:

  1. Pull the hard copies for the voucher numbers that appear in the Vouchers_sample table.
  2. Examine each voucher and record any vouchers that lack proper approval.

4   Project the analysis results to the entire population

  1. Select Sampling > Record/Monetary Unit Sampling > Evaluate.

    Note

    The menu option is disabled if a table is not open.

  2. Select Record.
  3. Specify the input values exactly as they appear in the screen below and click OK to project the results.

What the input values mean

Confidence

The same degree of confidence you specified when you calculated sample size.

Sample Size

The actual number of records in the sample you drew – that is, the number of records in the Vouchers_sample table.

Number of Errors

When you examined the hard copies, the number of vouchers that lacked proper approval.

What the projected result means

Upper error limit frequency

The maximum deviation rate for the entire population of vouchers, projected with a 95% degree of confidence.

Put another way: There is a 95% probability that the number of vouchers that lack proper approval in the Vouchers table does not exceed 1.79%, or 95 vouchers.

Because 1.79% is less than the 2.00% you specified for the Upper Error Limit (%) when you calculated the sample size, you can conclude that the voucher control is operating effectively.

For a detailed explanation, see What “Upper error limit frequency” tells you.

Learn more: experiment by specifying different values

Rerun the evaluate command with a different Number of Errors to see how the result changes.

The table below summarizes some different results.

Number of Errors

(in the sample)

Upper error limit frequency

(projected maximum)

Vouchers lacking approval in Vouchers table

(projected maximum)

Conclusion
5 1.79% 95

(0.0179 x 5298)

The voucher control is operating effectively.

1.79% < the upper error limit of 2.00%

6 2.02% 107

(0.0202 x 5298)

In strict terms, the voucher control is not operating effectively.

However, 2.02% is very close to the upper error limit of 2.00%.

Note

This example demonstrates the difference between using Upper error limit frequency and Number of Tolerable Errors to evaluate control deviations.

If you use the more lenient Number of Tolerable Errors method, the voucher control is operating effectively.

The number of observed errors in the "Number of Errors" column to the left is 6, which does not exceed the Number of Tolerable Errors of 6 reported when you calculated sample size.

7 2.24% 119

(0.0224 x 5298)

The voucher control is not operating effectively.

2.24% > the upper error limit of 2.00%

10 2.89% 153

(0.0289 x 5298)

The voucher control is not operating effectively.

2.89% > the upper error limit of 2.00%