benford() method
Counts the number of times each leading digit (1–9), or leading digit combination, occurs in a numeric column, and compares the actual count to the expected count. The expected count is calculated using the Benford formula.
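The expected count can be illustrated with a short sketch in plain Python. It assumes the standard Benford formula, in which the expected proportion for a leading digit combination d is log10(1 + 1/d). The function name benford_expected_counts is hypothetical and is not part of HCL.

```python
import math

def benford_expected_counts(total_records, leading=1):
    """Return {leading_digits: expected_count} using the standard Benford formula.

    Hypothetical helper for illustration only; not an HCL function.
    """
    first = 10 ** (leading - 1)   # smallest valid combination: 1, 10, 100, ...
    last = 10 ** leading - 1      # largest valid combination: 9, 99, 999, ...
    return {
        d: total_records * math.log10(1 + 1 / d)
        for d in range(first, last + 1)
    }

expected = benford_expected_counts(1000)
print(round(expected[1], 1))  # 301.0
print(round(expected[9], 1))  # 45.8
```

In 1,000 records, a leading digit of 1 is expected about 301 times and a leading digit of 9 about 46 times, which is why a roughly even spread of leading digits is itself a deviation.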
Syntax
dataframe_name.benford(on = "numeric_column", leading = number_of_digits, addbounds = True|False)
Parameters
Name | Description |
---|---|
on = "numeric_column" | The numeric column to analyze. Note: Select a column that contains "naturally occurring numbers", such as transaction amounts. Benford analysis is not suitable for numeric data that is constrained in any way, for example by a fixed minimum or maximum value or by an assigned numbering scheme. |
leading = number_of_digits (optional) | The number of leading digits to analyze. If you omit leading, the default value of 1 is used. |
addbounds = True \| False (optional) | Specifies whether computed upper and lower bound values are included in the output results. If two or more counts in the output results exceed either of the bounds, the data may have been manipulated and should be investigated. If you omit the parameter, upper and lower bound values are not included. |
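To make the effect of the leading parameter concrete, here is a minimal sketch of extracting and counting leading digits in plain Python. The helper leading_digits and the sample amounts are hypothetical. The handling of the sign, leading zeros, and values with too few digits is an assumption and may differ from what benford() does internally.

```python
from collections import Counter
from decimal import Decimal

def leading_digits(value, leading=1):
    """Return the first `leading` significant digits of value as an int, or None.

    Hypothetical helper for illustration only; not an HCL function.
    """
    # as_tuple().digits holds the digits only: the sign and decimal point are dropped
    digits = Decimal(str(value)).as_tuple().digits
    significant = "".join(map(str, digits)).lstrip("0")
    if len(significant) < leading:
        return None  # assumption: zero and values with too few digits are skipped
    return int(significant[:leading])

amounts = [1250.75, -87.20, 0.0456, 1999, 0, 132.10]  # made-up sample data
counts = Counter(
    d for d in (leading_digits(a) for a in amounts) if d is not None
)
print(counts[1])                     # 3
print(leading_digits(1250.75, 2))    # 12
```

With leading = 1 the counts fall into 9 groups (1 to 9); with leading = 2 they fall into 90 groups (10 to 99), so larger values of leading need more records to give meaningful counts.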
Returns
HCL dataframe.
Examples
Test a numeric column for leading digit irregularities
Use the benford() method to test the two leading digits in the Amount column for deviation from the expected counts, and include upper and lower bounds in the results:
accounts_receivable.benford(on = "Amount", leading = 2, addbounds = True)
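The "two or more counts exceed either of the bounds" rule can be sketched as follows. This topic does not state how HCL computes the bounds, so the sketch assumes a normal approximation to the binomial count with z = 1.96. The functions benford_bounds and out_of_bounds and the actual counts are hypothetical; treat this as an illustration of the check, not a reproduction of benford() output.

```python
import math

def benford_bounds(total_records, leading=2, z=1.96):
    """Hypothetical bounds per digit combination (assumed normal approximation)."""
    bounds = {}
    for d in range(10 ** (leading - 1), 10 ** leading):
        p = math.log10(1 + 1 / d)                       # Benford proportion
        expected = total_records * p
        margin = z * math.sqrt(total_records * p * (1 - p))
        bounds[d] = (max(0.0, expected - margin), expected + margin)
    return bounds

def out_of_bounds(actual_counts, bounds):
    """Return the digit combinations whose actual count falls outside its bounds."""
    return [
        d for d, (lower, upper) in bounds.items()
        if not lower <= actual_counts.get(d, 0) <= upper
    ]

total = 5000
bounds = benford_bounds(total, leading=2)

# Made-up actual counts: start from the expected counts, then inflate two groups.
actual = {d: round(total * math.log10(1 + 1 / d)) for d in range(10, 100)}
actual[49] = 120
actual[99] = 90

flagged = out_of_bounds(actual, bounds)
print(flagged)             # [49, 99]
print(len(flagged) >= 2)   # True: worth investigating
```

Counts clustered just below a round number, such as 49 or 99, are a typical pattern to look at first, because they can indicate amounts kept under an approval threshold.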