
Benford's Law still works

Can Benford’s Law practically identify fraud? It’s one of many tests you can use to discover fictitious numbers in supposedly random data sets, such as monetary amounts of purchase transactions. In this case, a controller successfully uses Benford’s Law to search for anomalies in warranty claims.

John, the controller of Rafal Inc., is in a quandary. Peter, the company’s financial analyst, says he’s found an abnormal spike in the warranty claims for FY 2019 as compared to FY 2018. John is perplexed as to how a discrepancy could’ve found its way into the payment process when he knows for certain that his team has been following tried-and-true documented “enterprise resource planning” procedures. He discusses the issue with Tim, a forensic consultant, who introduces him to the world of Benford’s Law. The heralded law could settle the issue. (Names and details have been changed in this case history.)

Benford’s Law and its historical evolution

Benford’s Law describes the expected frequency of leading digits in many naturally occurring data sets. For fraud examiners, it provides a statistical method for detecting manual intervention in otherwise automated operational transaction activity.

In 1881, Simon Newcomb, an American astronomer, noticed a pattern in the usage of logarithm tables. He found that the pages covering numbers that began with “1” were more worn than other pages. He inferred that numbers commencing with 1 were used far more frequently. (See Newcomb’s paper, Note on the Frequency of Use of the Different Digits in Natural Numbers.)

In the 1920s, Frank Benford, a physicist at General Electric, observed — as did Newcomb — that the pages of logarithm table books covering numbers with the initial digits “1” and “2” were more worn and dirtier than pages for “7,” “8” and “9.” (See Digital Analysis Using Benford’s Law, by Mark J. Nigrini, Ph.D., Global Audit Publications, 2000.)

Because the first few pages of a logarithm book list multi-digit logs beginning with the digits 1, 2 and 3, Benford theorized that scientists spent more time dealing with logs that began with those numerals. He also found that usage decreased with each succeeding first digit.

So, he concluded that in a population of naturally occurring multi-digit numbers, those numbers beginning with 1, 2 or 3 must appear more frequently than multi-digit numbers beginning with the digits 4 through 9. Also, the first digit of the numbers will be distributed in a predictable and expected way. Instead of the frequencies of the first digit being equal (a 1 out of 9 chance for each of the digits 1 through 9), the first digit of a multi-digit number typically follows a different pattern. Predictable patterns also occur in the second and third digits of multi-digit numbers. However, in this article, I limit its application to the first digit only.

(For more information, see the online ACFE Fraud Examiners Manual, Section 3: Investigation/Data Analysis and Reporting Tools/Using Data Analysis Software.)
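The pattern Benford described has a precise form: the probability that a multi-digit number begins with the digit d is log10(1 + 1/d). A minimal Python sketch reproduces the expected frequencies, from about 30.1% for the digit 1 down to about 4.6% for the digit 9:

```python
import math

# Expected first-digit frequencies under Benford's Law: P(d) = log10(1 + 1/d)
for d in range(1, 10):
    print(f"First digit {d}: {math.log10(1 + 1 / d):.1%}")
```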

In 1938, Benford tested his hypothesis with data across 20 different domains with a total of 20,229 observations. His population of data included surface areas of 335 rivers, the sizes of 3,259 U.S. populations, 1,800 molecular weights, the street numbers of scientists listed in an edition of American Men of Science and the numbers in an issue of Reader’s Digest, among others. The premise was that the first digits of the numbers in a data set follow a logarithmic distribution. The data analysis supported Benford’s hypothesis. (See The Law of Anomalous Numbers, by Frank Benford, Proceedings of the American Philosophical Society, March 31, 1938, volume 78, No. 4.)

Mark J. Nigrini, Ph.D., published the article, “Using Digital Frequencies to Detect Fraud,” in the April/May 1994 issue of the ACFE’s The White Paper, the precursor to Fraud Magazine. “This article offers fraud examiners and auditors a new tool to consider in the detection of fraud,” he wrote. And, indeed, it was a beginning. (Nigrini, now a professor at West Virginia University, went on to help popularize Benford’s Law in the latter part of the last century.)

So, the goal of a Benford’s Law analysis is to identify fictitious numbers. Most fraudsters fail to consider the Benford’s Law pattern when creating false documentation of transactions to cover their tracks. Consequently, testing data sets for the occurrence or non-occurrence of the predictable digit distribution can help flag numbers that aren’t legitimate.

Salient features of Benford’s Law

Benford’s Law distinguishes between natural and non-natural numbers. Natural numbers are those that aren’t ordered in a numbering scheme and aren’t generated from random number systems. For example, the currency amounts that populate most vendor invoice totals or listings of payment amounts are natural numbers.

Conversely, non-natural numbers (identification or assigned numbers, such as Social Security numbers, bank account numbers and car registrations) are designed systematically to convey information, which restricts their natural character. Any number that’s arbitrarily determined, such as the price of inventory held for sale, is also considered a non-natural number.

Other Benford’s Law features include:

  • The formula is applicable to categorical variables. Continuous variables, such as age, height, weight, time and amounts, can be clustered into categories.
  • The general rule is that a data set must contain at least 1,000 records to be effective. (More transactions will make the results more precise.)
  • Transaction-level data will make for a better test set than summarized or aggregated data.
  • Chi-squared statistics and “goodness-of-fit” tests will help interpret the results.

The application of the principles of Benford’s Law will only provide indicators of intervention or compromise of data. We shouldn’t construe these indicators as evidence of wrongdoing. Anomalies require further assessment and/or evaluation with other substantive tests. (See Benford’s Law: Applications for Forensic Accounting, Auditing and Fraud Detection, by Mark J. Nigrini, Ph.D., John Wiley & Sons Inc., 2012.)

Benford’s Law is most applicable and relevant in internal audit; forensic, risk and compliance; manufacturing operations; inventory analysis; supply-chain management; warranty claims and settlement; and financial payments.

Transactions where we can apply Benford’s Law

  • Payments to operational vendors.
  • Commission payments to distributors and agents.
  • Settlement of parts and labor warranty claims.
  • Cash receipts in retail outlets.
  • Bill of materials composition in an engineering manufacturing unit compared to actual consumption.
  • Consumables such as oil and diesel in large manufacturing plants.
  • Consumption of housekeeping materials in hospitals and hotels, such as linens, towels, bedsheets and toiletries.

We can use Benford’s Law to uncover such frauds as:

  • Shell company (fictitious vendor) schemes, in which the perpetrator concocts amounts to use on fraudulent invoices submitted by a supposed vendor.
  • Cashiers who ring fictitious refunds on cash registers.
  • Bid-splitting and other schemes involving limit circumvention, in which a fraudulent transaction often will begin with a digit that’s just below the threshold.

Analysis of Benford’s Law results with chi-squared

And now for some statistical analysis. (Please consult your organizational statisticians if some of the following doesn’t make sense.) We can use several tests to analyze data sets to see if they conform to the expected frequency of occurrence as stipulated by Benford’s Law, including Euclidean distance, Freedman-Watson u-square, Z-statistics, chi-squared, Joenssen’s JP-square, Kolmogorov-Smirnov and mean absolute deviation.
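As one illustration, here’s a minimal sketch of the mean absolute deviation (MAD) approach, which averages the absolute gaps between the observed and expected first-digit proportions; smaller values indicate closer conformity. The observed proportions below are hypothetical, not drawn from any case data:

```python
import math

def benford_mad(observed_props):
    """Mean absolute deviation between observed first-digit proportions
    and Benford's expected proportions; smaller means closer conformity."""
    expected = [math.log10(1 + 1 / d) for d in range(1, 10)]
    return sum(abs(o - e) for o, e in zip(observed_props, expected)) / 9

# Hypothetical observed proportions for first digits 1 through 9
obs = [0.28, 0.18, 0.13, 0.10, 0.08, 0.07, 0.06, 0.05, 0.05]
print(f"MAD: {benford_mad(obs):.4f}")
```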

Simple descriptive statistics like mean, median, mode and standard deviation summarize a distribution and help flag outliers. We can use regression analysis to examine relationships between continuous variables. However, when we’re trying to determine patterns — such as customer preference, location and behavior — the chi-squared test is suitable. It can help us test whether observed patterns differ from expectations and assist in informed decision-making.

This is the formula for calculating chi-squared statistics:

χ² = ∑ (O − E)²/E, where
χ² = the chi-squared statistic
O = observed frequency in the data set
E = expected frequency

And here are the important variables for a chi-squared test:

  • Data must be generated from a natural-number set, which means that we can’t consider assigned numbers, parts references, etc. A typical variable that can be considered natural is the invoice total in an accounts payable data set.
  • Data should be capable of being characterized as “categorical variables.” These variables can take on one of a limited, and usually fixed, number of possible values, assigning each unit of observation to a group or nominal category based on some qualitative property. (See The Practice of Statistics, by Dan Yates, David S. Moore and Daren S. Starnes.) They can be designated as attributes that signify the qualities of a subset of the data. In this case, we’d categorize each record by its first digit, giving us digits 1 through 9, or nine categories in all.
  • “Degrees of freedom” is the number of independent category variables or similar information used in the computation of the chi-squared statistic, less one.

Here are the steps to follow in the computation:

  1. Populate the expected frequency.
  2. Populate the observed frequency.
  3. Find the difference between observed and expected frequency.
  4. Square the difference.
  5. Divide the squared difference from step 4 by the expected frequency, then sum the results across the nine digit categories (see the sketch after this list).
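Here’s that computation as a minimal Python sketch. The claim amounts are hypothetical, and the function assumes positive amounts of $1 or more:

```python
import math
from collections import Counter

def chi_squared_first_digit(amounts):
    """Chi-squared statistic comparing observed first-digit counts to
    the counts expected under Benford's Law."""
    digits = [str(a)[0] for a in amounts]  # leading digit; assumes amounts >= 1
    n = len(digits)
    observed = Counter(digits)
    chi_sq = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # step 1: expected frequency
        obs = observed.get(str(d), 0)         # step 2: observed frequency
        diff = obs - expected                 # step 3: difference
        chi_sq += diff ** 2 / expected        # steps 4 and 5: square, divide, sum
    return chi_sq

# Hypothetical warranty claim amounts (not Rafal's actual data)
claims = [912, 187, 950, 231, 998, 145, 967, 312, 905, 990]
print(f"Chi-squared statistic: {chi_squared_first_digit(claims):.4f}")
```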

Back to Rafal Inc. and a probable manipulation in warranty claims data

Rafal Inc. had been manufacturing and selling cardio exercise machines across the U.S. for 10 years. The company sold its machines under warranty for spare parts and labor. End customers (mostly hospitals and nursing homes) could claim warranties either with authorized dealers or directly with the company.

The company consistently updated its list of warranty replacement parts (spares), plus standard costs for the parts and labor, in its ERP financial software. Rafal debited cost of goods sold (COGS) with the standard costs at the time inventory was depleted and parts were shipped for replacement.

The company accrued warranty spares and labor costs in the books of account based on a mathematical warranty estimation model. The model used a weighted average cost of past claims and factored in failure history, economic useful life, the supply pipeline, the availability of substitutes and their costs, the life of those substitutes, and the U.S. regions in which customers made claims over nine-, five- and three-year periods.

Where records of spare-part claims were available for less than three years, the estimate was based on a simple mean: an average of the actual claims made over the actual claim period. The authorized dealers lodged their claims based on their technical mechanical evaluations. Rafal paid the dealers’ warranty claims after approving them as per the authorization matrix.

In this case, Rafal gives John, the newly appointed controller, the task of reducing spiraling labor costs. John has his work cut out for him. He must figure out the reasons for the increasing trend, do a root-cause analysis and then ascertain steps to remediate the shortcomings, if any. He consults with Tim, the forensic professional, who suggests applying Benford’s Law to discover any possible intrusions or compromises in the booking of labor claims by the authorized dealers to Rafal.

Tim gives these steps:

  1. Determine and state the null and alternate hypotheses.
  2. Set the criterion for rejecting the null hypothesis.
  3. Calculate the test statistic; in this case, the chi-squared statistic.
  4. Tabulate research findings.
  5. Interpret the results and decide whether to reject or accept the null hypothesis. (If they reject the null hypothesis, they use the results to investigate the variances or discrepancies further.)
  6. Draw conclusions.

(Note: In this case, no statistical difference between the observed and expected frequencies of the first digit of the data set means there’s no compromise or intrusion in the data set. This forms the null hypothesis. We test the null hypothesis at a chosen level of significance, which is the probability of incorrectly rejecting the null hypothesis when it’s actually true. Where the chi-squared statistic computed from the observed frequencies is greater than the critical value at the desired level of significance, it’s appropriate to reject the null hypothesis and accept the alternate hypothesis.)
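To make the decision rule concrete, the critical values can be looked up with SciPy’s chi-squared distribution. This is a minimal sketch; the 105.3016 statistic is the one derived in the case computation that follows:

```python
from scipy.stats import chi2

df = 8  # nine first-digit categories, less one

for alpha in (0.01, 0.05):
    critical = chi2.ppf(1 - alpha, df)  # critical value at this significance level
    print(f"Level of significance {alpha:.0%}: critical value {critical:.3f}")

# Decision rule: reject the null hypothesis when the computed statistic
# exceeds the critical value at the chosen level of significance.
statistic = 105.3016  # the chi-squared statistic derived in this case
print("Reject H0" if statistic > chi2.ppf(0.95, df) else "Fail to reject H0")
```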

Steps 1, 2 and 3

Step 1: Null hypothesis (H₀: O = E): There’s no statistically significant difference between the observed and expected logarithmic frequencies of warranty labor claims as per Benford’s Law.

Step 2: Alternative hypothesis (Hₐ: O ≠ E): There’s a statistically significant difference between observed and expected logarithmic frequencies of labor claims as per Benford’s Law.

Step 3: Here’s the computation of the chi-squared statistic:

[Table: chi-squared computation for the warranty labor claims data]

[Figure: Benford’s Law applicability on warranty labor claims, in two graphical forms]

Steps 4, 5 and 6

Tabulate, interpret and draw conclusions

  • The chi-squared statistic is 105.3016 for the entire data set. We computed it to compare it to the critical value at a desired level of significance and decide whether to accept or reject the null hypothesis.
  • Degrees of freedom is 8, i.e., 9 − 1: we had nine first-digit categories, and the degrees of freedom are the number of categories used in the computation, less one.
  • Critical values for the chi-squared distribution at eight degrees of freedom:

Level of significance    Critical value
1%                       20.090
5%                       15.507
  • The derived chi-squared statistic of 105.3016 is far higher than the critical values at eight degrees of freedom, at both the 1% and 5% levels of significance. Hence, we reject the null hypothesis; the alternative hypothesis holds.

This would mean that observed frequencies don’t follow the expected frequencies of Benford’s Law.

Further analysis reveals a marked spike in the observed frequency of the first digit 9 and a correspondingly lower frequency of the first digit 1. The rest of the first digits (2 through 8) appear to be distributed approximately in line with expected frequencies, as the sketch below helps quantify. John sees this as a major discovery to spur further investigation.
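One common way to quantify which digits drive the nonconformity, described in Nigrini’s texts, is a per-digit z-statistic measuring how far each observed proportion sits from its Benford expectation in standard errors. This sketch omits the continuity correction, and the counts are hypothetical (chosen to mirror a digit-9 spike and a digit-1 deficit):

```python
import math

def digit_z_scores(counts):
    """Z-statistic per first digit: distance of the observed proportion
    from the Benford expectation, in standard errors."""
    n = sum(counts.values())
    scores = {}
    for d in range(1, 10):
        p_exp = math.log10(1 + 1 / d)
        p_obs = counts.get(d, 0) / n
        se = math.sqrt(p_exp * (1 - p_exp) / n)  # standard error of the proportion
        scores[d] = (p_obs - p_exp) / se
    return scores

# Hypothetical first-digit counts: digit 9 inflated, digit 1 depressed
counts = {1: 220, 2: 180, 3: 130, 4: 100, 5: 80, 6: 70, 7: 60, 8: 55, 9: 180}
for d, z in digit_z_scores(counts).items():
    print(f"Digit {d}: z = {z:+.2f}")
```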

  • John refers to the authorization matrix for processing labor claims. His hunch appears to be right:

Labor claims in U.S. dollars    Approval
$1 to $999                      Location claims manager
$1,000 and above                Regional claims manager

It was quite evident that the majority of the warranty claims had the first digit 9. Is this finding by itself enough to establish that claims that otherwise would’ve been in excess of $1,000 were purposefully made out to be less than $1,000 to circumvent the approval requirement of the regional claims manager? No, not without further substantive tests on the result.

  • John further drills down into the data by authorized dealer and observes that the spike is most prominent in the southern region, and more specifically with Smartcardio, the largest authorized dealer in that region.
  • John analyzes the pattern of claims and concludes that even for minor services under warranty, for which claims should have been in the range of $100 to $199, the actual claims submitted by Smartcardio were in the range of $900 to $999. This was quite alarming.

John finds that the location claims manager handling Smartcardio’s claims in the southern region had fraudulently approved frivolous and minor claims in the range of $900 to $999 that otherwise should’ve been in the range of $100 to $199, pocketing the difference of roughly $700 per claim.

The author emphasizes that this article doesn’t test the veracity of Benford’s Law but only aims to apply its principles and explain its usefulness in a business scenario. - ed.

Venkat Keshav Pillai, CFE, ACA, is a former director at a large accountancy firm. Contact him at venkatkpillai@yahoo.com.

 
