The Fraud Examiner

Leveraging Machine Learning in Fraud Incident Response

Jeremy Clopton, CFE, CPA, ACDA, CIDA 
Lanny Morrow, EnCE  

As the office CFE, you recently learned of fraud allegations within your organization. Apparently, someone has been embezzling funds and using the accounts payable system to do so. You now have to begin the investigation to figure out who may or may not be involved. What do you do? You could go the usual route and start with interviews, document review or a flip through the paper bank statements. Or you could use machine learning to leverage your organization’s data, identify the individuals who exhibit the elements of the fraud triangle and take a focused approach. Which would you choose?

Machine learning can improve the effectiveness and efficiency of your fraud incident response efforts. This article explains the concept of machine learning, discusses applications in structured and unstructured data, touches on how the fraud triangle comes into play and concludes with ideas for expanding the detection horizon.

Machine Learning vs. Rule-Based Systems

Traditional analytics rely on rule-based methods to detect anomalies. These decision rules follow simple Boolean logic: if a vendor’s address matches an employee’s address and the vendor’s wire transfer account matches the employee’s bank account, then the vendor is likely fictitious. While effective, rule-based systems are inherently subjective, geared toward known frauds and limited to a few attributes and to exact matching of criteria. Machine learning offers a supplement to these traditional detection methods.
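The fictitious-vendor rule described above can be sketched in a few lines of Python. This is only an illustration: the `Vendor` and `Employee` records, the `normalize` helper and all sample data are hypothetical, not taken from any real accounts payable system.

```python
from dataclasses import dataclass


@dataclass
class Vendor:
    name: str
    address: str
    bank_account: str


@dataclass
class Employee:
    name: str
    address: str
    bank_account: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def is_likely_fictitious(vendor: Vendor, employee: Employee) -> bool:
    """Boolean rule: the address AND the bank account must both match exactly."""
    same_address = normalize(vendor.address) == normalize(employee.address)
    same_account = vendor.bank_account == employee.bank_account
    return same_address and same_account


# Made-up sample records.
employees = [
    Employee("Pat Doe", "12 Elm St, Springfield", "111-222"),
    Employee("Sam Roe", "9 Oak Ave, Springfield", "333-444"),
]
vendors = [
    Vendor("Acme Supply", "500 Industrial Pkwy", "777-888"),
    Vendor("PD Consulting", "12 elm st,  Springfield", "111-222"),
    Vendor("Oak Services", "9 Oak Avenue, Springfield", "333-444"),
]

flagged = [
    (v.name, e.name)
    for v in vendors
    for e in employees
    if is_likely_fictitious(v, e)
]
print(flagged)
```

In this sample, "Oak Services" shares a bank account with an employee but is not flagged, because "Ave" and "Avenue" are not an exact match. That is the exact-matching limitation noted above.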

Machine learning (ML) is a type of artificial intelligence (AI) that can learn without predefined decision rules. In its supervised form, ML constructs its own decision tree from meta-tagged data, e.g., transactions tagged “red flag” or “not,” to determine how “red flag” transactions are related. ML then applies the learned logic to new data and is often adept at making the right decision. Because ML can learn from a complex array of data rather than just a few variables, it can reach greater accuracy than a rule built on a handful of attributes.
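To make the idea of learning from tagged data concrete, here is a minimal sketch of a one-level decision tree (often called a decision stump) in plain Python. It is a deliberate simplification: a real decision tree repeats the split recursively, and production work would normally use an established library. The feature names and all transactions are assumptions made up for the example.

```python
# Hypothetical features per transaction: (amount, hour_entered, is_new_vendor).
# Label: 1 = "red flag", 0 = "not". All values are made up for illustration.
labeled = [
    ((120.0, 10, 0), 0),
    ((4900.0, 23, 1), 1),
    ((310.0, 14, 0), 0),
    ((4950.0, 22, 1), 1),
    ((980.0, 9, 1), 0),
    ((4800.0, 22, 0), 0),
    ((4990.0, 21, 1), 1),
    ((75.0, 16, 0), 0),
]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows):
    """Find the (feature, threshold) pair with the lowest weighted impurity."""
    best = None
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between two observed values
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]


def majority(labels):
    """Most common label on one side of the split (ties go to 'red flag')."""
    return int(sum(labels) * 2 >= len(labels))


def train_stump(rows):
    """Learn a one-level tree: a single split plus a predicted label per side."""
    f, t = best_split(rows)
    left = [y for x, y in rows if x[f] <= t]
    right = [y for x, y in rows if x[f] > t]
    return {"feature": f, "threshold": t,
            "left": majority(left), "right": majority(right)}


def predict(stump, x):
    """Apply the learned split to a new, untagged transaction."""
    side = "left" if x[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]


stump = train_stump(labeled)
print(stump)
print(predict(stump, (4975.0, 20, 1)), predict(stump, (200.0, 13, 0)))
```

Nothing in the code says which attribute matters or where the cutoff lies. The split is derived from the tagged examples, which is the difference from a hand-written rule.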

Another type of ML, called “unsupervised learning,” constructs its decision logic without meta-tagged data; it identifies patterns of interest and anomalies using its own decision-making criteria. This allows fraud examiners to identify new forms of fraud not previously detected or codified into rule-based methods. Both supervised and unsupervised ML systems are self-refining, in that they generally become more accurate as they encounter more data.
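As a rough illustration of finding anomalies without tagged data, the sketch below scores each payment by its distance from the median, scaled by the median absolute deviation (a robust z-score). This is a simple statistical stand-in chosen for brevity, not the specific technique the article describes. The payment amounts are made up, and the 3.5 cutoff is a commonly used convention rather than a requirement.

```python
from statistics import median


def robust_scores(values):
    """Score each value by its distance from the median, in scaled MAD units."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread in the data, so nothing can be called unusual.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores.
    return [0.6745 * (v - med) / mad for v in values]


# Made-up payments to a single vendor; no payment is tagged in advance.
payments = [1020.0, 980.0, 1005.0, 995.0, 1010.0, 990.0, 1000.0, 7400.0]

CUTOFF = 3.5  # assumed cutoff for this example
scores = robust_scores(payments)
outliers = [p for p, s in zip(payments, scores) if abs(s) > CUTOFF]
print(outliers)
```

The one unusually large payment stands out only because it differs from the rest of the data, not because a rule defined what "too large" means.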


Applications in Structured Data

Machine learning is frequently used to spot “red flag” patterns in structured data (data with a predictable structure, like spreadsheets, databases, and financial data formats). Examples include identifying suspicious insurance claims, unusual banking transactions and unusual credit card activity. It is also useful in network relationship analysis, which is the exploration of connections between individuals and entities. Relationship networks are often complex, but they can be quickly quantified with an unsupervised learning approach called “clustering,” allowing the examiner to efficiently identify key relationships and the web of communications and influence. The source of such data is often corporate email, but it may also include phone records and social media.
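A very small sketch of network relationship analysis follows. It groups people who exchange messages repeatedly by finding connected groups in a graph, which is a much simpler stand-in for a full clustering algorithm. The names, the message list and the `min_messages` parameter are all hypothetical.

```python
from collections import Counter, defaultdict

# Made-up (sender, recipient) pairs, e.g. extracted from email headers.
emails = [
    ("alice", "bob"), ("bob", "alice"), ("alice", "bob"),
    ("bob", "carol"), ("carol", "bob"),
    ("dave", "erin"), ("erin", "dave"), ("dave", "erin"),
    ("alice", "erin"),
]


def build_graph(messages, min_messages=2):
    """Link two people only if they exchanged at least min_messages messages."""
    counts = Counter(frozenset(pair) for pair in messages if pair[0] != pair[1])
    graph = defaultdict(set)
    for pair, n in counts.items():
        if n >= min_messages:
            a, b = tuple(pair)
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_groups(graph):
    """Return groups of people who are linked directly or through others."""
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


graph = build_graph(emails)
groups = connected_groups(graph)
hub = max(graph, key=lambda person: len(graph[person]))  # most direct links
print(groups, hub)
```

The single message between "alice" and "erin" falls below the threshold, so the two groups stay separate. Raising or lowering `min_messages` changes how tightly the network is drawn.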

Machine learning also enhances basic attribute matching. Rather than requiring a complex set of rules for matching names, addresses, and other identifying attributes, ML-based systems learn what a match looks like and apply this logic to the data, resulting in a higher degree of accuracy.
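The sketch below shows the idea of learning what a match looks like in its simplest form. It scores each pair of strings with `difflib.SequenceMatcher` from the Python standard library and then picks the similarity cutoff that best separates pairs tagged as matches from pairs tagged as non-matches. The string similarity itself is not machine learning; only the cutoff is learned here, and a real system would learn from many more features and examples. All pairs are made up.

```python
from difflib import SequenceMatcher

# Made-up training pairs: 1 = same entity, 0 = different entities.
labeled_pairs = [
    (("Acme Supply Co.", "ACME Supply Company"), 1),
    (("12 Elm Street", "12 Elm St."), 1),
    (("Jon Smith", "John Smith"), 1),
    (("Acme Supply Co.", "Zenith Holdings"), 0),
    (("12 Elm Street", "98 Harbor Road"), 0),
    (("Jon Smith", "Maria Lopez"), 0),
]


def similarity(a, b):
    """Case-insensitive similarity between 0.0 (nothing shared) and 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def learn_threshold(pairs):
    """Pick the cutoff that classifies the most tagged pairs correctly."""
    scored = [(similarity(a, b), y) for (a, b), y in pairs]
    values = sorted({s for s, _ in scored})
    # Candidate cutoffs sit halfway between neighbouring observed scores.
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]

    def accuracy(t):
        return sum((s >= t) == bool(y) for s, y in scored) / len(scored)

    return max(candidates, key=accuracy)


threshold = learn_threshold(labeled_pairs)


def is_match(a, b):
    """Apply the learned cutoff to a new pair of attribute values."""
    return similarity(a, b) >= threshold


print(round(threshold, 2))
print(is_match("Acme Supply", "ACME SUPPLY CO"), is_match("Acme Supply", "Blue Harbor LLC"))
```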

In a more specific example, because employees find unique and creative ways to “game” the system with purchasing cards and expense reimbursements, machine learning is also useful in identifying previously unknown schemes in these areas. Case experience has shown ML to be effective in identifying purchasing behavior such as frequent vendors, unusually consistent amounts, certain transactions that frequently occur in tandem and items occurring with unusual regularity. Similarly, an ML-based system may flag expense reimbursements as unusual even when they do not fit typical rule-based tests such as rounded amounts, “just under” threshold amounts and the traditional Benford’s Law analysis.
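For comparison, the traditional rule-based tests named above are easy to express directly. The sketch below checks for rounded amounts and “just under” threshold amounts, and compares observed leading digits with the Benford’s Law expectation, log10(1 + 1/d) for leading digit d. The expense amounts, the approval limit of 25.00, the 5% margin and the rounding unit of 100 are all assumptions for the example, and a sample this small is far too short for a meaningful Benford test.

```python
import math
from collections import Counter


def benford_expected(d):
    """Expected share of amounts whose leading digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)


def first_digit(amount):
    """Leading non-zero digit of an amount of at least 0.01."""
    digits = f"{abs(amount):.2f}".lstrip("0.")
    return int(digits[0])


def is_round(amount, unit=100):
    """True when the amount is an exact multiple of unit (compared in cents)."""
    return round(amount * 100) % (unit * 100) == 0


def is_just_under(amount, limit, margin=0.05):
    """True when the amount falls within margin (as a fraction) below the limit."""
    return limit * (1 - margin) <= amount < limit


# Made-up expense reimbursements and an assumed approval limit.
expenses = [500.00, 24.99, 24.10, 112.40, 24.50, 1000.00, 18.20, 24.95]
APPROVAL_LIMIT = 25.00

round_amounts = [a for a in expenses if is_round(a)]
just_under = [a for a in expenses if is_just_under(a, APPROVAL_LIMIT)]
observed = Counter(first_digit(a) for a in expenses)

print("Rounded:", round_amounts)
print("Just under limit:", just_under)
for d in range(1, 10):
    share = observed[d] / len(expenses)
    print(d, f"expected {benford_expected(d):.3f}", f"observed {share:.3f}")
```

Each test looks for one known pattern. A reimbursement scheme that avoids round numbers and stays well below the limit would pass all three, which is the gap the article describes ML as filling.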
