Innovation Update

From many, comes one (algorithm)

Date: March 1, 2023

Recent anticorruption research shows that when companies collaborate to share information about third-party payments and high-risk transactions, their combined model is about 25% better at predicting improper payments than each company’s model run in isolation. A new data-sharing consortium led by a nonprofit at MIT is working to make such collaboration possible.

According to the World Economic Forum, international corruption can take many forms, including bribery, embezzlement, cronyism, and fraud. And it’s an expensive problem that costs the global economy trillions of dollars annually. (See “Corruption is costing the global economy $3.6 trillion dollars every year,” by Stephen Johnson, World Economic Forum, Dec. 13, 2018.) Detecting it is challenging but possible using advanced analytics and machine learning technologies on top of legal and subject-matter expertise. And when organizations collaborate, research now shows the results are even better. Here we look at how a consortium led by a nonprofit at MIT is helping organizations share data without compromising privacy in their fight against fraud.

Companies work hard to detect and prevent fraud and corruption. Still, it can be challenging for businesses to identify behaviors that cross the line — particularly where employees are determined to commit a crime and then take steps to conceal their behavior. As anti-fraud professionals, we’re increasingly using data analytics to identify and monitor these risks for our organizations or clients. A critical factor when evaluating compliance programs involves determining if compliance and control personnel have sufficient access to relevant data sources. Can they access information to implement timely monitoring, policy testing and controls evaluation? As the U.S. Department of Justice (DOJ) indicates, this is even more critical in a regulatory environment that increasingly requires monitoring amid the ever-expanding availability of new data sources. This is true not just during due diligence but “throughout the lifespan of the third-party relationship,” says the DOJ. [See “Evaluation of Corporate Compliance Programs (Updated June 2020),” U.S. Department of Justice, Criminal Division.]

Corruption tends to flourish where multiple actors compete in opaque markets in a race to the bottom, often leaving victims unaware of wrongdoing occurring across multiple organizations. Against that backdrop, we shouldn’t attack corruption alone in organizational silos, but rather encourage transparency across organizations and industry sectors, sharing insights, risk profiles and third-party attributes that describe a potentially improper or corrupt payment.

Even so, this approach raises numerous challenges. First, separating signs of fraud from “noise” across uneven and sometimes unreliable datasets is tough. Even the most sophisticated organizations tend to only have access to reliable data related to their own vendors and customers, which is just a fraction of the global marketplace. Second, it can be difficult to create a transparent anti-fraud framework while also preserving the privacy organizations need to maintain their competitive edge. It makes little sense to trade a corruption problem for a competition or privacy one. A third challenge is to create a collaborative framework that’s cost-effective. Technologies tend to flourish only when new users don’t have to risk significant resources on a “bet” that the technology will work. Last, contributing members in any collaborative effort need to trust each other. Participants must be confident that each member is acting in good faith in how it applies insights and translates the results into action.

Overcoming those challenges is worth the effort, however. Collaboration initiatives, such as data-sharing consortiums, can be a valuable resource to organizations in the fight against fraud and corruption. In 2022, the ACFE/SAS Anti-Fraud Technology Benchmarking Report asked participants if their organizations were contributing to data-sharing consortiums to help prevent or detect fraud. Surprisingly, over half said they either currently contribute (34%) or don’t contribute but would be willing to in the future (24%). (See the graph below.) The research coming out of MIT’s Integrity Distributed (InDi) may offer just such a resource.

[Figure: Who contributes to data-sharing consortiums]

About InDi

InDi is a nonprofit initiative that allows organizations all over the world to contribute their anti-fraud and anticorruption intelligence in a secure, anonymous information-sharing consortium model. The platform allows organizations to train algorithms that detect patterns of fraud and corruption in their respective industries. Those algorithms (but not the underlying data) are contributed to the consortium, enhancing the collective intelligence of the “super-algorithm” in a secure platform. The process is iterative and collaborative, feeding data-hungry algorithms for optimal performance. As more data and algorithms are contributed, the artificial intelligence learns and improves at a rate much faster than would be possible at any single company. Through this unique model of collaboration and data analytics, all of the participating organizations benefit, and the learning is continual throughout the Integrity Distributed network. A short video clip provides more information on InDi and the participating law firms that helped advise on this initiative.

About the research

In a limited research study funded by the AB InBev Foundation, InDi brought together forensic accounting technology and data science professionals from Kona AI, MIT and Harvard Business School, working in cooperation with Fortune 500 companies and leading AmLaw 100 law firms that focus on white-collar crime and anticorruption, to design, develop, implement and test the new platform.

Several companies extracted relevant third-party payments data from their enterprise resource planning systems (such as SAP or Oracle procure-to-pay systems) and loaded the information into a consistent unified data model (or UDM) developed by the company I work for, Kona AI. On a company-by-company basis, payments were first risk-scored by the model across an extensive library of tests and behavioral algorithms, and each company’s representative and/or its outside counsel then reviewed the highest-risk transactions. Using an approach that originated in e-discovery known as technology-assisted review, the team created a predictive model for each company designed to proactively identify a potentially improper payment based on the attributes of each transaction. (See my Innovation Update column, “Using Technology-Assisted Review to Uncover Suspicious Transactions,” Fraud Magazine, November/December 2022.) Finally, the team combined each company’s model into one super-model, using a neural-network statistical model to retain and share insights while protecting data privacy and anonymity.

“This research was the first of its kind in terms of companies working together to fight global corruption,” says Francis Hounnongandji, CFE, president and CEO of the Institut Francais de Prevention de la Fraude. “The concept around data-sharing consortiums to fight financial crimes or cyber risks is certainly not new. Seeing companies work together and driving results in this initial cohort, without having to share underlying data, looks promising for compliance programs in the future.”

With this initial cohort of companies, the results of the super-model indicate that its ability to predict a potentially improper payment is 25% greater when companies collaborate than when each company’s model is run in isolation. As more companies are added to the cohort in 2023, the research team expects the super-model results will continue to improve.

Building predictive models to fight global corruption

The InDi consortium is building upon recent advances in decentralized machine learning and privacy-enhancing technologies to develop what MIT calls split learning. This is a technique that allows participating entities to train machine learning models without sharing any raw data. (See “MIT Media Lab’s Split Learning: Distributed and collaborative learning.”) InDi participants integrated split-learning technology with the know-how of anticorruption experts and workflows to form a distributed model to detect vendor fraud, corruption and circumvention of controls.
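To make the split-learning idea concrete, here is a minimal sketch in plain Python. It is not InDi's or Kona AI's implementation. It assumes a single client, a two-feature toy dataset with made-up values, and a server that holds the labels (one common split-learning configuration); the class names, weights and learning rate are all hypothetical. The point it illustrates is that only intermediate activations and gradients cross the boundary between client and server, never the raw records.

```python
import math


class Client:
    """Holds private payment features and the first layer of the model."""

    def __init__(self, features):
        self._features = features  # raw data never leaves this object
        self.weights = [[0.1, -0.1], [-0.1, 0.1]]  # hypothetical starting weights

    def forward(self, i):
        # Compute hidden activations for record i; only these are sent out.
        x = self._features[i]
        self._last_x = x
        self._last_h = [
            math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in self.weights
        ]
        return self._last_h

    def backward(self, grad_h, lr):
        # Finish backpropagation locally using the gradient the server returned.
        for j, row in enumerate(self.weights):
            grad_z = grad_h[j] * (1.0 - self._last_h[j] ** 2)
            for k in range(len(row)):
                row[k] -= lr * grad_z * self._last_x[k]


class Server:
    """Holds the rest of the model; sees activations and labels, not raw features."""

    def __init__(self):
        self.weights = [0.1, -0.1]  # hypothetical starting weights
        self.bias = 0.0

    def predict(self, h):
        logit = sum(w * hj for w, hj in zip(self.weights, h)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))

    def train_step(self, h, label, lr):
        err = self.predict(h) - label  # gradient of log loss w.r.t. the logit
        grad_h = [err * w for w in self.weights]  # gradient sent back to the client
        self.weights = [w - lr * err * hj for w, hj in zip(self.weights, h)]
        self.bias -= lr * err
        return grad_h


def mean_loss(client, server, labels):
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for i, y in enumerate(labels):
        p = min(max(server.predict(client.forward(i)), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Toy, made-up data: two normalized risk features per payment, 1 = improper.
features = [[1.0, 0.2], [0.9, 0.1], [0.1, 0.8], [0.2, 1.0]]
labels = [1, 1, 0, 0]

client, server = Client(features), Server()
initial_loss = mean_loss(client, server, labels)
for _ in range(200):
    for i, y in enumerate(labels):
        activations = client.forward(i)                     # client -> server
        grad = server.train_step(activations, y, lr=0.5)    # server -> client
        client.backward(grad, lr=0.5)
final_loss = mean_loss(client, server, labels)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

In a consortium setting there would be several clients, one per company, each training its own first layers against a shared server-side model. This sketch keeps one client only to show the message flow.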

Financial data from SAP or Oracle (or both) was exported from each company’s enterprise resource planning (ERP) system, software used to manage different operations in a firm, such as accounting. [See “Enterprise Resource Planning (ERP): Meaning, Components, and Examples,” Investopedia, Sept. 10, 2022.] Specifically, procure-to-pay vendor payment information was the key data source. Additional third-party-risk questionnaires, watchlists or historical investigative results also supplemented and enriched the data. The data was then passed through an extensive library of over 100 anti-fraud, corruption and circumvention-of-controls tests and behavioral algorithms that participants provided. Thereafter, transactions were risk-scored (i.e., prioritized), with priority given to those meeting multiple risk criteria. Further, some companies were able to enhance their dataset with existing, known investigation findings such as high-risk transactions, vendors, general ledger accounts and other descriptive risk criteria to accelerate training of their predictive model. We found that companies that tracked this information and could feed it into their predictive model had noticeably better results based on standard performance metrics.

[Figure: Consortium algorithms]

The figure above illustrates how company-specific data was maintained and analyzed, with insights and algorithms being shared in a secure manner with InDi. This partnership between InDi and the participating companies is what forms the consortium — the Kona AI platform was simply the technology used to house the consortium concept and run the algorithms.

Predictive model results

InDi hypothesized that distributed machine learning algorithms, such as split learning, would improve predictive model performance and better detect fraud, corruption and circumvention of controls without sharing the underlying proprietary commercial data. In line with this hypothesis, the research team observed that the F1 score (a metric combining precision and recall) increased 25%, from 0.47 for the individual models to 0.59 for the distributed model, in a simulation of third-party payment data across 50 million records.
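For readers who want to check the arithmetic, the short sketch below computes an F1 score as the harmonic mean of precision and recall and then the relative gain between the two F1 values reported above. The article does not give the underlying precision and recall, so the function is shown only as the standard definition; the helper names are my own.

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, as a fraction."""
    return (after - before) / before


# F1 scores reported in the research.
individual_f1 = 0.47
distributed_f1 = 0.59

gain = relative_gain(individual_f1, distributed_f1)
print(f"Relative F1 improvement: {gain:.1%}")  # prints 25.5%
```

The 25% figure is therefore a relative improvement (0.12 on a base of 0.47), not a 25-point jump in the score itself.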

Please note that these results were preliminary as of November 2022. InDi plans to continue adding companies to the consortium, with a Phase 2 cohort beginning in the first quarter of 2023. Based on the preliminary results, the InDi team anticipates further model improvement and insights for the consortium participants.

What’s ahead for InDi?

Formed in June 2022, InDi is still in its infancy; however, it’s gaining a tremendous amount of interest and support from mid-to-large global organizations and governments. One of the key goals for the first quarter of 2023 is to increase the number of companies contributing to the consortium across a number of industries. As research and funding expand for the nonprofit, InDi may branch out and conduct additional research and model-building beyond third-party risk and into other areas of fraud and corruption.

Vincent M. Walden, CFE, CPA, is the CEO of Kona AI, an AI-driven anti-fraud and compliance technology company providing easy-to-use, cost-effective payment and transaction analytics software around corruption, investigations, fraud prevention and compliance monitoring. He welcomes your feedback and ideas. Contact Walden at vwalden@konaai.com.
