Fraud EDge: A forum for fraud-fighting faculty in higher ed
Social Network Analysis (SNA) has its roots in academics, beginning as an analytical concept in the field of sociology more than 80 years ago. (See Analyzing Social Media Networks with NodeXL: Insights from a Connected World by Derek Hansen, Ben Shneiderman and Marc A. Smith, page 32.) Academic research still dominates this discipline, though more recently, businesses in general and marketing operations in particular have been very successful in using SNA to target customers, focus advertising and motivate buyers. Beyond marketing, businesses use SNA to focus product development, reduce customer service costs and improve public opinion of a firm or its products. We can use the insights achieved in these traditional business uses of SNA just as effectively in fraud examinations. — Les E. Heitger, Ph.D., Educator Associate
The previous two columns in this series explored the power of unstructured data in fraud examinations and their enhancement by using unstructured and structured data together. In this column, we discuss the use of structured and unstructured data together in SNA to discover the interrelationships between actors in more complex schemes. Fraud fighters have used SNA in cases involving the Foreign Corrupt Practices Act (FCPA), anti-bribery and corruption (ABC), public corruption, financial statement fraud and systemic frauds involving collusive or conspiratorial acts.
Networks and their origins
Networks consist of groupings of related people, animals, things or ideas, from the complex structure of anthills and beehives to the inhabitants of rainforest ecosystems. At the human level, the structure of networks becomes increasingly complex because our relationships involve social and communicative elements, i.e., social networks. These networks form because of our choices about whom we interact with, based on ideology, religion, social status, personality types or our surrounding environment.
As we might imagine, these networks of relationships can become quite complex. See Figure 1 below — an illustration of sentiments and stances surrounding a single incident within the Israeli-Palestinian conflict.

In this map, the green cluster represents pro-Palestinian sentiment, the blue pro-Israeli, with various tangentially related groups in red. Nodes (data points) in between reflect less polarized stances. Remarkably, this complex map was constructed using only Twitter data about a single event.
SNA is the science of studying social networks, sometimes referred to as "graph database analysis" or "relationship mapping." We believe SNA has grown in popularity over the years in response to our increasingly interconnected and digitalized society and the recent availability of robust analysis tools. (More on this later.) The proliferation of email, messaging, electronically stored documents and social media not only provides a rich repository of data for social scientists but also provides a platform for analysis of relationships in fraud examinations.
In the context of occupational fraud, sources of data from which networks are constructed might include, among others:
- Email.
- Structured data.
- Phone records.
- Depositions.
- Documents.
- Human resources data.
- Open source data.
- Social media.
- Medical records.
- Loan documents.
- Interview notes.
- Insurance claims data.
- Online news.
Within these data sources lie the individual data elements for identifying relationships among various individuals and entities. If we collectively analyze these individual elements, we're able to construct network maps that provide the foundation for SNA.
Graph metrics applied to examinations
SNA provides a quick means to "see" complex social networks using data visualization and can often lead to immediate and useful insights. However, more powerful than the visual component of SNA are graph metrics: mathematically derived data about connections and groups.
The simplified graphic and accompanying chart (Figures 2 and 3) illustrate an example of applying SNA and graph metrics in a public corruption fraud examination.


The key graph metrics include:
- Degree — count of connections for an entity.
- Eigenvector Centrality — connection quality measurement.
- Betweenness Centrality — measurement of influence in a network.
Degree is the simplest of graph metrics and represents the number of connections for a given entity. In this case, Shauna is the most well-connected person in the network, with a degree of seven (seven connections). In investigations, people with a high degree are of interest because they might be a rich source of information about other members of the network.
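As a minimal sketch of how degree is counted, the Python below tallies distinct connections from a list of relationship pairs. The names and links are hypothetical and are not taken from the figures in this column.

```python
from collections import defaultdict

# Hypothetical relationship pairs; names and links are illustrative only.
edges = [
    ("Ana", "Ben"), ("Ana", "Cal"), ("Ana", "Dee"),
    ("Ben", "Cal"), ("Dee", "Eli"),
]

def degree_counts(edge_list):
    """Return {entity: number of distinct connections} for an undirected network."""
    neighbors = defaultdict(set)
    for a, b in edge_list:
        if a != b:  # ignore self-links
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {node: len(linked) for node, linked in neighbors.items()}

degrees = degree_counts(edges)

# Rank entities from most to least connected (ties broken alphabetically).
ranked = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
```

Using sets means repeated contact between the same two people counts as one connection, which matches the definition of degree given above.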
Shauna also has a high Eigenvector Centrality (EC) — a measure of overall potential influence, factoring in an entity's degree and also the degree of an entity's connections. Her connections to other well-connected people, such as Julia and Tom, boost her EC metric. Also, Todd and others have an equally high EC score, even though their degree is lower. It's important in fraud examinations to consider both the degree and EC metrics to help identify key players in a large network.
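To illustrate how two entities with the same degree can differ in EC, here is a rough sketch that estimates EC by power iteration. The network, the names and the iteration settings are our assumptions for illustration; SNA tools may use different normalization or stopping rules.

```python
import math

# Hypothetical network: each key lists that entity's direct connections.
adj = {
    "Hub": ["P", "Q", "R", "S"],
    "P": ["Hub", "Q"],
    "Q": ["Hub", "P"],
    "R": ["Hub"],
    "S": ["Hub", "T"],
    "T": ["S", "U"],
    "U": ["T"],
}

def eigenvector_centrality(adj, iterations=200, tol=1e-9):
    """Estimate eigenvector centrality by repeated neighbor-score averaging."""
    nodes = list(adj)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbors' scores.
        # This (A + I) update has the same leading eigenvector as A alone
        # and avoids oscillation on some network shapes.
        new = {n: score[n] + sum(score[m] for m in adj[n]) for n in nodes}
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - score[n]) for n in nodes) < tol:
            return new
        score = new
    return score

scores = eigenvector_centrality(adj)
```

In this hypothetical network, P and T each have two connections, but P scores higher because its connections (Hub and Q) are themselves better connected than T's.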
However, the key metric in FCPA and corruption investigations is Betweenness Centrality (BC), as demonstrated by Anthony on the graph. He's in the unique position of being a broker, bridging the gap between the blue and green groups. This brokerage position is very powerful in the network, which is why his BC score is much higher than those of the other entities. We see this phenomenon repeatedly in FCPA and corruption cases, in which an individual is a bridge between the organization or municipality and a group of outside influencers who are involved in the corrupt activities.
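The broker effect can be sketched with a small, hypothetical network of two tight groups joined by one go-between. The code below uses Brandes' algorithm for unweighted networks, which is one common way to compute BC; we don't know which method any particular SNA tool uses, and the node names are invented.

```python
from collections import defaultdict, deque

# Hypothetical network: two tight groups linked only through "Broker".
edges = [
    ("A1", "A2"), ("A1", "A3"), ("A2", "A3"),
    ("B1", "B2"), ("B1", "B3"), ("B2", "B3"),
    ("A1", "Broker"), ("Broker", "B1"),
]
links = defaultdict(set)
for a, b in edges:
    links[a].add(b)
    links[b].add(a)
adj = dict(links)

def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected network."""
    score = {n: 0.0 for n in adj}
    for source in adj:
        # Breadth-first search that counts shortest paths from the source.
        stack = []
        preds = {n: [] for n in adj}
        paths = {n: 0 for n in adj}
        dist = {n: -1 for n in adj}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, crediting intermediaries.
        dependency = {n: 0.0 for n in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                dependency[v] += paths[v] / paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected pair was counted once from each end.
    return {n: s / 2 for n, s in score.items()}

scores = betweenness_centrality(adj)
```

Here "Broker" has only two connections, yet every shortest path between the two groups runs through that node, so it receives the highest BC score in the network.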
Another important feature of SNA is community detection, which statistically clusters and identifies subgroups within the larger network. Community detection is important in investigations to identify cliques or "buddy networks" — the networks within the networks — that exist, which may be indicative of a collusive or conspiratorial situation.
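Statistical community detection is more involved than we can show briefly, so the sketch below swaps in a simpler, related technique: listing maximal cliques (groups in which everyone is connected to everyone else) with the basic Bron-Kerbosch algorithm. The names, the links and the minimum group size of three are assumptions for illustration.

```python
# Hypothetical network stored as {entity: set of direct connections}.
adj = {
    "Ann": {"Bob", "Cy"},
    "Bob": {"Ann", "Cy"},
    "Cy": {"Ann", "Bob", "Dot"},
    "Dot": {"Cy", "Eve", "Fay", "Gus"},
    "Eve": {"Dot", "Fay", "Gus"},
    "Fay": {"Dot", "Eve", "Gus"},
    "Gus": {"Dot", "Eve", "Fay"},
}

def maximal_cliques(adj):
    """Basic Bron-Kerbosch enumeration of all maximal cliques."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(sorted(current))  # nothing left to add: maximal
            return
        for node in list(candidates):
            expand(current | {node},
                   candidates & adj[node],
                   excluded & adj[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adj), set())
    return cliques

# Keep only groups of three or more as candidate "buddy networks".
groups = sorted(c for c in maximal_cliques(adj) if len(c) >= 3)
```

Cliques are a stricter notion than communities: a real collusive group may be missing a few internal links, which is why the statistical clustering in SNA tools is usually the better starting point.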
These metrics are only a subset of possible metrics using SNA; however, they're some of the most useful in a fraud examination.
Evolution of networks
While network relationship maps are effective in a static view, the ability to watch relationships evolve over time is more effective. Watching the maps change — both figuratively and literally — relative to key dates and events might help further identify the importance of entities within a network. Much of the available unstructured data has date and time elements. Incorporating these elements into your SNA and comparing the network maps at points in time is important to increasing fraud examination effectiveness.
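A point-in-time comparison can be sketched by splitting dated interactions around a key event and comparing the two sets of relationships. The records, role names and event date below are hypothetical.

```python
from datetime import date

# Hypothetical dated interactions: (date, entity_a, entity_b).
interactions = [
    (date(2014, 1, 10), "Buyer", "VendorRep"),
    (date(2014, 2, 2), "Buyer", "Manager"),
    (date(2014, 3, 15), "Buyer", "VendorRep"),
    (date(2014, 3, 20), "Buyer", "Consultant"),
    (date(2014, 4, 5), "Consultant", "VendorRep"),
]

def snapshot(records, start, end):
    """Return the undirected links observed on or after start and before end."""
    return {frozenset((a, b)) for when, a, b in records if start <= when < end}

# Hypothetical key date, for example a contract award.
event = date(2014, 3, 1)

before = snapshot(interactions, date.min, event)
after = snapshot(interactions, event, date.max)

new_links = after - before      # relationships that appear only after the event
dropped_links = before - after  # relationships that go quiet after the event
```

The same snapshot function can be called with narrower windows (monthly or weekly, say) to watch the map change step by step.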
Another way to enhance SNA is by combining this approach with those discussed in our previous columns — natural language processing, named entity extraction, tone detection and part of speech tagging in email communications. For example, importing email data into a network analysis tool will allow you to create a simple map based on To/From fields. Taking this a step further, named entity extraction allows you to incorporate actual names, dates and places into a network map. Include the date and time stamps, and you now have a time series analysis of the network map.
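The To/From mapping step can be sketched with Python's standard email parser, as shown below. The messages and addresses are invented, the treatment of Cc recipients as links is our assumption, and named entity extraction from message bodies is not shown.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical raw messages; addresses, dates and subjects are invented.
raw_messages = [
    "From: Pat <pat@example.com>\n"
    "To: lee@example.com, Sam <sam@example.com>\n"
    "Date: Mon, 03 Mar 2014 09:15:00 -0600\n"
    "Subject: invoice\n"
    "\n"
    "Body text.\n",
    "From: pat@example.com\n"
    "To: Lee <lee@example.com>\n"
    "Date: Tue, 04 Mar 2014 16:40:00 -0600\n"
    "Subject: re: invoice\n"
    "\n"
    "Body text.\n",
]

def email_edges(raw):
    """Return (sender, recipient, sent_time) tuples from one message's headers."""
    msg = message_from_string(raw)
    senders = [addr.lower() for _, addr in getaddresses(msg.get_all("From", []))]
    to_and_cc = msg.get_all("To", []) + msg.get_all("Cc", [])
    recipients = [addr.lower() for _, addr in getaddresses(to_and_cc)]
    sent = parsedate_to_datetime(msg["Date"])
    return [(s, r, sent) for s in senders for r in recipients]

all_edges = [edge for raw in raw_messages for edge in email_edges(raw)]

# Edge weight: how many messages each sender addressed to each recipient.
weights = Counter((s, r) for s, r, _ in all_edges)

# Earliest observed contact per pair, useful for time-based comparisons.
first_contact = {}
for s, r, sent in all_edges:
    if (s, r) not in first_contact or sent < first_contact[(s, r)]:
        first_contact[(s, r)] = sent
```

The weighted pairs can be loaded into a network analysis tool as an edge list, and the timestamps support the time series view described above.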
Getting started with SNA
Mapping relationships among entities in a fraud examination is a straightforward process. The first relationship map we ever drew for an investigation was done by hand, on paper. It helped identify the primary entities to interview and revealed previously unknown relationships and, ultimately, bogus loans and straw borrowers.
Electronic SNA tools that can generate effective, interactive visualizations and robust graph metrics are freely available and easy to learn. The most user-friendly tool, NodeXL, works with the familiar Microsoft Excel program. A companion book, "Analyzing Social Media Networks with NodeXL: Insights from a Connected World," by Derek L. Hansen, Ben Shneiderman and Marc A. Smith, takes a deep dive into the history, metrics and meaning of social network analysis and serves as a user guide for the tool. NodeXL also has automated wizards to import various structured data, social media posts and email.
Other free tools capable of generating presentation-quality interactive visualizations are Gephi and Neo4j. High-quality paid tools range from OrgNet's Inflow to IBM's Analyst's Notebook, which enjoys wide use and support from the law enforcement and corporate communities. At the very high end is Quid, an intelligence platform designed for large corporations and government agencies.
Application to fraud examiners
While corruption is the second-most common categorical type of fraud according to the ACFE's 2014 Report to the Nations on Occupational Fraud and Abuse (figure 24), it's the most common fraud scheme in most industries studied. Corruption schemes are based on the use of influence to manipulate and deceive. In ABC and FCPA cases, influence is the root cause of many improper payments. Merely analyzing accounting records and common financial data sets isn't effective enough to identify influence. Leveraging structured and unstructured data together allows us to perform SNA — a method designed to identify and measure influence.
Fraud examiners no longer need to rely on illustrating relationships and inferring influence with a bulletin board, thumbtacks and yarn.
Les Heitger, Ph.D., Educator Associate, is BKD Distinguished Professor of Forensic Accounting in the School of Accountancy at Missouri State University in Springfield. He's chair of the ACFE Higher Education Advisory Committee.
Jeremy Clopton, CFE, CPA, ACDA, is senior managing consultant in the Forensics Practice of BKD, LLP.
Lanny Morrow, EnCE, is a managing consultant in the Forensics Practice of BKD, LLP.
The Association of Certified Fraud Examiners assumes sole copyright of any article published on www.Fraud-Magazine.com or ACFE.com. Permission of the publisher is required before an article can be copied or reproduced.