We may enjoy being fooled by magic shows or special effects in movies, but nobody enjoys being fooled by fraud. Credit card fraud, bank fraud, insurance fraud, and internet sales fraud are all common, and financial institutions such as banks and insurance companies are the worst affected.
Financial fraud is big business, contributing to an estimated 50 billion USD in direct losses annually. The true figure is probably much higher, because firms cannot accurately identify and measure these types of losses. A typical organization is estimated to lose 7 percent of its annual revenue to fraud.
Although being caught for fraudulent activity carries high civil penalties, fraudsters persist. They adapt to changing technology, becoming more savvy and deploying more sophisticated methods. As financial transactions become more technology-driven, technology itself can become the fraudster's weapon of choice. Fraudsters have also evolved from single individuals into complex criminal rings (organized crime) that span international borders. Fraud by organized criminal groups hurts financial institutions the most.
Methods for Detecting Fraud Have Evolved
Traditional fraud detection methods, such as flagging deviations from normal or expected patterns, concentrate on discrete data points rather than the connections between them. Although discrete methods are useful for catching fraudsters acting alone, they fall short when it comes to detecting organized crime rings. Further, discrete methods are prone to false positives, which hurt customer satisfaction and cost revenue opportunities.
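As a rough illustration of such a discrete check, the sketch below flags a transaction whose amount deviates strongly from a customer's own history. The function name `is_anomalous`, the z-score rule, and the threshold of 3 standard deviations are assumptions made for this example, not a method taken from any particular product.

```python
from statistics import mean, stdev

def is_anomalous(history, amount, threshold=3.0):
    """Return True if `amount` lies more than `threshold` sample standard
    deviations away from the mean of the customer's past amounts.

    Hypothetical helper for illustration; real systems use richer features.
    """
    if len(history) < 2:
        return False  # not enough data to estimate a spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # every past amount was identical
    return abs(amount - mu) / sigma > threshold

# Made-up purchase history for a single customer.
history = [42.0, 38.5, 51.0, 47.25, 40.0, 44.0, 39.5, 46.0, 43.0]

print(is_anomalous(history, 2500.0))  # far outside the usual range
print(is_anomalous(history, 45.0))    # an ordinary purchase
```

A check like this looks at one customer in isolation. Members of a ring who each keep their own activity within normal bounds would pass it, which is the gap that connected-data analysis is meant to close.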
Gartner proposes a layered model for fraud prevention, which can be seen below:
It starts with traditional discrete methods (at the left) and progresses to more elaborate “big picture” types of analysis. The rightmost layer, “Entity Link Analysis”, leverages connected data to detect organized fraud. Collusion of the kind described above can be uncovered, with a high degree of accuracy, by using a graph database to carry out entity link analysis at key points in the customer lifecycle.
Using a Graph Database for Entity Link Analysis
Relational databases require datasets to be modeled as a set of tables and columns. Uncovering rings in such a model requires a series of complex joins and self-joins. Such queries are not only complex to build but also expensive to run, and they pose significant technical challenges at scale. The full magnitude of the problem becomes clear when one considers the combinatorial explosion that occurs as the ring grows along with the total dataset.
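To make the join problem concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table `account_identifier`, its columns, and the sample rows are all hypothetical. The point is only that each extra "hop" between accounts needs additional self-joins on the same table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_identifier (
    account_id TEXT NOT NULL,
    id_type    TEXT NOT NULL,
    id_value   TEXT NOT NULL
);
INSERT INTO account_identifier VALUES
    ('A1', 'phone',   '555-0100'),
    ('A1', 'address', '12 Elm St'),
    ('A2', 'phone',   '555-0100'),
    ('A2', 'email',   'ring@example.com'),
    ('A3', 'email',   'ring@example.com'),
    ('A4', 'phone',   '555-0199');
""")

# One hop: pairs of accounts that directly share an identifier (one self-join).
ONE_HOP = """
SELECT DISTINCT a.account_id, b.account_id
FROM account_identifier a
JOIN account_identifier b
  ON a.id_type = b.id_type AND a.id_value = b.id_value
 AND a.account_id < b.account_id
ORDER BY 1, 2
"""

# Two hops: accounts linked through an intermediate account (three joins).
TWO_HOP = """
SELECT DISTINCT a.account_id, d.account_id
FROM account_identifier a
JOIN account_identifier b
  ON a.id_type = b.id_type AND a.id_value = b.id_value
 AND a.account_id <> b.account_id
JOIN account_identifier c
  ON c.account_id = b.account_id
JOIN account_identifier d
  ON c.id_type = d.id_type AND c.id_value = d.id_value
 AND d.account_id <> c.account_id
WHERE a.account_id < d.account_id
ORDER BY 1, 2
"""

one_hop = conn.execute(ONE_HOP).fetchall()
two_hop = conn.execute(TWO_HOP).fetchall()
print(one_hop)
print(two_hop)
```

A ring of unknown size has no fixed hop count, so a query in this style must either be rewritten for every depth or replaced with a recursive formulation.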
Graphs are designed to express relationships between data. Graph databases can uncover patterns that are otherwise difficult to detect using traditional representations such as tables. Because they are designed to query intricate connected networks, graph databases can be used to identify fraud rings in a fairly straightforward fashion.
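The same question becomes a plain traversal once the data is modeled as a graph. The sketch below is ordinary in-memory Python, not a graph database, and the function `find_rings` and the sample records are assumptions made for illustration. Accounts and identifiers are nodes, a shared identifier links two accounts, and each connected group of accounts is treated as a candidate ring.

```python
from collections import defaultdict, deque

def find_rings(records, min_size=2):
    """Group accounts connected through shared identifiers.

    `records` is an iterable of (account_id, id_type, id_value) tuples.
    Assumes no identifier type is literally named "account".
    """
    adjacency = defaultdict(set)
    for account, id_type, id_value in records:
        account_node = ("account", account)
        identifier_node = (id_type, id_value)
        adjacency[account_node].add(identifier_node)
        adjacency[identifier_node].add(account_node)

    seen = set()
    rings = []
    for start in list(adjacency):
        if start in seen or start[0] != "account":
            continue
        # Breadth-first search collects one connected component.
        seen.add(start)
        queue = deque([start])
        members = set()
        while queue:
            node = queue.popleft()
            if node[0] == "account":
                members.add(node[1])
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if len(members) >= min_size:
            rings.append(sorted(members))
    return sorted(rings)

# Made-up records: A1 to A3 are chained by a shared phone and a shared email.
records = [
    ("A1", "phone", "555-0100"),
    ("A1", "address", "12 Elm St"),
    ("A2", "phone", "555-0100"),
    ("A2", "email", "ring@example.com"),
    ("A3", "email", "ring@example.com"),
    ("A4", "phone", "555-0199"),
]
rings = find_rings(records)
print(rings)
```

The traversal does not need to know in advance how many hops separate ring members. A graph database offers the same kind of traversal as a query over stored data.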
Social Network Analysis (SNA) for Robust Fraud Detection
When we think of social network analysis (SNA), the first thing that comes to mind is social media. But SNA goes beyond Facebook, Twitter, LinkedIn, or Google Plus. A social network is a network of entities that are all connected in a particular way. The entities can be credit cards, companies, merchants, fraudsters, or others. The network can draw on transactional data (such as online transactions and banking data), social media data, call behavior data, IP address information, geospatial data, and so on.
This data is often stored in unstructured formats in environments like social media, telecom registries, payment gateways, or bank servers. The good news is that methods exist to probe such large networks of relationships and establish suspicious patterns of behavior. Graph database technology has been developed specifically to work with big datasets that are rich in connections and relationships. Storing and retrieving interconnected information in a native ‘network graph’ format supports interactive network visualizations that help discover hidden structures, locate clusters and patterns, identify links in transaction chains, and apply specialized algorithms to identify suspicious patterns.
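As one example of identifying links in a transaction chain, the sketch below finds the shortest sequence of transfers connecting two accounts with a breadth-first search. The function `trace_chain`, the account names, and the transfer list are hypothetical, and a graph database would typically expose this as a built-in path query rather than hand-written code.

```python
from collections import deque

def trace_chain(transfers, source, target):
    """Return the shortest chain of accounts from `source` to `target`
    following directed (sender, receiver) transfers, or None if no chain exists.
    """
    outgoing = {}
    for sender, receiver in transfers:
        outgoing.setdefault(sender, []).append(receiver)

    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        account = queue.popleft()
        if account == target:
            # Walk the back-pointers to rebuild the chain.
            chain = []
            while account is not None:
                chain.append(account)
                account = previous[account]
            return chain[::-1]
        for nxt in outgoing.get(account, []):
            if nxt not in previous:
                previous[nxt] = account
                queue.append(nxt)
    return None

# Made-up transfers: funds move from a victim through two intermediaries.
transfers = [
    ("victim", "mule1"),
    ("mule1", "mule2"),
    ("mule2", "cashout"),
    ("victim", "shop"),
    ("mule1", "victim"),
]
chain = trace_chain(transfers, "victim", "cashout")
print(chain)
```
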
NoSQL graph databases store and retrieve data in a native network format. Neo4j is a market-leading graph database that can be implemented rapidly and is highly scalable. Advanced analytics methods such as machine learning are already applied to detect fraudulent transactions. Used alongside such methods, SNA with graph databases can significantly reduce the false positive rate in fraud detection.
About the Author
Angshuman Talukdar, Software Engineer - Big Data Services