
Client

Our client, a prominent financial institution, approached us with concerns regarding fraud detection. As a financial entity, they handle large volumes of email communications daily, making it essential to have efficient tools to detect and prevent fraudulent activities. The client required a robust solution to streamline the analysis of their PST (Personal Storage Table) files, allowing their auditors to quickly identify and act upon suspicious activities within email communications.

Challenge

Traditional methods for accessing and analyzing PST files are cumbersome and time-consuming for auditors. Reading the file is the first step in any data analysis workflow, and it is expected to be straightforward.

However, users often spend significant time developing a reader from scratch. This inefficiency limits auditors’ ability to concentrate on data analysis, particularly when investigating potential fraudulent activities such as phishing scams and business email compromise fraud.

Solution

At Redfield, we addressed the challenges of managing PST files by introducing the KNIME PST Reader. This node allows users to focus on data analysis rather than the technical difficulties of accessing and reading the data.

How We Did It

PST files often contain crucial information for detecting fraudulent activities. In this case study, we use Large Language Models (LLMs) to analyze emails for suspicious activities and identify anomalies that may indicate fraud. The process includes ingesting data into Elasticsearch and creating a basic knowledge graph using Kibana.
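To illustrate the ingestion step, here is a minimal Python sketch of how email records could be serialized for the Elasticsearch bulk endpoint. It is not the workflow used in this case study, which was built in KNIME. The index name `enron-emails`, the field names, and the helper `build_bulk_payload` are hypothetical. The sketch only builds the newline-delimited JSON body and does not contact a cluster.

```python
import json


def build_bulk_payload(emails, index_name="enron-emails"):
    """Serialize email dicts into the NDJSON body used by the Elasticsearch _bulk endpoint.

    index_name and the email fields are assumptions made for illustration.
    """
    lines = []
    for email in emails:
        # Action line: index the document on the following line into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(email))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"


# Made-up sample records, not taken from the Enron dataset.
emails = [
    {"sender": "a@example.com", "subject": "Q3 numbers", "body": "See attached."},
    {"sender": "b@example.com", "subject": "Urgent wire", "body": "Please confirm today."},
]
payload = build_bulk_payload(emails)
print(payload)
```

The resulting string could be sent to a cluster's `_bulk` endpoint with the `application/x-ndjson` content type. Once indexed, the documents are available to Kibana for exploration.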

PST Reader Implementation

We used a PST version of the Enron dataset, which became public during a federal investigation into Enron’s corporate fraud in 2002. The dataset includes tens of thousands of emails and detailed financial data for top executives. The PST Reader can be installed via the KNIME Hub or the KNIME Extension Manager.

Fraud Detection

The advent of large language models (LLMs) has brought significant advancements in Generative AI, offering effective solutions for fraud detection. These models can analyze large volumes of textual data, such as emails and reports, to identify suspicious activities and potential fraud indicators.

Traditional Methods vs. LLMs

Traditional fraud detection relies on predefined rules based on known patterns of fraudulent behavior. While these rules are easy to understand and explain, they often struggle to adapt to new or evolving types of fraud. In contrast, LLMs can learn from new data and continuously update their understanding of fraudulent activities, enabling them to detect emerging fraud patterns without human intervention. However, the effectiveness of LLMs depends on the quality of their training data, and ensuring unbiased data can be challenging.
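The rule-based approach described above can be sketched in a few lines of Python. The rule names, patterns, and threshold below are hypothetical and chosen only for illustration; they are not the rules used by the client.

```python
import re

# Hypothetical rules: each maps a label to a pattern of known fraud wording.
RULES = {
    "urgency": re.compile(r"\b(urgent|immediately|right away)\b", re.IGNORECASE),
    "payment_request": re.compile(r"\b(wire transfer|gift cards?|bank details)\b", re.IGNORECASE),
    "secrecy": re.compile(r"\b(confidential|do not tell|keep this between us)\b", re.IGNORECASE),
}


def matched_rules(text):
    """Return the names of all rules whose pattern occurs in the text."""
    return [name for name, pattern in RULES.items() if pattern.search(text)]


def is_suspicious(text, min_hits=2):
    """Flag a message when at least min_hits rules match (threshold is an assumption)."""
    return len(matched_rules(text)) >= min_hits


known_pattern = "Urgent: please send a wire transfer today"
reworded = "Kindly settle the invoice via the new account before close of business"

print(matched_rules(known_pattern), is_suspicious(known_pattern))
print(matched_rules(reworded), is_suspicious(reworded))
```

The first message matches two rules and is flagged. The second asks for much the same thing in different words and matches none, which is the adaptability gap that motivates trying an LLM.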

Fraud Detection Using GPT

We employed OpenAI’s GPT-3.5-turbo-instruct model for fraud detection. Using the OpenAI Authenticator and OpenAI Connector, we authenticated and connected to the LLM.
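Outside KNIME, the same idea could be approximated with a short Python script. The sketch below is an assumption-laden illustration, not the workflow from this case study: the prompt wording, the SUSPICIOUS/CLEAN labels, and the helpers `build_prompt`, `parse_label`, and `classify` are all hypothetical. The model call is passed in as a function so the logic can be tried without network access. `openai_complete` shows one possible wiring, assuming the `openai` Python package (version 1 or later) is installed and an API key is set in the environment.

```python
# Hypothetical prompt; the case study does not publish the prompt it used.
PROMPT_TEMPLATE = (
    "You are assisting a fraud auditor. Read the email below and answer with "
    "exactly one word: SUSPICIOUS or CLEAN.\n\n"
    "Subject: {subject}\n"
    "Body: {body}\n\n"
    "Answer:"
)


def build_prompt(subject, body):
    """Fill the template with one email's subject and body."""
    return PROMPT_TEMPLATE.format(subject=subject, body=body)


def parse_label(completion_text):
    """Return True only if the model's answer starts with SUSPICIOUS."""
    return completion_text.strip().upper().startswith("SUSPICIOUS")


def classify(email, complete):
    """Classify one email; complete is any function mapping a prompt to text."""
    return parse_label(complete(build_prompt(email["subject"], email["body"])))


def openai_complete(prompt):
    """One possible backend (assumption): the OpenAI completions endpoint."""
    from openai import OpenAI  # imported lazily so the rest runs without the package

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.completions.create(
        model="gpt-3.5-turbo-instruct",
        prompt=prompt,
        max_tokens=5,
        temperature=0,
    )
    return response.choices[0].text


# Offline demonstration with a stand-in for the model.
def fake_complete(prompt):
    return " Suspicious\n"


sample = {"subject": "Termination payments", "body": "Please confirm the amounts."}
print(classify(sample, fake_complete))
```

Injecting the completion function keeps the prompt construction and answer parsing independent of any particular provider. Passing `openai_complete` instead of `fake_complete` would send the prompt to the model named in this section.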

Results

GPT flagged 180 messages as suspicious or showing indicators of fraud. 

For instance, one email mentioned a bankruptcy filing and requested confirmation of termination payments to numerous counterparties. Some of the amounts appeared unusually high, so the message warranted further investigation.

Redfield offers a range of services and expertise in data analysis, fraud detection, and workflow optimization. We are ready to assist any organization facing similar challenges, so that they can efficiently manage and analyze their data. Reach out to us for more information!
