A leading joint-stock retail and commercial bank has been exploring ways to identify fraud risks associated with its debit cards. The bank has issued 170 million cards, and every day these cards conduct tens of millions of transactions involving 260 million cards and accounts (including 90 million cards or merchant POS accounts issued by other financial institutions). Its legacy transaction anti-fraud system was built on top of the Apache Spark GraphX framework, which has three major drawbacks:
The average latency for transaction fraud detection is over 300 ms, and because online transactions are highly concurrent, requests pile up while waiting for a decision. Online fraud detection must be real-time: every extra second taken to respond means a degraded user experience and friction for trusted users.
Some transaction counterparties have tens of thousands of transactions a day. These counterparties are considered supernodes, and their associated transactions have to be removed in the Spark system to allow faster transactional behavior analysis, which leaves the transaction network incomplete (and the decisions made on it potentially inaccurate).
Spark can only process static data after it has gone through ETL, but transactions keep flowing in. Even though some vendors claim to use Spark as a real-time data warehouse, it simply is not one.
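To make the second drawback concrete, here is a minimal sketch of how pruning supernodes hides relationships. The toy edge list, the names `reachable_within` and `drop_supernodes`, and the degree threshold are all hypothetical and chosen for illustration; they are not taken from the bank's system or from Spark.

```python
from collections import deque

def reachable_within(edges, start, max_hops):
    """Return the set of nodes reachable from start in at most max_hops hops."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

def drop_supernodes(edges, max_degree):
    """Remove every edge that touches a node whose degree exceeds max_degree."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    hot = {node for node, d in degree.items() if d > max_degree}
    return [(a, b) for a, b in edges if a not in hot and b not in hot]

# Hypothetical toy data: four cards transact at one busy POS terminal.
edges = [
    ("card_A", "pos_1"), ("card_B", "pos_1"),
    ("card_C", "pos_1"), ("card_D", "pos_1"),
    ("card_A", "card_E"),
]

full = reachable_within(edges, "card_A", 2)
pruned = reachable_within(drop_supernodes(edges, max_degree=3), "card_A", 2)

print(sorted(full))    # card_A reaches the other cards through pos_1
print(sorted(pruned))  # with pos_1 removed, only card_E remains connected
```

In this toy graph, the two-hop neighborhood of `card_A` shrinks from six nodes to two once `pos_1` is pruned, so any model that relies on shared-merchant links would miss them.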
The bank understood that it needed a graph database for real-time fraud detection on card transactions. It evaluated Neo4j Enterprise Edition and found that, although it is 30% faster than Spark, it still cannot handle supernodes and cannot meet the bank's target of 30 ms latency for each transaction analysis.
Ultipa Graph helped the bank's IT department set up a 4-instance cluster in hours (with 1-click fast deployment), and the team developed its in-house graph-based fraud detection solution in weeks by integrating with Ultipa's Java SDK.
Louvain Community Detection for Fraud-Detection
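Louvain community detection works by greedily moving nodes between communities to increase a score called modularity. The sketch below computes only that score for a given partition of an unweighted, undirected graph; it is not a full Louvain implementation and not Ultipa's algorithm. The function name `modularity` and the toy graph are assumptions made for illustration.

```python
def modularity(edges, community_of):
    """Modularity Q = sum over communities of (L_c / m) - (d_c / 2m)^2,
    where m is the edge count, L_c the number of edges inside community c,
    and d_c the total degree of the nodes in community c."""
    m = len(edges)
    intra = {}       # edges with both endpoints in the same community
    degree_sum = {}  # total degree per community
    for a, b in edges:
        ca, cb = community_of[a], community_of[b]
        degree_sum[ca] = degree_sum.get(ca, 0) + 1
        degree_sum[cb] = degree_sum.get(cb, 0) + 1
        if ca == cb:
            intra[ca] = intra.get(ca, 0) + 1
    return sum(
        intra.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical toy graph: two triangles of accounts joined by a single edge.
edges = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("d", "e"), ("e", "f"), ("f", "d"),
    ("c", "d"),
]

split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(q_split, q_merged)
```

The partition that separates the two triangles scores higher than lumping every account into one community, which is the signal a Louvain-style search follows when grouping tightly connected accounts.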
|RTD System Comparison||Ultipa Graph (v3.2)||Neo4j Enterprise Edition (v4.x)||Apache Spark + GraphX|
|Cluster Size (instances)||4 (HTAP)||3 Hot-standby||12|
|Vertices (Million)||260||260||260 minus all hotspots (e.g., POS)|
|Transactions (Million)||1,000||100||100 minus hotspot supernodes|
|Data Volume (Days)||90||7||7|
|Avg. Latency (Millisecond)||20||>200||>300|
|Graph Models Processed (Per-Transaction)||>20||<10||<5|
|Hotspot Nodes?||Yes||No (Traversal of supernodes will be extremely slow)||No (supernodes must be removed first)|
Behold the power of real-time graph computing!
For the first time ever, the bank's IT department sees a complete card transaction behavior network with all supernodes included, and no backlog piles up when transactions rush in during peak hours, as it would with Spark or Neo4j. The ROI and performance gain ranges from hundreds of times up to 1,000 times. Zooming into each transaction's decision-making, each model takes only 0.1 millisecond (100 microseconds). That is the beauty of real-time graph computing, only from Ultipa.
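The link between latency and backlog can be sketched with Little's law: the average number of requests in flight equals the arrival rate multiplied by the time each request spends in the system. The latencies below come from the comparison table (the Neo4j and Spark figures are lower bounds, since the table lists them as ">200" and ">300"). The peak rate of 2,000 transactions per second is a hypothetical value chosen for illustration, not a figure reported by the bank.

```python
def in_flight(arrival_rate_tps, latency_ms):
    """Little's law: average requests in the system = arrival rate x latency."""
    return arrival_rate_tps * latency_ms / 1000

PEAK_TPS = 2000  # hypothetical peak arrival rate, not from the case study

# Average latencies in milliseconds, taken from the comparison table.
LATENCY_MS = {
    "Ultipa Graph": 20,
    "Neo4j Enterprise": 200,  # table says >200, so this is a lower bound
    "Spark + GraphX": 300,    # table says >300, so this is a lower bound
}

backlog = {name: in_flight(PEAK_TPS, ms) for name, ms in LATENCY_MS.items()}

for name, waiting in backlog.items():
    print(f"{name}: about {waiting:.0f} requests in flight")
```

Under these assumptions a 20 ms system holds about 40 requests in flight at peak, while a 300 ms system holds at least 600, which is why higher latency translates directly into queued transactions.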