The Evolution of Graph Data Structures
By Ricky Sun
Graphs are used to represent real-world applications, especially when these applications are best represented in the form of networks, from road networks, telephone or circuit networks, power grids, and social networks to financial transaction networks. If you have not worked in a relevant domain, you may be surprised how widely graph technologies are used. To name a few top-notch tech giants that live on graphs:
- Google: PageRank is a large-scale web-page (or URL if you will) ranking algorithm, which got its name from Google’s founder Larry Page.
- Facebook: The core feature of Facebook is its Social Graph, the last thing that it will ever open-source will be it. It's all about Friends-of-Friends-of-Friends, and if you have heard of the Six-Degree-of-Separation theory, yes, Facebook builds a huge network of friends, and for any two people to connect, the hops in between won't be exceeding 5 or 6.
- Twitter: Twitter is the American (or world-wide) edition of Chinese Weibo (and you can say the same thing that Weibo is the Chinese edition of Twitter), it ever open-sourced FlockDB in 2014, but soon abandoned it on Github. The reason is simple, though most of you open-source aficionados find it difficult to digest, that is, graph is the backbone of Twitter's core business, and open-source it simply makes no business sense!
- LinkedIn: LinkedIn is a professional social network, one of the core social features it provides is to recommend a professional that's either 2 or 3-hop away from you, and this is only made possible by powering the recommendation using a graph computing engine (or database).
- Goldman Sachs: If you recall the last world-wide financial crisis in 2007-2008, Lehman Brothers went bankruptcy, and the initial lead was Goldman Sachs withdrawing deals with L.Brothers, the reason for the withdrawal was that Goldman employs a powerful in-house graph DB system – SecDB, which was able to calculate and predict the imminent bubble-burst.
- Paypal, eBay and many other BFSI or eCommerce players: Graph computing is NOT uncommon to these tech-driven new era Internet or Fintech companies – the core competency of graph is that it helps reveal correlations or connectivities that are NOT possible or too slow with traditional relational databases or big-data technologies which were not designed to handle deep connection findings.
The below diagram (Diagram-0) shows a typical social graph network. It was dynamically generated as an instant result of a real-time path computing query against a large graph dataset. The green node is the starting node and the purple node is the ending node, there are 15 hops in between the pair of nodes, and over 100 paths are found in between. Along each path, there are different types of edges connecting the adjacent nodes, with edges colored differently to indicate different types of social relationships.
Diagram-0: A Typical Social Graph