DESCRIPTION
Graph data structures are nothing more than representations of the relationships between entities. Although graph data tends to be intuitively understandable, graph algorithms must be extremely powerful and scalable to manage the nearly incalculable potential relationships within large data sets. To process graph data efficiently, an equally powerful graph processing framework like Apache Giraph is essential. Apache Giraph supplies many of the algorithms needed to draw conclusions from graph data, and it can also be used to design custom graph algorithms. Whether the goal is to identify patterns in social data, optimize the traffic on a network, or analyze any set of highly connected data, Giraph has the tools that allow users to focus on the meaning of data instead of the chore of processing it.
Giraph in Action is a comprehensive guide that teaches the application of the Apache Giraph programming model to real-world graph data examples. It starts by showing how to mine graph data using the most straightforward algorithms. Then, it dives into the Giraph architecture and the main APIs as readers discover how to model and process more complex scenarios. Along the way, it offers techniques for handling data from disparate sources, swapping data in and out of memory, and running Giraph in the cloud.
RETAIL SELLING POINTS
Unlocks the value of big graph data
Practical, real-world examples of large-scale graph processing
Advanced graph features
AUDIENCE
The book assumes that you understand the setup of Hadoop clusters and that you're familiar with Java or another OO language. No experience with graph theory or graph algorithms is required.
ABOUT THE TECHNOLOGY
Giraph is a technology for running analytics over very large graphs, such as the internet, social networks, and bank transactions. It supports computations such as finding the most influential pages on the web, identifying communities inside a social network, and detecting fraud in bank transactions. It is also widely used in academic research to analyse graphs for scientific computing, or as a base for the study of large-scale graph processing.