Publisher: Apress Pages: 400 Format: Paperback Edition: 1st ed. 2016 Publication year: 2016, 08.06.2016 Language: English
Data processing is one of the core functions of distributed and cloud computing. Data processing engines are expected to deliver low latency and high performance while supporting higher-level processing abstractions such as SQL querying, analytic frameworks, and graph processing.
The Definitive Guide to Apache Flink by Papp starts with the history of Big Data processing with Hadoop and explains the shortcomings of MapReduce. It shows how YARN and Hadoop 2.x changed the game and how new technologies began competing to succeed MapReduce.
After a detailed look at Tez and Spark and how they try to address the shortcomings of MapReduce, the book covers architectural patterns for building a solid data processing engine, such as advanced pipelining methods and in-memory caching, and shows how Flink uses these concepts.
Flink programming is introduced with a hands-on approach. The book starts with how to create a ten-minute build and run a first "Word Count" with Flink, then continues with more advanced topics such as writing more complex programs. All samples are written in Java or Scala.
The book shows that Apache Flink has the potential to become one of the key technologies for distributed computing. Flink aims to replace many small technologies with a single, more powerful one that covers many aspects of Hadoop programming.