The advent of Big Data analytics and cloud computing has resulted in an unprecedented increase in the demand for distributed data storage. Companies are constantly on the lookout for ways of reducing the cost of this storage and improving its reliability. Erasure coding has emerged as a promising technique to achieve these goals, and most major tech companies have adopted it. However, one major issue in such systems is the characterization and optimization of access latency when data objects are erasure coded in distributed storage.
In this monograph, the authors provide a review of recent theoretical and practical progress on systems that employ erasure codes for distributed storage. Starting with an overview of the key challenges and research problems, the authors survey the different models and approaches that have been developed to quantify the latency of erasure-coded storage. They also extend the discussion to video streaming from erasure-coded distributed storage systems. Practical implementations of erasure-coded storage are then discussed in real-world storage systems, such as those for content delivery and caching.
This monograph is aimed at students, researchers and practitioners in information theory who are active in the research and development of modern-day distributed storage systems.