Become an expert at using Python for advanced statistical analysis of data through real-world examples
About This Book
• Clean, format, and explore data using graphical and numerical summaries
• Leverage the IPython environment to efficiently analyze data with Python
• Packed with easy-to-follow examples to develop advanced computational skills for the analysis of complex data
Who This Book Is For
If you are a competent Python developer who wants to take your data analysis skills to the next level by solving complex problems, then this advanced guide is for you. Familiarity with the basics of applying Python libraries to data sets is assumed.
What You Will Learn
• Read, sort, and map various data into Python and Pandas
• Recognize patterns so you can understand and explore data
• Use statistical models to discover patterns in data
• Review classical statistical inference using Python, Pandas, and SciPy
• Detect similarities and differences in data with clustering
• Clean your data to make it useful
• Work in Jupyter Notebook to produce publication-ready figures for inclusion in reports
In Detail
Python, a multi-paradigm programming language, has become the language of choice for data scientists for data analysis, visualization, and machine learning. Have you ever wondered how to become an expert at approaching data analysis problems, solving them, and extracting all of the available information from your data? Look no further: this is the book you want!
Through this comprehensive guide, you will explore data and present the results and conclusions of statistical analysis in a meaningful way. You'll be able to sort, reduce, and analyze data quickly and accurately, and you'll see how data analysis methods can support business decision-making.
You'll start off by learning about the tools available for data analysis in Python and will then explore the statistical models that are used to identify patterns in data. Gradually, you'll move on to review statistical inference using Python, Pandas, and SciPy. After that, we'll focus on performing regression using computational tools and you'll get to understand the problem of identifying clusters in data in an algorithmic way. Finally, we delve into advanced techniques to quantify cause and effect using Bayesian methods and you'll discover how to use Python's tools for supervised machine learning.
Style and Approach
This book takes a step-by-step approach to reading, processing, and analyzing data in Python using various methods and tools. It is rich in examples: each topic connects to a real-world problem and, where possible, retrieves data directly from online sources. With this book, you gain the knowledge and tools to explore any data on your own, and are encouraged to develop the curiosity that befits every data scientist.