Data Analysis With Python Course Summary

Working with data is an extremely important skill that everyone should have nowadays. Even if you don’t have a programming background, it’s essential to be able to download a dataset from the internet, load and represent it as a table, normalize it, and then apply some analysis. This is a basic skill for navigating the world of data.

I recently completed the Data Analysis with Python course from IBM on Coursera. It was seven weeks of learning the different stages of working with data: acquisition, representation, normalization, evaluation, and prediction. Both the knowledge and the practice were valuable.

Of course, Excel is still the go-to tool for most of this stuff. You can easily create a spreadsheet, download a JSON/CSV file or create a new one, and then do anything from filtering, sorting, and pivot tables to making complex calculations and drawing graphs. So why should people who are familiar with neither data science nor programming do that in Python? I believe they should.

Here are my thoughts about the new skills I have gained.

Data Analysis With Python

Data acquisition is extremely easy in Python, thanks to the pandas library. That goes not just for the simplest file loading, but also for downloading and reading JSON/CSV/Excel files from the web, with only a single line of code.
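To give a feel for that single line, here is a minimal sketch. The column names and values are made up for illustration, and the data is read from an in-memory string so the example runs without a network connection. As far as I know, the same `pd.read_csv` call also accepts a local path or a URL.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a downloaded file.
csv_text = "make,price\naudi,13950\nbmw,16430\nmazda,5195\n"

# The single line that does the loading. Passing a file path or a URL
# string instead of the StringIO object should work the same way.
df = pd.read_csv(io.StringIO(csv_text))

# Quick look at the first rows of the resulting table.
print(df.head())
```

pandas has sibling readers such as `pd.read_json` and `pd.read_excel` for the other formats mentioned above.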

Automation. At the end of the day, in Python we are writing code. All the operations are expressed as a sequence of statements. Of course, Excel has something similar in macros, which can also be extended programmatically, but in Python coding is the only way to work. Thanks to Jupyter notebooks, it is also the easiest way to code and check yourself at each step by running a single line or a block of code.

Visualization. Python provides a lot of libraries to visualize data. The course covers the main ones: matplotlib and seaborn. Creating most charts requires no more than 6–8 lines of code (depending on customizations).
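As a rough illustration of how short such a chart can be, here is a scatter plot sketch in matplotlib. The numbers are invented for the example and are not taken from the course dataset. The `Agg` backend and the in-memory buffer are only there so the script runs without a display; in a notebook you would normally just let the figure render.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Made-up sample data for illustration only.
engine_size = [97, 109, 130, 152, 181]
price = [6500, 8900, 13500, 16500, 21000]

fig, ax = plt.subplots()
ax.scatter(engine_size, price)
ax.set_xlabel("Engine size")
ax.set_ylabel("Price")
ax.set_title("Price vs. engine size (made-up data)")

# Render to an in-memory PNG; pass a filename instead to write to disk.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```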

Prediction. Making decisions based on data is the core of any business today. If you want to go further and create a prediction model for your data, Python is the best place to do it. Starting from week 4, you learn how to create basic linear regression models and how to visualize, measure, predict, and automate them. The course also covers several core machine learning topics: model evaluation, refinement, overfitting and underfitting, ridge regression, and pipelines.
This course does not cover deep learning.
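To show how those pieces fit together, here is a small sketch using scikit-learn. It is not the course's own code: the data is synthetic, and the step names, polynomial degree, and `alpha` value are arbitrary choices for illustration. It chains scaling, polynomial features, and ridge regression in a pipeline, then compares R² on the training and test splits, which is one simple way to spot overfitting.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data: a noisy linear relationship (assumption for the demo).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 1.0, size=200)

# Hold out a test set so the model is evaluated on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Pipeline: scale the feature, expand it polynomially, fit ridge regression.
model = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("ridge", Ridge(alpha=1.0)),
])
model.fit(X_train, y_train)

# A large gap between these two scores would hint at overfitting.
r2_train = model.score(X_train, y_train)
r2_test = model.score(X_test, y_test)
print(f"R^2 train: {r2_train:.3f}, R^2 test: {r2_test:.3f}")
```

Raising `alpha` makes the ridge penalty stronger, which is the usual knob to turn when the training score is much higher than the test score.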

In summary, I would say that it’s not a question of which tool is better or worse. What matters is being able to use different tools for working with data.
