Jupyter Notebook is by far my all-time favorite tool. It is the go-to tool for data exploration for many data scientists and data analysts.
Before we get started, let’s answer the question “what is
The goal of the platform is to make collaboration easier for teams on large projects. Git increases speed, integrity, and workflow efficiency. Git is used by more than 1,700 companies around the world (through different Git platforms such as GitHub and GitLab).
Pandas is one of the most widely used Python libraries in data science and analytics. It is a large library with immense capabilities. In this introductory blog post on Pandas, I will cover the basics of its famous data structure, the DataFrame, and some basic data wrangling techniques.
K-Means clustering is one of the most popular unsupervised machine learning algorithms. It is a clustering algorithm, meaning its purpose is to group unlabeled data by shared qualities and characteristics.
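To make the idea of grouping unlabeled data concrete, here is a minimal sketch of the standard K-Means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) in plain Python. The function name `kmeans`, its `iterations` and `seed` parameters, the random choice of starting centroids, and the sample points are all assumptions made for illustration; in practice you would more likely reach for a library implementation.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Group equal-length numeric tuples into k clusters (illustrative sketch)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct points picked at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: label each point with the index of its nearest
        # centroid, using squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical sample data: two visibly separate groups of 2-D points.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Note that the cluster numbers themselves are arbitrary: which group ends up as cluster 0 depends on the starting centroids, so only the grouping is meaningful.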
There are hundreds of Python libraries that aim to make life easier for data scientists. Some are good and some are bad; some are large libraries covering many areas, and some do only a couple of things very well. Here is a list of 5 Python libraries that every data scientist should have installed in their environment.