Quick Data Exploration with pandas-profiling:

Last week I came across a Python package called pandas-profiling. Until then I had no idea how powerful it is: it is very easy to use and gives you a first look at your data with almost no effort. Its purpose is to automate much of the descriptive analysis that many data scientists do when they first dive into a new dataset. Because it is so easy to use, I think it is a strong starting point for anyone faced with a dataset and an open-ended analysis task.

The pandas df.describe() function is great, but it is a little basic for serious exploratory data analysis.
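To make "a little basic" concrete, here is a minimal sketch of what df.describe() returns. The tiny DataFrame and its column names are made up purely for illustration. By default only numeric columns are summarized, and even with include="all" you get one table of summary statistics, with no histograms or correlations.

```python
import pandas as pd

# Tiny made-up dataset, used only to illustrate describe()
df = pd.DataFrame({
    "age": [23, 35, 41, 29],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
})

# Default call: numeric columns only (count, mean, std, min, quartiles, max)
numeric_summary = df.describe()

# include="all" adds count/unique/top/freq for non-numeric columns
full_summary = df.describe(include="all")

print(numeric_summary)
print(full_summary)
```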

Let’s run through a quick demo to see how it works.

Installation:
You can install using the pip package manager by running

pip install pandas-profiling

Using conda:
You can install using the conda package manager by running

conda install pandas-profiling

Usage:
The profile report is written in HTML5 and CSS3, which means pandas-profiling requires a modern browser.

Jupyter Notebook:
Start by loading in your pandas DataFrame, e.g. by using

import pandas as pd
import pandas_profiling

# Load the data to profile; hello.csv is a placeholder file name
df = pd.read_csv("hello.csv", parse_dates=True, encoding='UTF-8')

To display the report in a Jupyter notebook, run:

pandas_profiling.ProfileReport(df)

If you want to generate an HTML report file, save the ProfileReport to an object and call its to_file() method:

profile = pandas_profiling.ProfileReport(df)
profile.to_file(outputfile="/tmp/myoutputfile.html")
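The steps above can be wrapped into one small helper. This is only a sketch under a few assumptions: the function names report_path_for and profile_csv are mine, not part of pandas-profiling, and the report is saved next to the CSV with a "_profile.html" suffix. The output path is passed to to_file() positionally, which avoids relying on the exact spelling of the keyword argument in case it differs between releases.

```python
from pathlib import Path


def report_path_for(csv_path):
    """Return the HTML report path beside the CSV (hypothetical naming scheme).

    Example: data/hello.csv -> data/hello_profile.html
    """
    csv_path = Path(csv_path)
    return csv_path.with_name(csv_path.stem + "_profile.html")


def profile_csv(csv_path):
    """Load a CSV, build a profile report, and save it as HTML beside the CSV."""
    # Imported here so the path helper above still works on machines
    # where these heavier packages are not installed.
    import pandas as pd
    import pandas_profiling

    df = pd.read_csv(csv_path, encoding="UTF-8")
    profile = pandas_profiling.ProfileReport(df)

    output = report_path_for(csv_path)
    # Assumption: passing the path positionally as a plain string works
    # regardless of how the keyword argument is named in your version.
    profile.to_file(str(output))
    return output
```

Calling profile_csv("hello.csv") should then write hello_profile.html in the same folder, which you can open in your browser.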

Dependencies:
An internet connection: pandas-profiling requires one to download the Bootstrap and jQuery libraries.
(The package's README notes that this may change in the future.)

python (>= 2.7)
pandas (>=0.19)
matplotlib (>=1.4)
six (>=1.9)
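If you want to check these minimum versions before installing, something like the following could work. It is a rough sketch, not part of pandas-profiling: it assumes Python 3.8 or newer (for importlib.metadata), the helper names are my own, and the version parsing is deliberately simple, stopping at the first non-numeric piece such as "rc1".

```python
from importlib import metadata  # available from Python 3.8

# Minimum versions copied from the dependency list above
MINIMUMS = {"pandas": (0, 19), "matplotlib": (1, 4), "six": (1, 9)}


def parse_version(text):
    """Turn '1.4.3' into (1, 4, 3), stopping at the first non-numeric piece."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_dependencies(minimums=MINIMUMS):
    """Return {package: problem} for packages that are missing or too old."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if parse_version(installed) < minimum:
            wanted = ".".join(str(n) for n in minimum)
            problems[name] = "found %s, need >= %s" % (installed, wanted)
    return problems


print(check_dependencies() or "All listed dependencies look fine")
```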

That’s about all there is to it: a quick and easy way to do a lot of descriptive analysis in a very short amount of time.

In my opinion, pandas-profiling is a must-try, and it is especially powerful when it comes to visualization.

Have fun and Happy learning 🙂

Published by llamasearch
