python: distplot with multiple distributions

I am using seaborn to plot a distribution plot. I would like to plot multiple distributions on the same plot in different colors:

Here's how I start the distribution plot:

import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

sns.distplot(iris[['sepal length (cm)']], hist=False, rug=True)

The 'target' column contains 3 values: 0, 1, 2.

I would like to see one distribution curve for sepal length for each of target == 0, target == 1, and target == 2, for a total of 3 curves on the same plot.

Does anyone know how I do that?

Thank you.

Answers


The important thing is to split the dataframe into subsets where target is 0, 1, or 2, and then plot each subset onto the same axes.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

# Split the dataframe into one subset per target value
target_0 = iris.loc[iris['target'] == 0]
target_1 = iris.loc[iris['target'] == 1]
target_2 = iris.loc[iris['target'] == 2]

# Successive calls draw onto the same axes, each in the next color
sns.distplot(target_0[['sepal length (cm)']], hist=False, rug=True)
sns.distplot(target_1[['sepal length (cm)']], hist=False, rug=True)
sns.distplot(target_2[['sepal length (cm)']], hist=False, rug=True)

# sns.plt is not available in current seaborn; use matplotlib directly
plt.show()

The output is three overlaid density curves, one per target value, each with its own rug marks along the x-axis.

If you don't know how many values target may have, find the unique values in the target column, then slice the dataframe and add to the plot appropriately.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                    columns=iris['feature_names'] + ['target'])

unique_vals = iris['target'].unique()  # array([0., 1., 2.])

# Use a list comprehension to build one sliced dataframe per target value
targets = [iris.loc[iris['target'] == val] for val in unique_vals]

# Iterate through the list and plot each sliced dataframe on the same axes
for target in targets:
    sns.distplot(target[['sepal length (cm)']], hist=False, rug=True)

# sns.plt is not available in current seaborn; use matplotlib directly
plt.show()

A more common approach for this type of problem is to recast your data into long format using melt, and then let FacetGrid.map do the rest.

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
import seaborn as sns

iris = load_iris()
iris = pd.DataFrame(data=np.c_[iris['data'], iris['target']], 
                    columns=iris['feature_names'] + ['target'])

# recast into long format 
df = iris.melt(['target'], var_name='cols',  value_name='vals')

df.head()

   target               cols  vals
0     0.0  sepal length (cm)   5.1
1     0.0  sepal length (cm)   4.9
2     0.0  sepal length (cm)   4.7
3     0.0  sepal length (cm)   4.6
4     0.0  sepal length (cm)   5.0
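
To see what melt does on a smaller scale, here is a sketch on a made-up two-row frame (the values are illustrative, not taken from the iris data). Columns listed in id_vars are kept as they are; every other column is stacked into a pair of name/value columns.

```python
import pandas as pd

# Hypothetical wide-format frame: one column per measurement
wide = pd.DataFrame({
    'target': [0, 1],
    'sepal length (cm)': [5.1, 7.0],
    'sepal width (cm)': [3.5, 3.2],
})

# 'target' stays a column; the two measurement columns become rows,
# with the old column name in 'cols' and the old cell value in 'vals'
long_df = wide.melt(id_vars=['target'], var_name='cols', value_name='vals')
print(long_df)
```

The two-row, three-column input becomes four rows: one per combination of original row and measurement column.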

You can now plot simply by creating a FacetGrid and using map:

g = sns.FacetGrid(df, col='cols', hue="target", palette="Set1")
g = (g.map(sns.distplot, "vals", hist=False, rug=True))

