# pandas count over multiple columns

I have a dataframe that looks like this:

    Measure1  Measure2  Measure3  ...
    0         1         3
    1         3         2
    3         0

I'd like to count the occurrences of the values over the columns to produce:

    Measure  Count  Percentage
    0        2      0.25
    1        2      0.25
    2        1      0.125
    3        3      0.375

With

    outcome_measure_count = cdss_data.groupby(
        key_columns=['Measure1'],
        operations={'count': agg.COUNT()}
    ).sort('count', ascending=True)

I only get the counts for the first column. (I'm actually using the graphlab package, but I'd prefer a pandas solution.)

Could someone help me?

## Answers

You can generate the counts by flattening the df with `ravel` and calling `value_counts` on the result. From that you can construct the final df:

    In [230]:
    import io
    import pandas as pd

    t="""Measure1 Measure2 Measure3
    0 1 3
    1 3 2
    3 0 0"""
    # raw string so the regex separator is not read as a string escape
    df = pd.read_csv(io.StringIO(t), sep=r'\s+')
    df

    Out[230]:
       Measure1  Measure2  Measure3
    0         0         1         3
    1         1         3         2
    2         3         0         0

    In [240]:
    # flatten all columns into one Series, then count each distinct value
    count = pd.Series(df.squeeze().values.ravel()).value_counts()
    pd.DataFrame({'Measure': count.index, 'Count':count.values, 'Percentage':(count/count.sum()).values})

    Out[240]:
       Count  Measure  Percentage
    0      3        3    0.333333
    1      3        0    0.333333
    2      2        1    0.222222
    3      1        2    0.111111

I inserted a 0 just to make the df shape correct, so the counts differ from your expected output, but you should get the point.
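If padding the missing cell with a 0 is not acceptable, the same flattening idea should also work with the cell left as `NaN`, since `value_counts` drops missing values by default. The following is only a sketch under that assumption; the frame is rebuilt by hand from the question's data, and treating the empty cell as `NaN` is my reading of the question rather than something it states.

```python
import numpy as np
import pandas as pd

# Data from the question; the empty cell is assumed to be NaN.
df = pd.DataFrame({
    "Measure1": [0, 1, 3],
    "Measure2": [1, 3, 0],
    "Measure3": [3, 2, np.nan],
})

# Flatten every column into one Series; value_counts skips NaN by default.
counts = pd.Series(df.to_numpy().ravel()).value_counts().sort_index()

# Turn the index (the distinct values) into a regular 'Measure' column.
result = counts.rename_axis("Measure").reset_index(name="Count")
result["Measure"] = result["Measure"].astype(int)  # NaN made the values float
result["Percentage"] = result["Count"] / result["Count"].sum()

print(result)
```

With the missing cell excluded, the counts are taken over 8 values rather than 9, which matches the percentages asked for in the question.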

    # Assumes: import numpy as np; from pandas import DataFrame, Series

    In [68]: df=DataFrame({'m1':[0,1,3], 'm2':[1,3,0], 'm3':[3,2, np.nan]})

    In [69]: df
    Out[69]:
       m1  m2   m3
    0   0   1  3.0
    1   1   3  2.0
    2   3   0  NaN

    # count values per column, then sum the per-column counts row-wise
    In [70]: df=df.apply(Series.value_counts).sum(1).to_frame(name='Count')

    In [71]: df
    Out[71]:
         Count
    0.0    2.0
    1.0    2.0
    2.0    1.0
    3.0    3.0

    In [72]: df.index.name='Measure'

    In [73]: df
    Out[73]:
             Count
    Measure
    0.0        2.0
    1.0        2.0
    2.0        1.0
    3.0        3.0

    In [74]: df['Percentage']=df.Count.div(df.Count.sum())

    In [75]: df
    Out[75]:
             Count  Percentage
    Measure
    0.0        2.0       0.250
    1.0        2.0       0.250
    2.0        1.0       0.125
    3.0        3.0       0.375
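If only some of the columns should be counted, one possible variation is to stack them into a single column with `melt` and let `value_counts(normalize=True)` compute the share directly. This is a sketch, not part of the answer above: `count_values` is a hypothetical helper name, and the column names `m1`, `m2`, `m3` are taken from the example session.

```python
import numpy as np
import pandas as pd


def count_values(frame, columns):
    """Count how often each value occurs across the given columns.

    Hypothetical helper for illustration only.
    """
    # melt stacks the selected columns into one long 'value' column
    values = frame[columns].melt()["value"].dropna()
    count = values.value_counts().sort_index()
    # normalize=True returns each value's share of the non-missing total
    share = values.value_counts(normalize=True).sort_index()
    out = pd.DataFrame({"Count": count, "Percentage": share})
    out.index.name = "Measure"
    return out


df = pd.DataFrame({"m1": [0, 1, 3], "m2": [1, 3, 0], "m3": [3, 2, np.nan]})
summary = count_values(df, ["m1", "m2", "m3"])
print(summary)
```

Because the counts come from a single `value_counts` call rather than a sum over per-column results, the `Count` column stays an integer column here.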