How to loop through a dataframe, create a new column, and append values to it in Python

I have a dataframe with several columns, one of which contains strings. I want to loop through this column, transform each value, and save the transformed values in a new column.

The code I have written so far looks like this:

def get_classes(x):    
    for index, string in df['column'].iteritems():
        listi = string.split(',')
        Classes=[]

        for value in listi:
            count=listi.count(value)
            if count >= 3: 
                Classes.append(value)

        Unique=(',').join(sorted(list(set(Classes))))
        df['NewColumn']=Unique


End.apply(get_classes)

It loops through the rows of df['column'], splits each string at every comma (creating a list called listi), and creates an empty list called Classes. It then counts each value in listi and appends it to Classes if it occurs at least three times in the list. The finished list is passed through set(), so that all values in it are unique, then sorted, and finally joined with commas into a string again. I then want to store this string in a new column, at the same index position as the row it was derived from. For example:

df
  column    NewColumn
0 A,A,A,C   A 
1 C,B,C,C   C
2 B,B,B,B   B

My code seems to work fine when I do print Unique instead of df['NewColumn']=Unique, as it then prints all the transformed values. If I execute the code like in my example however, the NewColumn of the dataframe is completely filled with the same value, which seems to correspond to the original value of the last row in the df. Can someone explain to me what the problem here is?

Answers


You can use the powerful Counter class from the collections module:

from collections import Counter

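# Keep the values that occur at least three times, sorted and joined by commas.
# (.items() works on Python 2 and 3; .iteritems() exists only on Python 2.)
foo = lambda x: ','.join(sorted(k for k, v in Counter(x).items() if v >= 3))

# Split each string into a list, then map the function over every row.
df['new'] = df['column'].str.split(',').map(foo)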


#In [33]: df
#Out[33]:
#    column NewColumn new
#0  A,A,A,C         A   A
#1  C,B,C,C         C   C
#2  B,B,B,B         B   B
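
As for why the original code fills NewColumn with a single value: df['NewColumn']=Unique assigns one string to the entire column, so every pass through the loop overwrites all rows, and only the result computed for the last row survives. The function also never uses its argument x. Below is a minimal sketch of a per-row version, assuming the dataframe from the question. The helper name frequent_classes, its min_count parameter, and the Broadcast column are made up for illustration.

```python
from collections import Counter

import pandas as pd

# Example data taken from the question.
df = pd.DataFrame({"column": ["A,A,A,C", "C,B,C,C", "B,B,B,B"]})

# Assigning a scalar to a column broadcasts it to every row. Doing this inside
# a loop therefore overwrites the whole column on each iteration.
df["Broadcast"] = "X"  # hypothetical column, only here to show the effect


def frequent_classes(string, min_count=3):
    """Return the sorted, comma-joined values occurring at least min_count times."""
    counts = Counter(string.split(","))
    return ",".join(sorted(k for k, v in counts.items() if v >= min_count))


# apply() calls the function once per value and returns a Series aligned on
# the index, so each row receives its own result.
df["NewColumn"] = df["column"].apply(frequent_classes)
print(df)
```

The important change is that the function handles a single string and returns a single result, which leaves the row-by-row assignment to pandas instead of writing to the dataframe from inside the loop.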
