Parsing a CSV file in Python
I am parsing a CSV file whose first line is the header. I want to sum the Amount column by date, but I get an error. To debug, I check whether the value is a string (as the error message suggests) and whether it is a digit, and both checks pass. What could be the reason for this?
def parseDataFromFile(self,f):
    fh = open(f,'r')
    s = 0
    for line in fh:
        #parsing the line according to comma and stripping the '\n' char
        year,month,day,amount = line.strip('\n').split(',')

        #checking the header row, could check if was first row as well - would be faster
        if (amount == "Amount"): continue

        #just for the debug checks
        #here is the question
        if isinstance(amount,str):
            print "amount is a string"
            #continue
        if amount.isdigit:
            print "amount is a digit"

        #sum on the amount column
        s = s + amount
Output:
amount is a string
amount is a digit
amount is a string
amount is a digit
s = s + amount
TypeError: unsupported operand type(s) for +: 'int' and 'str'
Your problem is that s is an integer: you initialize it to 0 and then try to add a string to it. amount is always a string, because nothing in your code converts the number-like text into an actual number.
If you expect amount to be a number, then use:
s += float(amount)
PS: you should use the csv module in the stdlib for reading CSV files.
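A minimal sketch of that suggestion, assuming the columns are Year, Month, Day, Amount in that order and that amounts may contain decimals. The function name sum_amounts_by_date and the sample data are made up for illustration, and the code is written to run on Python 3.

```python
import csv
import io
from collections import defaultdict


def sum_amounts_by_date(lines):
    """Sum the Amount column per (year, month, day).

    `lines` is any iterable of CSV lines, such as an open file.
    """
    totals = defaultdict(float)
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for year, month, day, amount in reader:
        # Every field arrives as a string, so convert before adding.
        totals[(year, month, day)] += float(amount)
    return dict(totals)


# Hypothetical sample data standing in for the real file.
sample = io.StringIO(
    "Year,Month,Day,Amount\n"
    "2012,1,5,10.5\n"
    "2012,1,5,4.5\n"
    "2012,2,1,3\n"
)
result = sum_amounts_by_date(sample)
print(result)
```

With a real file, pass the open file object in place of the sample; on Python 3 the csv documentation recommends opening it with newline=''.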
if amount.isdigit:
    print "amount is a digit"
will always print "amount is a digit" because you're not calling the method: amount.isdigit without parentheses is the method object itself, which is always truthy. It should be if amount.isdigit():.
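A quick way to see the difference, using made-up values: the uncalled method is just an object and counts as true, whereas calling it actually inspects the characters. Note also that isdigit() returns False for strings containing a decimal point or a sign, so it is only a rough check for "looks like a number".

```python
amount = "Amount"  # clearly not a number

# Without parentheses this is the method object itself, which is truthy.
uncalled = bool(amount.isdigit)

# With parentheses the method runs and inspects the characters.
called = amount.isdigit()

print(uncalled)  # True
print(called)    # False

# isdigit() only accepts digit characters, so decimals and signs fail too.
print("12.50".isdigit())  # False
print("-3".isdigit())     # False
print("42".isdigit())     # True
```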
Any field you get by splitting a line from a CSV file is a string, so you need to convert it to an int before adding it:
s = s + int(amount)
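Which conversion fits depends on the data. As a small illustration (the sample strings are invented), int() accepts only whole-number strings, while float() also accepts decimals:

```python
s = 0
s = s + int("12")          # whole-number string converts cleanly
print(s)                   # 12

try:
    s = s + int("12.50")   # int() rejects a decimal string
except ValueError as exc:
    error = str(exc)
    print(error)

s = s + float("12.50")     # float() handles the decimal
print(s)                   # 24.5
```

If the Amount column can hold decimals, float(amount) is the safer choice of the two.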
Something like this? (Assuming the column headers are "Year", "Month", "Day", "Amount".)
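from collections import defaultdict
import csv

# Running total per (year, month); missing keys start at 0.0
sum_by_ym = defaultdict(float)

with open('input_file.csv') as f:
    # DictReader uses the header row as keys, so each row is a dict of strings
    for row in csv.DictReader(f):
        # Convert the string to a number; int() drops any fractional part
        sum_by_ym[(row['Year'], row['Month'])] += int(float(row['Amount']))

print(sum_by_ym)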
s is an int and amount is a string representation of a number, so change s = s + amount to s += int(amount).