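So, I have a csv file that I want to import, and I want to skip both the duplicate rows and the original rows in the csv, based on the user number in the first column. I'm using the StringIO module. The way I'm doing it now is incorrect because, even though it skips the duplicate rows, I believe it still imports the original row. What is the best way to skip both the duplicate rows and the original rows when importing from a csv?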
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO

def csv_import(stream):
    ostream = StringIO()
    headers = stream.readline()
    ostream.write(headers)
    seen_user_numbers = {}
    for row in stream:
        list_row = row.split(',')
        user_number = list_row[0]
        if user_number in seen_user_numbers:
            # The duplicate is skipped here, but the first occurrence has already been written
            seen_user_numbers.pop(user_number)
            continue
        seen_user_numbers[user_number] = True
        ostream.write(row)
    ostream.seek(0)
    return ostream
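
One possible way to drop every row whose user number appears more than once is to make two passes: count the user numbers first, then write only the rows whose number occurs exactly once. The sketch below is only an illustration under a few assumptions: the names `csv_import_unique` and `user_number_of` are made up for this example, the input is small enough to buffer in memory, and the first column never contains a quoted comma (otherwise the `csv` module would be a safer way to split rows).

```python
from collections import Counter
from io import StringIO


def user_number_of(row):
    """Hypothetical helper: first comma-separated field, without surrounding whitespace."""
    return row.split(',', 1)[0].strip()


def csv_import_unique(stream):
    """Hypothetical variant: keep the header plus rows whose user number occurs exactly once."""
    headers = stream.readline()
    # Buffer the data rows so they can be scanned twice; blank lines are ignored.
    rows = [row for row in stream if row.strip()]

    # First pass: count how many times each user number appears.
    counts = Counter(user_number_of(row) for row in rows)

    # Second pass: write only rows whose user number was seen exactly once.
    ostream = StringIO()
    ostream.write(headers)
    for row in rows:
        if counts[user_number_of(row)] == 1:
            ostream.write(row)
    ostream.seek(0)
    return ostream


sample = StringIO("user,name\n1,ann\n2,bob\n1,ann2\n3,cy\n")
print(csv_import_unique(sample).read())
# prints the header, then "2,bob" and "3,cy"; both rows for user 1 are dropped
```

Counting first avoids the problem with a single pass, where the first occurrence has already been written by the time its duplicate shows up. It also handles a user number that appears three or more times, which popping the key from the dictionary would let through again.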