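Removing duplicate fields from certain columns of a CSV

Time: 2019-05-16 16:28:41

Tags: python csv

I need to look for duplicate values after column 8 in each record and, if there are any, remove them. The first two values in each record may be identical and must not be removed.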

import csv

with open("User.csv", newline="") as file:
    reader = csv.reader(file)
    # each row is already a list of column values
    for row in reader:
        print(row)

The code above reads the file and prints it to the console:

['257488', '257488', '3', '1234', '', '1', '1', '', '160000', '', '0', '', '']
['257488', '257488', '3', '1234', '', '1', '1', '', '160000', '', '0', '', '']
['270076', '270076', '2', '1234', '', '1', '1', '', '40000', '270076CASH', '270076CASH', '', '']
['270076', '270076', '2', '1234', '', '1', '1', '', '40000', '270076CASH', '0', '', '']

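Note: the second "270076CASH" value in row 3 above should be removed. The remaining rows should be checked and cleaned up in the same way.

1 Answer:

Answer 0 (score: 0)

One way to solve this is to take all the items after column 8 and keep only the unique ones (checked with a standard for loop and an if statement), then reinsert those unique items into the record after column 8. Note that record[9:] uses zero-based indexing, so the first nine fields are left untouched, and empty fields are always kept. Below is a fairly heavily commented implementation: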

import csv

with open( 'User.csv', newline='' ) as file:
    reader = csv.reader( file )
    for record in reader:

        # store all items after column 8 into a temporary list
        subset_list = record[9:]

        # keep track of unique (non-duplicate) items in the subset
        unique_subset = []

        # loop through every item of the subset
        for item in subset_list:

            # keep the item if it has not been seen yet; empty fields are always kept
            if item not in unique_subset or item == '':
                unique_subset.append( item )

        # after looping through the subset, rebuild the record from the untouched
        # first nine fields plus the unique entries
        new_record = record[:9] + unique_subset

        # verify operations
        print( new_record )
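
The loop above only prints each cleaned record. As a minimal sketch of how the same logic could be wrapped in a function and written back out as CSV, the example below uses two hypothetical helpers, `dedupe_tail` and `dedupe_csv` (these names are not part of the original answer), and runs on an in-memory copy of two rows from the question rather than on User.csv. It assumes, as the answer does, that removing a duplicate shortens the row instead of leaving a blank field in its place.

```python
import csv
import io


def dedupe_tail(record, start=9):
    """Return a copy of record with repeated non-empty values removed from record[start:].

    Hypothetical helper: `start=9` mirrors the record[9:] slice used in the answer.
    """
    seen = set()
    tail = []
    for item in record[start:]:
        # empty fields are always kept; non-empty fields only the first time they appear
        if item == '' or item not in seen:
            tail.append(item)
            seen.add(item)
    return record[:start] + tail


def dedupe_csv(src, dst, start=9):
    """Read rows from the file-like src, clean each one, and write it to dst."""
    writer = csv.writer(dst)
    for record in csv.reader(src):
        writer.writerow(dedupe_tail(record, start))


# In-memory demo using two of the rows shown in the question
sample = (
    "270076,270076,2,1234,,1,1,,40000,270076CASH,270076CASH,,\n"
    "270076,270076,2,1234,,1,1,,40000,270076CASH,0,,\n"
)
out = io.StringIO()
dedupe_csv(io.StringIO(sample), out)
print(out.getvalue())
```

To work on real files, the same `dedupe_csv` call could be given two file objects opened with `newline=''`, for example User.csv for reading and an output file of your choosing for writing. If every row must keep the same number of columns, appending `''` in place of a dropped duplicate would be one possible adjustment.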