The CSV file has two columns:
oldCol1 = [1, 2, 3, 4, 5]
oldCol2 = ['A', 'B', 'C', 'D', 'E']
Now I update the CSV and add a new row:
newCol1 = [1, 2, 3, 4, 5, 6]
newCol2 = ['A', 'B', 'C', 'D', 'E', 'A']
I only want to get the newly added elements. So I am trying:
newListCol1 = list(set(oldCol1).symmetric_difference(newCol1))
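Now, my question is: how do I get the newly added elements from the second column?
# Here, I want to get two lists: [6] and ['A'].
Thanks for your help!
Update:
The newly added elements can be anywhere in the list (not only at the end), which is what causes the confusion!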
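Answer 0 (score: 1)
If you know that the "newly added elements" are always appended to the end of the list, simply slice the new list starting at the length of the old list, i.e.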
newCol1[len(oldCol1):]
Answer 1 (score: 0)
# If the new items can be anywhere
# Method 1
from collections import Counter
oldCol1 = [1, 2, 3, 4, 5]
oldCol2 = ['A', 'B', 'C', 'D', 'E']
newCol1 = [1, 2, 3, 4, 5, 6]
newCol1_1 = [1, 2, 3, 4, 5, 6, 6, 7, 7] #different example
newCol2 = ['A', 'B', 'C', 'D', 'E', 'A']
print(list(Counter(newCol1) - Counter(oldCol1)))  # list of the unique new values
print(list(Counter(newCol2) - Counter(oldCol2)))
new_item_added_dict = Counter(newCol1_1) - Counter(oldCol1)
# elements() returns an iterator that repeats each value by its count.
# Use it if you want every new value, duplicates included,
# i.e. [6, 6, 7, 7] for newCol1_1.
print(list(new_item_added_dict.elements()))
# Use list() if you only want the unique new values, i.e. [6, 7].
print(list(new_item_added_dict))
# output
# [6]
# ['A']
# [6, 6, 7, 7]
# [6, 7]
#---------------------------------------------------------------------
# Method 2
from collections import defaultdict
oldCol1 = [1, 2, 3, 4, 5]
newCol1 = [1, 2, 3, 4, 5, 6] # -->[6]
# [1, 2, 3, 4, 5, 6, 5] --> [6,5]
new_item_list = []
# defaultdict(int) gives missing keys a default of 0, so lookups never raise KeyError
oldlist_dict = defaultdict(int)
for item in oldCol1:
    oldlist_dict[item] += 1
for item in newCol1:
    if item in oldlist_dict and oldlist_dict[item] > 0:
        oldlist_dict[item] -= 1
    else:
        # it's a new item
        new_item_list.append(item)
print(new_item_list)
#---------------------------------------------------------------------
# If new items are always appended, i.e. added to the end of the old list
print(newCol1[len(oldCol1):])
print(newCol2[len(oldCol2):])
print(newCol1_1[len(oldCol1):])
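Since both columns come from the same CSV rows, one possible variation is to apply the same counting idea to whole rows instead of to each column separately. The following is only a sketch under that assumption: the helper name `added_rows` and the variables `old_rows` and `new_rows` are made up for illustration and do not appear in the question.

```python
from collections import Counter

def added_rows(old_rows, new_rows):
    """Return the rows of new_rows that old_rows does not account for.

    Rows are compared as whole tuples and duplicates are counted,
    so a new row may sit anywhere in new_rows.
    """
    remaining = Counter(old_rows)  # how many copies of each old row are still unmatched
    added = []
    for row in new_rows:
        if remaining[row] > 0:
            remaining[row] -= 1    # this row was already present in the old data
        else:
            added.append(row)      # no unmatched old copy left, so it is new
    return added

# Pair the two columns up so that each CSV row becomes one tuple.
old_rows = list(zip([1, 2, 3, 4, 5], ['A', 'B', 'C', 'D', 'E']))
new_rows = list(zip([1, 2, 3, 4, 5, 6], ['A', 'B', 'C', 'D', 'E', 'A']))

added = added_rows(old_rows, new_rows)
added_col1 = [row[0] for row in added]
added_col2 = [row[1] for row in added]
print(added_col1, added_col2)  # [6] ['A']
```

Comparing whole rows keeps the two result lists aligned, and the new 'A' is still reported even though 'A' already exists in the old second column.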
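Answer 2 (score: -1)
You need the indexes of the items that are not present in the first list, so just use a set, without symmetric_difference. Using enumerate() makes it easier to get the indexes.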
oldCol1 = [1, 2, 3, 4, 5]
oldCol2 = ['A', 'B', 'C', 'D', 'E']
newCol1 = [1, 2, 3, 4, 5, 6]
newCol2 = ['A', 'B', 'C', 'D', 'E', 'A']
oldSet = set(oldCol1)  # build the set once instead of on every iteration
indexes = [i for i, v in enumerate(newCol1) if v not in oldSet]
resultCol1 = [newCol1[i] for i in indexes]
resultCol2 = [newCol2[i] for i in indexes]
print(resultCol1, resultCol2)
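One caveat is worth checking before relying on this: the filter looks only at the first column, so it seems to assume that every new row brings a first-column value that was not in the old list. The sketch below uses made-up data (the inserted row with key 3 and value 'X' is purely hypothetical) to show what happens when that assumption does not hold.

```python
old_col1 = [1, 2, 3, 4, 5]

# Hypothetical update: a row (3, 'X') was inserted in the middle,
# reusing a first-column value that already exists.
new_col1 = [1, 2, 3, 3, 4, 5]
new_col2 = ['A', 'B', 'C', 'X', 'D', 'E']

old_keys = set(old_col1)
indexes = [i for i, v in enumerate(new_col1) if v not in old_keys]

result_col1 = [new_col1[i] for i in indexes]
result_col2 = [new_col2[i] for i in indexes]
print(result_col1, result_col2)  # [] []
```

Here the inserted row goes undetected because 3 is already in the old first column. If duplicates like this can occur, a counting approach such as collections.Counter may be a better fit.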