Question

The CSV file has two columns:

oldCol1 = [1, 2, 3, 4, 5]

oldCol2 = ['A', 'B', 'C', 'D', 'E']

Now I update the CSV and add a new row:

newCol1 = [1, 2, 3, 4, 5, 6]

newCol2 = ['A', 'B', 'C', 'D', 'E', 'A']

I only want to get the newly added elements, so I am trying:

newListCol1 = list(set(oldCol1).symmetric_difference(newCol1))

Now, my question is: how do I get the newly added elements from the second column?

#Here, I want to get two lists: [6] and ['A'].
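
To make the difficulty concrete, here is a minimal sketch of what the same set-based attempt does on the second column. The variable name `diff` is only illustrative. Because a set discards duplicates, the appended 'A' is indistinguishable from the 'A' that was already there.

```python
oldCol2 = ['A', 'B', 'C', 'D', 'E']
newCol2 = ['A', 'B', 'C', 'D', 'E', 'A']

# Sets keep one copy of each value, so the second 'A' collapses into the first
# and the symmetric difference comes back empty.
diff = list(set(oldCol2).symmetric_difference(newCol2))
print(diff)  # []
```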

Thanks for your help!

Update:
The newly added elements can be anywhere in the list (not only at the end), which is what causes the confusion!

Answer 1

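If you know that the "newly added elements" are always appended to the end of the list, you can simply slice starting from the length of the old list, i.e.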

[]
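
A minimal sketch of that slicing, using the lists from the question. The names `added1` and `added2` are illustrative, and the approach only holds under the stated assumption that new rows are appended at the end.

```python
oldCol1 = [1, 2, 3, 4, 5]
oldCol2 = ['A', 'B', 'C', 'D', 'E']
newCol1 = [1, 2, 3, 4, 5, 6]
newCol2 = ['A', 'B', 'C', 'D', 'E', 'A']

# Everything past the old length is new, provided rows were only appended.
added1 = newCol1[len(oldCol1):]
added2 = newCol2[len(oldCol2):]
print(added1, added2)  # [6] ['A']
```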

Answer 2

# if the new items can be anywhere in the list

# method 1
from collections import Counter 

oldCol1 = [1, 2, 3, 4, 5]

oldCol2 = ['A', 'B', 'C', 'D', 'E']

newCol1 = [1, 2, 3, 4, 5, 6]

newCol1_1 = [1, 2, 3, 4, 5, 6, 6, 7, 7] #different example

newCol2 = ['A', 'B', 'C', 'D', 'E', 'A']

print(list((Counter(newCol1) - Counter(oldCol1)))) # returns a list of the unique new values
print(list((Counter(newCol2) - Counter(oldCol2))))


new_item_added_dict = Counter(newCol1_1) - Counter(oldCol1)
print( list(new_item_added_dict.elements())) # elements() returns an iterator
# if you want all the new values, including duplicates as in newCol1_1,
# i.e. if you want the answer to be [6, 6, 7, 7], then use elements()

# otherwise use list() if you just want the unique new values [6, 7]
print( list(new_item_added_dict))

 # output
 # [6]
 # ['A']
 # [6, 6, 7, 7]
 # [6, 7]

#--------------------------------------------------------------------- 

#method 2
from collections import defaultdict
oldCol1 = [1, 2, 3, 4, 5]
newCol1 = [1, 2, 3, 4, 5, 6]  # -->[6]
# [1, 2, 3, 4, 5, 6, 5] --> [6,5]

new_item_list = []
oldlist_dict = defaultdict(lambda: 0) # missing keys default to 0, so a defaultdict never raises KeyError

for item in oldCol1:
    oldlist_dict[item] += 1

for item in newCol1:
    if item in oldlist_dict and oldlist_dict[item] > 0:
        # this occurrence is accounted for by the old list
        oldlist_dict[item] -= 1
    else:
        # it's a new item
        new_item_list.append(item)

print(new_item_list)


#--------------------------------------------------------------------- 

# if new items are always appended, i.e. added to the end of the old list
print(newCol1[len(oldCol1):])  

print(newCol2[len(oldCol2):])

print(newCol1_1[len(oldCol1):])
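
Since the two columns belong to the same CSV rows, the Counter idea can also be applied to whole rows instead of each column separately. The following is only a sketch under that assumption. The names `old_rows`, `new_rows`, `added_rows`, `added_col1` and `added_col2` are illustrative, and the example data places the new row in the middle to match the update in the question.

```python
from collections import Counter

oldCol1 = [1, 2, 3, 4, 5]
oldCol2 = ['A', 'B', 'C', 'D', 'E']
# The new row (6, 'A') is inserted in the middle rather than at the end.
newCol1 = [1, 2, 6, 3, 4, 5]
newCol2 = ['A', 'B', 'A', 'C', 'D', 'E']

# Count each (col1, col2) pair so that a row is compared as a unit.
old_rows = Counter(zip(oldCol1, oldCol2))
new_rows = Counter(zip(newCol1, newCol2))

# Counter subtraction keeps only rows that occur more often in the new data.
added_rows = list((new_rows - old_rows).elements())
print(added_rows)  # [(6, 'A')]

# Split the added rows back into one list per column.
added_col1 = [row[0] for row in added_rows]
added_col2 = [row[1] for row in added_rows]
print(added_col1, added_col2)  # [6] ['A']
```

Comparing rows as tuples keeps the second-column 'A' tied to its first-column 6, so it is not cancelled out by the 'A' that already existed in a different row.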

Answer 3

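You need the indexes of the values that are not present in the first list, so use a plain set rather than symmetric_difference. Using enumerate() makes the indexing easier.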

oldCol1 = [1, 2, 3, 4, 5]

oldCol2 = ['A', 'B', 'C', 'D', 'E']

newCol1 = [1, 2, 3, 4, 5, 6]

newCol2 = ['A', 'B', 'C', 'D', 'E', 'A']

old_values = set(oldCol1)  # build the set once instead of on every iteration
indexes = [i for i, v in enumerate(newCol1) if v not in old_values]

resultCol1 = [newCol1[i] for i in indexes]
resultCol2 = [newCol2[i] for i in indexes]

print(resultCol1, resultCol2)
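
The same index-based idea could be wrapped in a small helper, sketched below. The function name `rows_with_new_keys` and its parameters are hypothetical, and it assumes the first column behaves like a unique key: a new row whose first-column value already exists in the old data would not be detected.

```python
def rows_with_new_keys(old_keys, new_keys, *other_new_cols):
    """Return, per column, the values at positions whose key is not in old_keys."""
    seen = set(old_keys)  # built once for fast membership tests
    idx = [i for i, key in enumerate(new_keys) if key not in seen]
    # Apply the same positions to the key column and every other column.
    return [[col[i] for i in idx] for col in (new_keys, *other_new_cols)]


# The new row (6, 'A') sits in the middle of the updated data.
result = rows_with_new_keys(
    [1, 2, 3, 4, 5],
    [1, 6, 2, 3, 4, 5],
    ['A', 'A', 'B', 'C', 'D', 'E'],
)
print(result)  # [[6], ['A']]
```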

Using symmetric difference to get the indexes of removed elements
