Python: remove duplicates and indicate their indices

Time: 2014-06-16 21:16:03

Tags: python indexing duplicates

Below is a list of 3D points (9 points):

f = [[10, 20, 0], 
    [40, 20, 30], 
    [20, 0, 30], 
    [10, 10, 0], 
    [30, 10, 10], 
    [20, 0, 30], 
    [20, 10, 20], 
    [10, 10, 0],
    [20, 0, 30]]

Each point has a corresponding number (index) that indicates its (hypothetical) point type:

ic=[1,2,3,2,1,3,2,3,1]

So the previous list can also be expressed as

f = [[10, 20, 0, 1], 
    [40, 20, 30, 2], 
    [20, 0, 30, 3], 
    [10, 10, 0, 2], 
    [30, 10, 10, 1], 
    [20, 0, 30, 3], 
    [20, 10, 20, 2], 
    [10, 10, 0, 3],
    [20, 0, 30, 1]]

This is my code:

uniq = []
dup = []
count = 0
for i, j, k  in f:
    if not [f.index([i,j,k]),i,j,k] in uniq:
        uniq.append([count,i,j,k])
    else:
        dup.append([count,i,j,k,"duplicate"])
    count += 1
uniq.extend(dup)
print(uniq)

for i,j in enumerate(uniq):
    j.append(ic[j[0]])
print(uniq)

The result I want looks like this:

Unique part:

index       point         equivalent points    index for same point
  0      [10, 20, 0, 1]           1                   [1]
  1      [40, 20, 30, 2]          1                   [2]
  2      [20, 0, 30, 3]           3                 [3,3,1]
  3      [10, 10, 0, 2]           2                  [2,3]
  4      [30, 10, 10, 1]          1                   [1]
  6      [20, 10, 20, 2]          1                   [2]

Duplicate part:

index       point         Duplicate or not
  5      [20, 0, 30, 3]       duplicate
  7      [10, 10, 0, 3]       duplicate
  8      [20, 0, 30, 1]       duplicate

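My code is meant to pick out the duplicate points and record their indices in a list. In addition, I need to show how many equivalent points each entry in the unique part has, and the type indices of those equivalent points.

How should I modify it?

2 answers:

Answer 0: (score: 1)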

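I'm not sure I follow where you get your index points, but here is how I would count the duplicates. First, you need hashable (immutable) data types to count, so convert the sublists to actual tuples and count them with collections.Counter: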

import pprint # do your imports first
import collections


f = [[10, 20, 0], [40, 20, 30], [20, 0, 30], [10, 10, 0], [30, 10, 10], [20, 0, 30], [20, 10, 20], [10, 10, 0], [20, 0, 30]]
t = [tuple(i) for i in f] # we need immutable datatypes to count

counts = collections.Counter(t)
pprint.pprint(counts)

prints

{(10, 10, 0): 2,
 (10, 20, 0): 1,
 (20, 0, 30): 3,
 (20, 10, 20): 1,
 (30, 10, 10): 1,
 (40, 20, 30): 1}

As you may know, Counter is just a subclass of dict and has all the normal dict methods.

To get your uniques and dupes:
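As a small illustration of that point (not part of the original answer), the sketch below rebuilds the same `counts` object and uses ordinary dict operations on it, alongside `Counter.most_common` from the standard library. The variable names other than `f` and `counts` are made up for the example.

```python
import collections

f = [[10, 20, 0], [40, 20, 30], [20, 0, 30], [10, 10, 0], [30, 10, 10],
     [20, 0, 30], [20, 10, 20], [10, 10, 0], [20, 0, 30]]
counts = collections.Counter(tuple(p) for p in f)

# Counter is a dict subclass, so ordinary dict operations work on it.
is_dict = isinstance(counts, dict)
n_distinct = len(counts)           # number of distinct points
n_of_point = counts[(20, 0, 30)]   # plain key lookup
missing = counts[(0, 0, 0)]        # unseen keys give 0 instead of raising KeyError

# Counter-specific helper: entries ordered by descending count.
top_two = counts.most_common(2)

print(is_dict, n_distinct, n_of_point, missing)
print(top_two)
```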

uniques = [k for k, v in counts.items() if v == 1]

returns

[(10, 20, 0), (30, 10, 10), (40, 20, 30), (20, 10, 20)]

dupes = [k for k, v in counts.items() if v > 1]

returns

[(20, 0, 30), (10, 10, 0)]
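The counts alone do not say where each point sits in the original list. One possible way to recover the indices the question asks for is to group positions by point with `collections.defaultdict`. This is only a sketch under the question's data: `ic` is the type list from the question, and the names `positions`, `unique_rows` and `duplicate_rows` are hypothetical.

```python
import collections

f = [[10, 20, 0], [40, 20, 30], [20, 0, 30], [10, 10, 0], [30, 10, 10],
     [20, 0, 30], [20, 10, 20], [10, 10, 0], [20, 0, 30]]
ic = [1, 2, 3, 2, 1, 3, 2, 3, 1]

# Map each point (as a hashable tuple) to every index where it occurs.
positions = collections.defaultdict(list)
for idx, point in enumerate(f):
    positions[tuple(point)].append(idx)

unique_rows = []     # (first index, point + type, equivalent points, types of all occurrences)
duplicate_rows = []  # (index, point + type, "duplicate")
for point, idxs in positions.items():
    first = idxs[0]
    unique_rows.append(
        (first, list(point) + [ic[first]], len(idxs), [ic[i] for i in idxs])
    )
    # Every occurrence after the first one is reported as a duplicate.
    for i in idxs[1:]:
        duplicate_rows.append((i, list(point) + [ic[i]], "duplicate"))

# Order both tables by original index.
unique_rows.sort()
duplicate_rows.sort()

for row in unique_rows:
    print(row)
for row in duplicate_rows:
    print(row)
```

Each point is visited once while building `positions`, so this avoids repeated `list.index` or `list.count` scans over `f`.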

Answer 1: (score: 1)

for j in uniq+dup:
    if "duplicate" not in j:
        j += ic[j[0]],f.count(j[1:4]), [ic[j[0]]]
    else:
        j.append(ic[j[0]])    

for i in dup:
    for j in uniq:
        if i[1:4] == j[1:4]:
            j[-1].append(i[-1])

[[5, 20, 0, 30, 'duplicate', 3], [7, 10, 10, 0, 'duplicate', 3], [8, 20, 0, 30, 'duplicate', 1]]

[[0, 10, 20, 0, 1, 1, [1]], [1, 40, 20, 30, 2, 1, [2]], [2, 20, 0, 30, 3, 3, [3, 3, 1]], [3, 10, 10, 0, 2, 2, [2, 3]], [4, 30, 10, 10, 1, 1, [1]], [6, 20, 10, 20, 2, 1, [2]]]

This adds the counts to each sublist without changing the original structure.
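The loops in this answer appear to assume that `uniq` and `dup` still exist as two separate lists, that is, before the question's `uniq.extend(dup)` call. A self-contained sketch under that assumption, which rebuilds both lists with `enumerate` and then applies the same two loops, might look like this:

```python
f = [[10, 20, 0], [40, 20, 30], [20, 0, 30], [10, 10, 0], [30, 10, 10],
     [20, 0, 30], [20, 10, 20], [10, 10, 0], [20, 0, 30]]
ic = [1, 2, 3, 2, 1, 3, 2, 3, 1]

uniq, dup = [], []
for count, point in enumerate(f):
    if f.index(point) == count:
        # First occurrence of this point: keep it in the unique part.
        uniq.append([count] + point)
    else:
        dup.append([count] + point + ["duplicate"])

for j in uniq + dup:
    if "duplicate" not in j:
        # Append the type, the number of equivalent points,
        # and a list that will collect the types of all occurrences.
        j += ic[j[0]], f.count(j[1:4]), [ic[j[0]]]
    else:
        j.append(ic[j[0]])

# Add each duplicate's type to the matching unique entry.
for i in dup:
    for j in uniq:
        if i[1:4] == j[1:4]:
            j[-1].append(i[-1])

print(dup)
print(uniq)
```

`f.index`, `f.count` and the nested loop all rescan the lists, so this is fine for a handful of points but scales quadratically with the number of points.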