I have a numpy array that looks like this:
[
('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', [])
]
How can I find the unique values at position 0 of each tuple? Ideally, I'd like the output to be an array (or list) that looks like this:
[
'{893EE51E-0CD1-4C06-B672-365EECA26C33}',
'{893EE51E-0CD1-4C06-B672-365EECA26C63}',
'{893EE51E-0CD1-4C06-B672-365EECA26C73}'
]
Answer 0 (score: 1)
Recreating the structured array from your display:
In [241]: np.array([
...: ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', []),
...: ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', []),
...: ('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', []),
...: ('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', [])
...: ],dtype='U50,U20,U20,O')
Out[241]:
array([('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', list([])),
('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', list([])),
('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', list([])),
('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', list([]))],
dtype=[('f0', '<U50'), ('f1', '<U20'), ('f2', '<U20'), ('f3', 'O')])
Select the first field:
In [242]: _['f0']
Out[242]:
array(['{893EE51E-0CD1-4C06-B672-365EECA26C33}',
'{893EE51E-0CD1-4C06-B672-365EECA26C33}',
'{893EE51E-0CD1-4C06-B672-365EECA26C63}',
'{893EE51E-0CD1-4C06-B672-365EECA26C73}'], dtype='<U50')
Apply np.unique to that result (in IPython, _ holds the previous output, which here is the 'f0' field):
In [243]: np.unique(_)
Out[243]:
array(['{893EE51E-0CD1-4C06-B672-365EECA26C33}',
'{893EE51E-0CD1-4C06-B672-365EECA26C63}',
'{893EE51E-0CD1-4C06-B672-365EECA26C73}'], dtype='<U50')
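The session above relies on IPython's _ shortcut, so it does not paste cleanly into a script. A minimal standalone sketch of the same steps follows; the names rows, arr and unique_ids are illustrative and not from the original answer, and the dtype string is the one used in the transcript.

```python
import numpy as np

# Sample rows copied from the question.
rows = [
    ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', []),
]

# Build a structured array; without explicit names the fields are f0..f3.
arr = np.array(rows, dtype='U50,U20,U20,O')

# np.unique returns the distinct values of the first field, sorted.
unique_ids = np.unique(arr['f0'])
print(unique_ids.tolist())
```

Note that np.unique sorts its result, so the order matches the desired output here only because the IDs already appear in sorted order.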
Answer 1 (score: 1)
Use set() together with a generator expression:
x = [
('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', []),
('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', [])
]
y = set(i[0] for i in x)
y
{'{893EE51E-0CD1-4C06-B672-365EECA26C63}',
'{893EE51E-0CD1-4C06-B672-365EECA26C73}',
'{893EE51E-0CD1-4C06-B672-365EECA26C33}'}
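A set does not keep insertion order, which is why the IDs above print in a different order than in the question. If first-appearance order matters, one possible variant (a sketch, not part of the original answer) is dict.fromkeys, since dicts preserve insertion order in Python 3.7+. The name unique_ids is illustrative.

```python
# Sample rows copied from the question.
x = [
    ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo1.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C33}', 'image/jpeg', 'Photo2.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C63}', 'image/jpeg', 'Photo1.jpg', []),
    ('{893EE51E-0CD1-4C06-B672-365EECA26C73}', 'image/jpeg', 'Photo1.jpg', []),
]

# dict keys are unique and keep the order in which they were first inserted,
# so this deduplicates the first element of each tuple without reordering.
unique_ids = list(dict.fromkeys(row[0] for row in x))
print(unique_ids)
```

This returns a plain list, which matches the "array (or list)" the question asks for.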
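Answer 2 (score: 0)
Slice out the first column ([:, 0]) and then apply np.unique. This assumes arr is a 2-D array of dtype object; on a structured array, select the field by name instead.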
>>> np.unique(arr[:,0])
array(['{893EE51E-0CD1-4C06-B672-365EECA26C33}',
'{893EE51E-0CD1-4C06-B672-365EECA26C63}',
'{893EE51E-0CD1-4C06-B672-365EECA26C73}'], dtype=object)