Suppose I create a structured array in numpy:
import numpy as np
name = ['Tom', 'Jim', 'Alice', 'Alice', 'Greg']
height = [188, 160, 160, 157, 180]
pet = ['dog', 'cat', 'fish', 'dog', 'cat']
a = np.zeros(len(name), dtype=[('name', 'U30'), ('height', 'i'), ('pet', 'U30')])
a['name'] = name
a['height'] = height
a['pet'] = pet
Is there a way in numpy to extract the rows that satisfy certain conditions? For example:
'height' == 160 and 'pet' == 'cat'
Answer 0 (score: 4)
IIUC, here is how to do it with numpy:
a[(a['height'] == 160) & (a['pet'] == 'cat')]
Returns:
array([('Jim', 160, 'cat')],
dtype=[('name', '<U30'), ('height', '<i4'), ('pet', '<U30')])
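To make the filter above easier to follow, here is a minimal self-contained sketch that builds the two boolean masks separately before combining them. The names mask_height, mask_pet, mask and rows are mine, chosen for illustration. Note that the element-wise & operator and the parentheses are both needed: Python's plain `and` (as written in the question) raises a ValueError on arrays with more than one element.

```python
import numpy as np

name = ['Tom', 'Jim', 'Alice', 'Alice', 'Greg']
height = [188, 160, 160, 157, 180]
pet = ['dog', 'cat', 'fish', 'dog', 'cat']

a = np.zeros(len(name), dtype=[('name', 'U30'), ('height', 'i'), ('pet', 'U30')])
a['name'] = name
a['height'] = height
a['pet'] = pet

# Each comparison on a field yields one boolean per row.
mask_height = a['height'] == 160   # [False, True, True, False, False]
mask_pet = a['pet'] == 'cat'       # [False, True, False, False, True]

# & combines the masks element-wise; only rows that are True in both survive.
mask = mask_height & mask_pet      # [False, True, False, False, False]

# Indexing with a boolean mask keeps the rows where the mask is True.
rows = a[mask]
print(rows)
```

Use | for "or" and ~ for "not" in the same way, keeping each comparison in its own parentheses.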
If you only want the indices of the rows that satisfy the condition, use numpy.where:
np.where((a['height'] == 160) & (a['pet'] == 'cat'))
# (array([1]),)
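As the output shows, numpy.where with a single argument returns a tuple containing one index array per dimension. If a plain 1-D index array is more convenient, numpy.flatnonzero is one alternative. A small sketch, assuming the same data as above (the names idx_tuple and idx are illustrative):

```python
import numpy as np

# Same data as in the question, built directly from a list of tuples.
a = np.array(
    [('Tom', 188, 'dog'), ('Jim', 160, 'cat'), ('Alice', 160, 'fish'),
     ('Alice', 157, 'dog'), ('Greg', 180, 'cat')],
    dtype=[('name', 'U30'), ('height', 'i4'), ('pet', 'U30')],
)

mask = (a['height'] == 160) & (a['pet'] == 'cat')

idx_tuple = np.where(mask)    # tuple holding one index array (a is 1-D)
idx = np.flatnonzero(mask)    # plain 1-D array of matching positions

print(idx_tuple[0], idx)
print(a[idx]['name'])         # the indices can be used to select the rows again
```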
Caveat:
That said, numpy may not be the best fit for your purposes. To see why, consider what the array a looks like:
>>> a
array([('Tom', 188, 'dog'), ('Jim', 160, 'cat'), ('Alice', 160, 'fish'),
('Alice', 157, 'dog'), ('Greg', 180, 'cat')],
dtype=[('name', '<U30'), ('height', '<i4'), ('pet', '<U30')])
A bit hard to read...
Consider using pandas to organize tabular data:
import pandas as pd
df = pd.DataFrame({'name':name, 'height':height, 'pet':pet})
>>> df
   height   name   pet
0     188    Tom   dog
1     160    Jim   cat
2     160  Alice  fish
3     157  Alice   dog
4     180   Greg   cat
>>> df.loc[(df.height==160) & (df['pet'] == 'cat')]
   height name  pet
1     160  Jim  cat
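If the data already lives in a structured array, there is no need to rebuild the DataFrame from the separate lists: the DataFrame constructor accepts a structured array and uses its field names as column names. The sketch below also uses DataFrame.query as an alternative way to write the condition. This is just one possible approach, and the variable name result is illustrative.

```python
import numpy as np
import pandas as pd

# Same structured array as in the question.
a = np.array(
    [('Tom', 188, 'dog'), ('Jim', 160, 'cat'), ('Alice', 160, 'fish'),
     ('Alice', 157, 'dog'), ('Greg', 180, 'cat')],
    dtype=[('name', 'U30'), ('height', 'i4'), ('pet', 'U30')],
)

# Field names of the structured array become the column names.
df = pd.DataFrame(a)

# query() takes the condition as a string, so plain `and` works here.
result = df.query("height == 160 and pet == 'cat'")
print(result)
```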