How to implement SQL SELECT-like functionality

Time: 2017-02-19 00:19:15

Tags: python

I have a dataset in Python whose structure looks like this:

Tree Species  number of trunks
------------------------------
Acer rubrum          1
Quercus bicolor      1
Quercus bicolor      1
aabbccdd             0

My question is whether I can implement something like the following in Python:

Select sum(number of trunks)
from trees.data['Number of Trunks']
where x = trees.data["Tree Species"]
group by trees.data["Tree Species"]

Here x is an array containing five elements:

x = array(['Acer rubrum', 'Acer saccharum', 'Acer saccharinum',
'Quercus rubra', 'Quercus bicolor'], dtype='<U16')

What I want to do is match each element of x against trees.data["Tree Species"] and compute the total number of trunks for that species. The result should be an array:

array = (sum_num(Acer rubrum), sum_num(Acer saccharum), sum_num(Acer saccharinum),
sum_num(Quercus rubra), sum_num(Quercus bicolor))

2 Answers:

Answer 0 (score: 2)

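Have you looked at Python pandas? It lets you do something like

df.groupby('Tree Species')['Number of Trunks'].sum()

Note that df is the name of the variable you read your data frame into. I suggest taking a look at pandas and its groupby function.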
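The WHERE part of the query in the question can be approximated by filtering before grouping. The following is a minimal sketch, assuming the data sits in a DataFrame with the column names "Tree Species" and "Number of Trunks" and that x is a plain list of species names; the sample rows are copied from the table in the question.

```python
import pandas as pd

# Sample rows copied from the question; the column names are assumptions.
df = pd.DataFrame({
    "Tree Species": ["Acer rubrum", "Quercus bicolor", "Quercus bicolor", "aabbccdd"],
    "Number of Trunks": [1, 1, 1, 0],
})

# The species of interest, as in the question's array x.
x = ["Acer rubrum", "Acer saccharum", "Acer saccharinum",
     "Quercus rubra", "Quercus bicolor"]

# Roughly: WHERE "Tree Species" IN x
selected = df[df["Tree Species"].isin(x)]

# Roughly: SELECT SUM("Number of Trunks") ... GROUP BY "Tree Species"
totals = selected.groupby("Tree Species")["Number of Trunks"].sum()
print(totals)
```

Species from x that never occur in the data (for example Acer saccharum) simply do not appear in this result, so it is not yet aligned with x.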

Answer 1 (score: 1)

You can do it like this:

import pandas as pd 
df = pd.DataFrame()
tree_species = ["Acer rubrum", "Quercus bicolor", "Quercus bicolor", "aabbccdd"]
no_of_trunks = [1,1,1,0]
df["Tree Species"] = tree_species
df["Number of Trunks"] = no_of_trunks
df.groupby('Tree Species').sum() #This will create a pandas dataframe
df.groupby('Tree Species')['Number of Trunks'].sum() #This will create a pandas series. 
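To get an array in the same order as x, as the question asks, one option is to reindex the grouped Series by x. This is only a sketch: it assumes x is a list of species names and that species missing from the data should count as 0.

```python
import pandas as pd

# Same sample data as above.
df = pd.DataFrame({
    "Tree Species": ["Acer rubrum", "Quercus bicolor", "Quercus bicolor", "aabbccdd"],
    "Number of Trunks": [1, 1, 1, 0],
})

x = ["Acer rubrum", "Acer saccharum", "Acer saccharinum",
     "Quercus rubra", "Quercus bicolor"]

# Sum of trunks per species, as a Series indexed by species name.
totals = df.groupby("Tree Species")["Number of Trunks"].sum()

# Keep only the species in x, in x's order; absent species are filled with 0
# (an assumption about the desired behaviour).
result = totals.reindex(x, fill_value=0).to_numpy()
print(result)  # [1 0 0 0 2]
```

Without fill_value, the missing species would come back as NaN and the result would be converted to floats.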

You can also do the same thing with a dictionary:

tree_species = ["Acer rubrum", "Quercus bicolor", "Quercus bicolor", "aabbccdd"]
no_of_trunks = [1,1,1,0]    
d = {}
for key, trunk in zip(tree_species, no_of_trunks):
    if key not in d:
        d[key] = 0
    d[key] += trunk         
print(d)
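The dictionary approach can also produce the list ordered by x. The sketch below uses collections.defaultdict from the standard library in place of the explicit key check; treating species that are absent from the data as 0 is an assumption.

```python
from collections import defaultdict

tree_species = ["Acer rubrum", "Quercus bicolor", "Quercus bicolor", "aabbccdd"]
no_of_trunks = [1, 1, 1, 0]

x = ["Acer rubrum", "Acer saccharum", "Acer saccharinum",
     "Quercus rubra", "Quercus bicolor"]

# defaultdict(int) starts every new key at 0, so no membership test is needed.
totals = defaultdict(int)
for species, trunks in zip(tree_species, no_of_trunks):
    totals[species] += trunks

# Look up each species of x in order; .get avoids adding new keys to totals.
result = [totals.get(species, 0) for species in x]
print(result)  # [1, 0, 0, 0, 2]
```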