How do I extract the rows where a column matches a specific value, from a DataFrame created from an Excel file?
Here are a few rows of the DataFrame:
Food Men Women
0 Total fruit 86.20 88.26
1 Apples, Total 89.01 89.66
2 Apples as fruit 89.18 90.42
3 Apple juice 88.78 88.42
4 Bananas 95.42 94.18
5 Berries 84.21 81.73
6 Grapes 88.79 88.13
This is the code I use to read the Excel file, select the columns I need, and rename them appropriately:
data1= pd.read_excel('USFoodCommodity.xls', sheetname='94-98 FAH', skiprows=76,skip_footer=142, parse_cols='A, H, K')
data1.columns = ['Food', 'Men', 'Women']
# Try 1: data1 = data1[data1['Food'].isin(['Total fruit']) == True] works
# Try 2: data1 = data1[data1['Food'].isin(['Apple, Total']) == True] doesn't work
# Try 3: data1 = data1.iloc[[1]] returns Apples, Total but not appropriate to use integer index
# Try 4: data1[data1['Food'] == 'Berries'] doesn't work
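So far, based on answers such as this, this, or here, I can only return the first index, where Food == "Total fruit". When I try the other approaches above, I get only the column names, like: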
Food Men Women
I am new to pandas and cannot see where I am going wrong. Why can I extract the first row, where Food == "Total fruit", but nothing else?
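Answer 0 (score: 2)
It works fine for me, so there may be a whitespace problem in the Food values. Remove the leading and trailing spaces with str.strip():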
print (data1.Food.tolist())
['Total fruit', 'Apples, Total ', 'Apples as fruit',
'Apple juice', 'Bananas', ' Berries', 'Grapes']
data1['Food'] = data1['Food'].str.strip()
print (data1.Food.tolist())
['Total fruit', 'Apples, Total', 'Apples as fruit',
'Apple juice', 'Bananas', 'Berries', 'Grapes']
data2 = data1[data1['Food'].isin(['Total fruit'])]
print (data2)
Food Men Women
0 Total fruit 86.2 88.26
data3 = data1[data1['Food'].isin(['Apples, Total'])]
print (data3)
Food Men Women
1 Apples, Total 89.01 89.66
data3 = data1[data1['Food'].isin(['Berries'])]
print (data3)
Food Men Women
5 Berries 84.21 81.73
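To reproduce the symptom without the Excel file, here is a minimal sketch. The DataFrame is a hypothetical sample built by hand from the rows in the question, with padding added to two labels as an assumption about what the spreadsheet contains. It shows that an exact comparison returns an empty frame until the labels are stripped.

```python
import pandas as pd

# Hypothetical sample mirroring the question's rows; the stray spaces in
# two labels are an assumption made to reproduce the empty result.
data1 = pd.DataFrame({
    "Food": ["Total fruit", "Apples, Total ", " Berries", "Grapes"],
    "Men": [86.20, 89.01, 84.21, 88.79],
    "Women": [88.26, 89.66, 81.73, 88.13],
})

# Collect every label that changes when stripped, i.e. has padding.
padded = [label for label in data1["Food"] if label != label.strip()]
print(padded)

# Exact match fails while the padding is present.
before = data1[data1["Food"] == "Berries"]

# Strip leading/trailing whitespace, then match again.
data1["Food"] = data1["Food"].str.strip()
after = data1[data1["Food"] == "Berries"]

print(len(before), len(after))
```

Checking for padded labels first is a cheap way to confirm the diagnosis before modifying the column.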
Answer 1 (score: 0)
Use this code:
data1= pd.read_excel('USFoodCommodity.xls', sheetname='94-98 FAH', skiprows=76,skip_footer=142, parse_cols='A, H, K')
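# Labels must match the Food values exactly (case and spelling).
list_of_strings_to_match = ['Total fruit', 'Berries', 'Grapes']
# Walk the rows and print each one whose Food value is in the list.
for index, row in data1.iterrows():
    if row['Food'] in list_of_strings_to_match:
        print(row)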
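The row-by-row loop above can also be written as a single boolean mask, which is usually the more idiomatic pandas form. The sketch below is not part of the original answer; it uses a hand-built DataFrame standing in for the Excel data, and the name wanted is purely illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame read from the Excel file.
data1 = pd.DataFrame({
    "Food": ["Total fruit", "Apples, Total", "Apples as fruit",
             "Apple juice", "Bananas", "Berries", "Grapes"],
    "Men": [86.20, 89.01, 89.18, 88.78, 95.42, 84.21, 88.79],
    "Women": [88.26, 89.66, 90.42, 88.42, 94.18, 81.73, 88.13],
})

wanted = ["Total fruit", "Berries", "Grapes"]

# isin() builds a boolean Series; indexing with it keeps the matching rows.
matches = data1[data1["Food"].isin(wanted)]
print(matches)
```

This returns a DataFrame rather than printing each row, so the result can be reused in later steps.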
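Answer 2 (score: 0)
This question may be old, but here is a simpler and more intuitive approach.
Note: this solution only works with pandas >= 0.13.
You can now use the .query() method to select rows from a DataFrame.
It is simple: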
df.query('column == value') # The comparison operator can be anything.
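For example, in your case you could query like this (string values must be quoted inside the expression, and the match is case-sensitive):
data1.query('Food == "Total fruit"')
or
data1.query('Food == "Berries"')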
To reference a Python variable inside the expression, prefix it with @.
fruit = "Berries"
data1.query('Food == @fruit')
You can even use & to combine multiple conditions.
data1.query('condition1 == value1 & condition2 == value2')
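As a runnable illustration of the three query forms above, here is a small sketch on a hand-built DataFrame (a hypothetical stand-in for the Excel data). The numeric thresholds in the last query are arbitrary values chosen only for the example.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame read from the Excel file.
data1 = pd.DataFrame({
    "Food": ["Total fruit", "Apples, Total", "Apples as fruit",
             "Apple juice", "Bananas", "Berries", "Grapes"],
    "Men": [86.20, 89.01, 89.18, 88.78, 95.42, 84.21, 88.79],
    "Women": [88.26, 89.66, 90.42, 88.42, 94.18, 81.73, 88.13],
})

# String literal: quoted inside the query expression.
by_literal = data1.query('Food == "Berries"')

# Python variable: referenced with the @ prefix.
fruit = "Berries"
by_variable = data1.query('Food == @fruit')

# Two conditions combined with &; parentheses keep the grouping explicit.
both_high = data1.query('(Men > 88) & (Women > 89)')

print(by_literal)
print(both_high)
```

Without the quotes, a bare word such as Berries is looked up as a column or variable name rather than compared as text.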
Hope that helps.