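In the last elif block I am trying to get the rows where brand and manufacturer have the same value (e.g. brand == J.R. Watkins and manufacturer == J.R. Watkins). But it gives an error:
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
My code is: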
import csv
import pandas as pd
import sys

class sample:
    def create_df(self, f):
        self.z = pd.read_csv(f)

    def get_resultant_df(self, list_cols):
        self.data_frame = self.z[list_cols[:]]

    def process_df(self, df, conditions):
        resultant_df = self.data_frame
        if conditions[2] == 'equals':
            new_df = resultant_df[resultant_df[conditions[1]] == conditions[3]]
            return new_df
        elif conditions[2] == 'contains':
            new_df = resultant_df[resultant_df[conditions[1]].str.contains(conditions[3])]
            return new_df
        elif conditions[2] == 'not equals':
            new_df = resultant_df[resultant_df[conditions[1]] != conditions[3]]
            return new_df
        elif conditions[2] == 'startswith':
            new_df = resultant_df[resultant_df[conditions[1]].str.startswith(conditions[3])]
            return new_df
        elif conditions[2] == 'in':
            new_df = resultant_df[resultant_df[conditions[1]].isin(resultant_df[conditions[3]])]
            return new_df
        elif conditions[2] == 'not in':
            new_df = resultant_df[~resultant_df[conditions[1]].isin(resultant_df[conditions[3]])]
            return new_df
        elif conditions[2] == 'group':
            new_df = list(resultant_df.groupby(conditions[0])[conditions[1]])
            return new_df
        elif conditions[2] == 'specific':
            new_df = resultant_df.loc[resultant_df[conditions[0]] == conditions[8]]
            return new_df
        elif conditions[2] == 'same':
            if(resultant_df.loc[(resultant_df[conditions[0]]==conditions[8]) & (resultant_df[conditions[1]]==conditions[8])]).all():
                new_df = resultant_df
                return new_df

if __name__ == '__main__':
    sample = sample()
    sample.create_df("/home/purpletalk/GrammarandProductReviews.csv")
    df = sample.get_resultant_df(['brand', 'reviews.id','manufacturer','reviews.title','reviews.username'])
    new_df = sample.process_df(df, ['brand','manufacturer','same','manufacturer', 'size', 'equal',8,700,'J.R. Watkins'])
    print new_df['brand']
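Answer 0 (score: 1)
I am trying to get the rows where brand and manufacturer have the same value (e.g. brand == J.R. Watkins and manufacturer == J.R. Watkins)
Your logic is overcomplicated. Just apply one filter: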
df = df[(df['brand'] == 'J.R. Watkins') & (df['manufacturer'] == 'J.R. Watkins')]
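You don't need pd.DataFrame.all(), which seems to be what you were attempting. You also don't need the inner if statement: if there are no matches, you will simply get an empty dataframe.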
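To make the point concrete, here is a minimal sketch of the single-filter approach on a small made-up frame. The three sample rows, the "Other Co" and "P&G" values, and the variable names (`name`, `mask`, `matches`, `same`, `has_matches`) are assumptions for illustration only and do not come from GrammarandProductReviews.csv. It also shows that using a DataFrame directly as an `if` condition is what raises this kind of ValueError, and that `.empty` is the usual way to test whether anything matched.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV file.
df = pd.DataFrame({
    "brand": ["J.R. Watkins", "J.R. Watkins", "Olay"],
    "manufacturer": ["J.R. Watkins", "Other Co", "P&G"],
})

name = "J.R. Watkins"

# Build one boolean mask (a Series of True/False) and index with it.
mask = (df["brand"] == name) & (df["manufacturer"] == name)
matches = df[mask]

# Variant: rows where the two columns agree, whatever the value is.
same = df[df["brand"] == df["manufacturer"]]

# A DataFrame has no single truth value, so `if matches:` raises ValueError.
try:
    if matches:
        pass
except ValueError as exc:
    error_message = str(exc)

# Test for "no matching rows" explicitly instead.
has_matches = not matches.empty
```

If nothing matches, `matches` is just an empty dataframe, so the calling code can check `matches.empty` rather than wrapping the filter in an `if`.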