Finding elements of a list in a DataFrame

Time: 2019-07-11 05:16:49

Tags: python pandas dataframe

I have a DataFrame "df1":

adj           response

beautiful    ["She's a beautiful girl/woman, and also a good teacher."]
good         ["She's a beautiful girl/woman, and also a good teacher."]
hideous      ["This city is hideous, let's move to the countryside."]

This is the list of objects:

object=["girl","teacher","city","countryside","woman"]

Code:

df1['response_split']=df1['response'].str.split(",")

After the split, the DataFrame looks like this:

adj           response_split

beautiful    ["She's a beautiful girl/woman", " and also a good teacher."]
good         ["She's a beautiful girl/woman", " and also a good teacher."]
hideous      ["This city is hideous", " let's move to the countryside."]

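I want to add another column, "response_object": when a fragment of the response contains the row's adj, find the objects from the list object that appear in that same fragment. Expected result: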

adj           response_split                                               response_object

beautiful    ["She's a beautiful girl/woman", " and also a good teacher."]        girl
beautiful    ["She's a beautiful girl/woman", " and also a good teacher."]        woman
good         ["She's a beautiful girl/woman", " and also a good teacher."]        teacher
hideous      ["This city is hideous", " let's move to the countryside."]          city

Code:

for i in df1['response_split']:
    if df1['adj'] in i:
        if any(x in i and x in object):
            match = list(filter(lambda x: x in i, object))
            df1['response_object']=match

It raises a ValueError.

1 Answer:

Answer 0 (score: 3)

First, object is a Python builtin, so it is better not to use it as a variable name; here it is renamed to L:

L=["girl","teacher","city","countryside","woman"]
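
As a small illustration of why shadowing the builtin is risky, the sketch below rebinds object inside a function and shows that code expecting the builtin type then fails. The helper name shadowing_demo is hypothetical and not part of the question.

```python
def shadowing_demo():
    # Rebinding the name hides the builtin `object` type inside this scope.
    object = ["girl", "teacher", "city", "countryside", "woman"]
    try:
        # isinstance expects a type as its second argument, not a list.
        isinstance("girl", object)
    except TypeError as exc:
        return type(exc).__name__
    return "no error"


result = shadowing_demo()
print(result)  # prints: TypeError
```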

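Then zip the adj column with the split column, loop over the resulting tuples, loop over each fragment and over the values in L, and keep a match when both the object and the adjective are found in the same fragment (tested with in and and):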

df1['response_split']=df1['response'].str.split(",")
L1 = [(a, b, o) for a, b in zip(df1['adj'], df1['response_split']) 
                for r in b 
                for o in L 
                if (o in r) and (a in r)]

The same logic written as explicit loops:

df1['response_split']=df1['response'].str.split(",")

L1 = []
for a, b in zip(df1['adj'], df1['response_split']):
    for r in b:
        for o in L:
            if (o in r) and (a in r):
                L1.append((a, b, o))
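
For comparison, the loop in the question most likely fails at df1['adj'] in i. Testing whether a whole Series is in a list compares the Series with each list element, and pandas refuses to reduce the resulting boolean Series to a single True or False. Below is a minimal reproduction, assuming response holds plain strings (the question does not show how df1 was built):

```python
import pandas as pd

# Assumed reconstruction of df1; the question only shows its printed form.
df1 = pd.DataFrame({
    "adj": ["beautiful", "good", "hideous"],
    "response": [
        "She's a beautiful girl/woman, and also a good teacher.",
        "She's a beautiful girl/woman, and also a good teacher.",
        "This city is hideous, let's move to the countryside.",
    ],
})
df1["response_split"] = df1["response"].str.split(",")

first_fragments = df1["response_split"].iloc[0]
try:
    # A whole column tested against one list: each comparison yields a
    # boolean Series, which has no single truth value.
    df1["adj"] in first_fragments
except ValueError as exc:
    error_name = type(exc).__name__
else:
    error_name = None

print(error_name)  # prints: ValueError
```

Iterating over zip(df1['adj'], df1['response_split']) avoids this, because each a is then a single string rather than the whole column.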

Finally, pass the list of tuples to the DataFrame constructor:

df2 = pd.DataFrame(L1, columns=['adj','response_split','response_object'])
print (df2)
         adj                                     response_split  \
0  beautiful  [She's a beautiful girl/woman,  and also a goo...   
1  beautiful  [She's a beautiful girl/woman,  and also a goo...   
2       good  [She's a beautiful girl/woman,  and also a goo...   
3    hideous  [This city is hideous,  let's move to the coun...   

  response_object  
0            girl  
1           woman  
2         teacher  
3            city
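
The steps above can be put together into one self-contained sketch. The function name match_objects and the way df1 is constructed are assumptions made for illustration; the matching logic is the same nested loop as in the answer.

```python
import pandas as pd


def match_objects(frame, objects):
    """Return one row per (adj, fragment list, object) match."""
    rows = []
    split_col = frame["response"].str.split(",")
    for adj, fragments in zip(frame["adj"], split_col):
        for fragment in fragments:
            for obj in objects:
                # Keep the object only if it shares a fragment with the adjective.
                if obj in fragment and adj in fragment:
                    rows.append((adj, fragments, obj))
    return pd.DataFrame(
        rows, columns=["adj", "response_split", "response_object"]
    )


# Assumed reconstruction of the question's data.
df1 = pd.DataFrame({
    "adj": ["beautiful", "good", "hideous"],
    "response": [
        "She's a beautiful girl/woman, and also a good teacher.",
        "She's a beautiful girl/woman, and also a good teacher.",
        "This city is hideous, let's move to the countryside.",
    ],
})
L = ["girl", "teacher", "city", "countryside", "woman"]

df2 = match_objects(df1, L)
print(df2[["adj", "response_object"]])
```

Note that in performs substring matching, so an object such as city would also match inside a longer word like capacity; if that matters, a stricter word-level comparison would be needed.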