Question

I have a dataframe,

DF,
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player
ganesh  1       good driver

and a list,

my_list=["one"]

 I tried mask=df["Description"].str.contains('|'.join(my_list),na=False)

but it gives,

 output_DF.
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
Ram     1       Ram is one of the good cricket player

My desired output is,
desired_DF,
Name    Stage   Description
Sri     1       Sri is one of the good singer in this two
        2       Thanks for reading
Ram     1       Ram is one of the good cricket player

The Stage column has to be taken into account: I want all rows that are associated with a matching description.
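
For reference, the setup above can be reproduced with a short script. This is only a sketch: it assumes the blank Name in the second row is an empty string (the real data may hold whitespace or NaN instead), and output_df is just a name chosen here for the filtered result.

```python
import pandas as pd

# Assumption: the blank Name in the second row is an empty string.
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})
my_list = ["one"]

# The mask is True only where the Description itself contains a keyword,
# so the Stage 2 row for Sri is not selected.
mask = df["Description"].str.contains("|".join(my_list), na=False)
output_df = df[mask]
print(output_df)
```

Only the rows whose own Description contains "one" pass the mask, which is why the "Thanks for reading" row is missing from output_DF.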

Answer 1

I think you need:

print (df)
     Name  Stage                                Description
0     Sri      1  Sri is one of the good singer in this two
1              2                         Thanks for reading
2     Ram      1      Ram is one of the good cricket player
3  ganesh      1                                good driver

#replace empty or whitespaces by previous value
df['Name'] = df['Name'].mask(df['Name'].str.strip() == '').ffill()
print (df)
     Name  Stage                                Description
0     Sri      1  Sri is one of the good singer in this two
1     Sri      2                         Thanks for reading
2     Ram      1      Ram is one of the good cricket player
3  ganesh      1                                good driver

#get all names by condition
my_list = ["one"]
names=df.loc[df["Description"].str.contains("|".join(my_list),na=False), 'Name']
print (names)
0    Sri
2    Ram
Name: Name, dtype: object

#select all rows whose Name is in names
df = df[df['Name'].isin(names)]
print (df)
  Name  Stage                                Description
0  Sri      1  Sri is one of the good singer in this two
1  Sri      2                         Thanks for reading
2  Ram      1      Ram is one of the good cricket player
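
The three steps above could also be wrapped in one helper. The following is a minimal sketch, not part of the original answer: the function name rows_for_matching_names is hypothetical, the blank Name is assumed to be an empty string, and re.escape is an addition so that keywords containing regex characters are matched literally.

```python
import re
import pandas as pd

def rows_for_matching_names(df, keywords):
    """Return every row whose Name has at least one Description
    containing any of the keywords. The input frame is not modified."""
    out = df.copy()
    # Treat empty or whitespace-only names as missing, then carry the
    # previous name down so continuation rows belong to the same person.
    out["Name"] = out["Name"].mask(out["Name"].str.strip() == "").ffill()
    # Addition to the answer: escape keywords so they are matched literally.
    pattern = "|".join(re.escape(k) for k in keywords)
    hit = out["Description"].str.contains(pattern, na=False)
    # Collect the names that matched, then keep all rows for those names.
    names = out.loc[hit, "Name"]
    return out[out["Name"].isin(names)]

# Assumption: the blank Name in the second row is an empty string.
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})
result = rows_for_matching_names(df, ["one"])
print(result)
```

Working on a copy leaves the caller's df untouched, unlike the step-by-step version, which overwrites df.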

Answer 2

It appears to be looking for "one" in the Description field of the dataframe and returning the descriptions that match.

If you want the third row as well, you have to add an array element for the second match,

e.g. 'Thanks', so something like my_list = ["one", "Thanks"]
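
As a quick check of this suggestion, here is a sketch that rebuilds the example frame (assuming the blank Name is an empty string) and applies the extended list:

```python
import pandas as pd

# Assumption: the blank Name in the second row is an empty string.
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})

# The extra keyword makes the "Thanks for reading" row match on its own text.
my_list = ["one", "Thanks"]
mask = df["Description"].str.contains("|".join(my_list), na=False)
print(df[mask])
```

Note that this selects the row because its text matches a keyword, not because it belongs to Sri, so a continuation row with different wording would still be left out. str.contains is also case-sensitive by default, so "thanks" would not match here.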

Map keywords to a dataframe column using pandas in Python
