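Pandas: split text in a column and search across rows

Time: 2016-05-24 10:44:40

Tags: python pandas

This question is related to this earlier one: Link

Here is a table that comes from JSON-formatted data: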

ID Title
19 I am doing great
25 [Must fix problem] Stomach not well
31 [Not-so-urgent] Wash cloths
498 [VERY URGENT] Pay your rent
517 Landlord wants you to pay your rent tomorrow
918 Girlfriend wants to help you to pay rent if you take her out
1000 [Always reproducible issue] Room partner dont want to pay any rent, he is out of cash
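
To reproduce the examples in this post, the table above can be loaded into a DataFrame. This is a minimal sketch: the original JSON source is not shown, so the column names `ID` and `Title` are assumed from the table header and the rows are typed in by hand.

```python
import pandas as pd

# Rows copied from the table above; column names are assumed from its header.
df = pd.DataFrame({
    "ID": [19, 25, 31, 498, 517, 918, 1000],
    "Title": [
        "I am doing great",
        "[Must fix problem] Stomach not well",
        "[Not-so-urgent] Wash cloths",
        "[VERY URGENT] Pay your rent",
        "Landlord wants you to pay your rent tomorrow",
        "Girlfriend wants to help you to pay rent if you take her out",
        "[Always reproducible issue] Room partner dont want to pay any rent, "
        "he is out of cash",
    ],
})

print(df)
```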

I did this:

In:     selected_row_title = df.loc[df['ID'] == 498]['Title']

Out:

[VERY URGENT] Pay your rent

Now, using Python Pandas, I am trying to write a function:

get_matching_rows(selected_row_title)

Expected output:

ID 498 has pay your rent 
ID 517 has pay your rent
ID 918 has pay rent
ID 1000 has pay rent

I have been tearing my hair out over this and really need some help, or at least a pointer on how to implement it. Any input is appreciated.

1 answer:

Answer 0: (score: 1)

I think you can use str.replace together with str.contains:

s = "[VERY URGENT] Pay your rent"

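# remove all [ and ] characters from the Title column
# (regex=True is required in recent pandas, where str.replace is literal by default)
tit = df.Title.str.replace(r'[\[\]]', '', regex=True)
print(tit)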

0                                     I am doing great
1                    Must fix problem Stomach not well
2                            Not-so-urgent Wash cloths
3                            VERY URGENT Pay your rent
4         Landlord wants you to pay your rent tomorrow
5    Girlfriend wants to help you to pay rent if yo...
6    Always reproducible issue Room partner dont wa...
Name: Title, dtype: object

# match any word of string s (| is logical OR in a regex)
# note: s still contains [ and ], so "[VERY|URGENT]" is read as a character class
mask = tit.str.contains(s.replace(' ', '|'))
print(mask)
0    False
1    False
2     True
3     True
4     True
5     True
6     True
Name: Title, dtype: bool
# select the IDs of all rows that satisfy the condition
selected_row_title = df.loc[mask, 'ID']
print(selected_row_title)
2      31
3     498
4     517
5     918
6    1000
Name: ID, dtype: int64
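
In the snippet above, `s` still contains its brackets, so the pattern starts with `[VERY|URGENT]`. A regex reads that as a character class, which is why row 2 (ID 31) is also matched. If the goal is the output requested in the question, one possible approach is to compare whole words instead. The following is only a sketch and not part of the original answer. It rests on these assumptions: the function takes an extra `df` argument, any `[...]` tag in the selected title is ignored, and matching is case-insensitive.

```python
import re

import pandas as pd

# Same table as in the question; column names are assumed from its header.
df = pd.DataFrame({
    "ID": [19, 25, 31, 498, 517, 918, 1000],
    "Title": [
        "I am doing great",
        "[Must fix problem] Stomach not well",
        "[Not-so-urgent] Wash cloths",
        "[VERY URGENT] Pay your rent",
        "Landlord wants you to pay your rent tomorrow",
        "Girlfriend wants to help you to pay rent if you take her out",
        "[Always reproducible issue] Room partner dont want to pay any rent, "
        "he is out of cash",
    ],
})


def get_matching_rows(df, selected_title):
    """Return (ID, shared words) for every row sharing a word with the title."""
    # Drop any "[...]" tag from the selected title (assumption), then
    # split what is left into lowercase words.
    query = re.sub(r"\[.*?\]", " ", selected_title)
    query_words = re.findall(r"\w+", query.lower())

    results = []
    for row_id, title in zip(df["ID"], df["Title"]):
        # Tokenize on word characters so "rent," still counts as "rent".
        title_words = set(re.findall(r"\w+", title.lower()))
        # Keep the query's word order when reporting the shared words.
        common = [word for word in query_words if word in title_words]
        if common:
            results.append((int(row_id), " ".join(common)))
    return results


matches = get_matching_rows(df, "[VERY URGENT] Pay your rent")
for row_id, words in matches:
    print(f"ID {row_id} has {words}")
```

Because whole words are compared, "you" in row 918 does not count as a match for "your", and rows that share no word with the selected title are left out.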