How to match keywords in a pandas Series

Time: 2019-12-23 03:59:37

Tags: python pandas indexing

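I want to iterate over all rows in one of my DataFrame's columns, named Subject, and look up multiple keywords from a dictionary named keywords. If a key in the dictionary matches a word in that column, I want to add the value paired with the matching key to a new DataFrame column named Category. The code below was my first idea: append the values to a list, then add that list to my DataFrame as a new column. Obviously the indices will not line up that way. Is there a way to add the keyword's value directly to the DataFrame whenever a keyword matches in the Subject column?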

Image of example code

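1 answer:

Answer 0 (score: 0)

You can first extract the keywords, joined into a single regex alternation, and then map each match to its dictionary value.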

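import pandas as pd

# Each key is a keyword to search for; each value is the text to write out.
keywords = {'BOR':'Broker of Record','New Vendor':'New Vendor Build'}

df = pd.DataFrame({"Category":["Something BOR","Something New Vendor","Something for nothing"]})

# Build the pattern "(BOR|New Vendor)", extract the first match per row, then map it to its value.
df["new"] = df["Category"].str.extract(f"({'|'.join(keywords)})",expand=False).map(keywords)

print(df)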

                Category               new
0          Something BOR  Broker of Record
1   Something New Vendor  New Vendor Build
2  Something for nothing               NaN
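
The same idea can be adapted to the column names used in the question. The following is only a sketch under some assumptions: the sample rows are made up, the source column is called Subject, the result goes into Category, and the AllCategories column is a hypothetical addition for rows that contain more than one keyword. It also escapes the keys with re.escape, which may matter if a keyword contains regex metacharacters such as "." or "+".

```python
import re

import pandas as pd

# Hypothetical sample data using the column names from the question.
keywords = {"BOR": "Broker of Record", "New Vendor": "New Vendor Build"}
df = pd.DataFrame(
    {"Subject": ["Something BOR", "Something New Vendor", "Something for nothing"]}
)

# Escape each key so it is matched literally, and try longer keys first
# so that a short key cannot shadow a longer one that contains it.
keys = sorted(keywords, key=len, reverse=True)
pattern = "(" + "|".join(re.escape(k) for k in keys) + ")"

# extract returns the first matching key per row (NaN if none); map looks up its value.
df["Category"] = df["Subject"].str.extract(pattern, expand=False).map(keywords)

# findall collects every matching key per row (hypothetical extra column).
df["AllCategories"] = df["Subject"].str.findall(pattern).apply(
    lambda found: [keywords[k] for k in found]
)

print(df)
```

Because the result is assigned as a Series that shares the DataFrame's index, the index mismatch that comes from building a separate Python list should not arise here.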