How to extract a match plus the 10 characters before and after it in a string

Asked: 2020-03-30 17:34:44

Tags: python python-3.x python-2.7

I need to extract an exact match, together with the 10 characters before and after it, from each row of a DataFrame.


Search        Text                                                              parts
boy named     A boy named Alex.He lives in....                                A boy named ale
Jenny       Girl named Jennying as Jenny. This girl is really nice long.....  nnying as Jenny. This gir

I tried the following code:

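part = []

for index, row in df.iterrows():
    text = row['text']
    search = row['Search']

    # Number of whitespace-separated tokens equal to the search term
    c = text.lower().split().count(search.lower())
    # Position of the first case-insensitive occurrence (-1 if absent)
    idx = text.lower().find(search.lower())

    if idx < 10:
        # Fewer than 10 characters before the match: start at the beginning
        substr = text[:idx + len(search) + 10]
    else:
        substr = text[idx - 10:idx + len(search) + 10]
    part.append(substr)

df['parts'] = part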

If I use split(), I get the correct result for an exact match on a single word, but for multi-word phrases such as "boy named" the count is zero.
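The count is zero because split() breaks the text into single tokens, so a two-word phrase can never equal any one of them. One possible approach, offered only as a sketch, is a regular expression with word boundaries instead of split() and find(). The helper name `context_window` and its `width` parameter are made up for illustration, and the sketch assumes every search term starts and ends with a letter, digit, or underscore, so that `\b` behaves as expected.

```python
import re


def context_window(text, search, width=10):
    """Return the first whole-word match of `search` in `text`, plus up to
    `width` characters on each side, or None when there is no match.

    Hypothetical helper: the name and the `width` parameter are illustrative.
    """
    # \b keeps "Jenny" from matching inside "Jennying";
    # re.escape makes the search phrase match as literal text.
    pattern = r"\b" + re.escape(search) + r"\b"
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(match.start() - width, 0)  # clamp at the start of the string
    return text[start:match.end() + width]


# Why split() reports zero: the phrase is never a single token.
tokens = "A boy named Alex.He lives in....".lower().split()
phrase_count = tokens.count("boy named")  # 0

# Sample rows copied from the question.
rows = [
    ("boy named", "A boy named Alex.He lives in...."),
    ("Jenny", "Girl named Jennying as Jenny. This girl is really nice long....."),
]
parts = [context_window(text, search) for search, text in rows]
# parts[0] == "A boy named Alex.He l"
# parts[1] == "nnying as Jenny. This gir"
```

For the "Jenny" row this gives the value shown in the parts column of the table. Assuming the DataFrame columns are named Search and Text as in the table, the same helper could be applied with `df['parts'] = [context_window(t, s) for t, s in zip(df['Text'], df['Search'])]`.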

0 answers:

No answers yet