Regex to find words that mix letters and digits in text

Time: 2018-05-22 19:56:04

Tags: python regex

I have text like this:

due to previous assess c6c587469 and 4ec0f198
nearest and with fill station in the citi
becaus of our satisfact in the d4a29a already
averaging my thoughts on e977f33588f react to

I want to remove every "alpha & numeric" word, that is, every word that contains both letters and digits.

In the output, I want:

due to previous assess and 
nearest and with fill station in the citi
becaus of our satisfact in the already
averaging my thoughts on react to

I tried this, but it doesn't work:

df_colum = df_colum.str.replace('[^A-Za-z0-9\s]+', '')

Any regex experts?

Thanks

4 answers:

Answer 0: (score: 1)

Try this regex:


Answer 1: (score: 0)

Here is one way without a regex:

def parser(x):
    # Keep only the whitespace-separated tokens that contain no digit.
    return ' '.join(i for i in x.split() if not any(c.isdigit() for c in i))

df['text'] = df['text'].apply(parser)

print(df)

                                        text
0                 due to previous assess and
1  nearest and with fill station in the citi
2     becaus of our satisfact in the already
3          averaging my thoughts on react to
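Note that this approach drops every token containing a digit, so a purely numeric word such as "101" would be removed too. If only tokens that mix letters and digits should go, a variant along the following lines might work. This is a minimal sketch; the function name drop_mixed_tokens and the second sample sentence are made up for illustration.

```python
def drop_mixed_tokens(line):
    """Remove whitespace-separated tokens containing both a letter and a digit."""
    kept = []
    for token in line.split():
        has_digit = any(c.isdigit() for c in token)
        has_alpha = any(c.isalpha() for c in token)
        # Pure words and pure numbers are kept; only mixed tokens are dropped.
        if not (has_digit and has_alpha):
            kept.append(token)
    return " ".join(kept)


print(drop_mixed_tokens("due to previous assess c6c587469 and 4ec0f198"))
# due to previous assess and
print(drop_mixed_tokens("room 101 was d4a29a already"))
# room 101 was already
```

With a DataFrame it could be applied the same way as parser, for example df['text'].apply(drop_mixed_tokens).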

Answer 2: (score: 0)

This should work:

df_colum = df_colum.str.replace(r'(?:[0-9][^ ]*[A-Za-z][^ ]*)|(?:[A-Za-z][^ ]*[0-9][^ ]*)', '', regex=True)

An explanation of the regex can be found here
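The pattern removes the mixed tokens but leaves the spaces around them, so the result can contain doubled or trailing spaces. The sketch below tries the same pattern with the standard re module on plain strings and then normalizes the whitespace; the names MIXED and strip_mixed are hypothetical helpers, not part of the answer.

```python
import re

# Same pattern as in the answer: a space-free run that contains
# a digit followed later by a letter, or a letter followed later by a digit.
MIXED = re.compile(r"(?:[0-9][^ ]*[A-Za-z][^ ]*)|(?:[A-Za-z][^ ]*[0-9][^ ]*)")


def strip_mixed(line):
    without_tokens = MIXED.sub("", line)
    # Collapse the leftover double spaces and trim the ends.
    return " ".join(without_tokens.split())


print(strip_mixed("becaus of our satisfact in the d4a29a already"))
# becaus of our satisfact in the already
```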

Answer 3: (score: 0)

You can look for the position where a digit meets a letter (\d[a-z] or [a-z]\d) and then match to the end of the word:

(?i)\b(?:[a-z]+\d+|\d+[a-z]+)\w*\b *

Live demo

  • (?i) enables case-insensitive matching
  • (?:...) builds a non-capturing group
  • \b denotes a word boundary

Python code:

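import re

# text holds the input string; (?i) keeps the match case-insensitive, as in the pattern above
re.sub(r"(?i)\b(?:[a-z]+\d+|\d+[a-z]+)\w*\b *", "", text)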
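To check the pattern against the sample text from the question, something like the following sketch could be used. The names PATTERN, lines and cleaned are illustrative, and the rstrip() call is an addition here, since a match at the end of a line leaves the preceding space behind.

```python
import re

# (?i) is placed at the start so that upper-case tokens are matched as well.
PATTERN = re.compile(r"(?i)\b(?:[a-z]+\d+|\d+[a-z]+)\w*\b *")

lines = [
    "due to previous assess c6c587469 and 4ec0f198",
    "nearest and with fill station in the citi",
    "becaus of our satisfact in the d4a29a already",
    "averaging my thoughts on e977f33588f react to",
]

# Remove each mixed token together with the spaces that follow it,
# then trim the space left before a token removed at the end of a line.
cleaned = [PATTERN.sub("", line).rstrip() for line in lines]

for line in cleaned:
    print(line)
# due to previous assess and
# nearest and with fill station in the citi
# becaus of our satisfact in the already
# averaging my thoughts on react to
```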