How to search for strings like \x60\xe2\x4b (indicating an emoticon) using regular expressions in Python

Time: 2018-05-12 05:05:31

Tags: python regex emoticons

import re

string="b'@DerkGently @seanferg85 @Umbertobaggio @EL4JC and he already had Popular support.. most people know this already. A\xe2\x80\xa6 '"

print(re.findall(r"\x[0-9a-z]{2}",string))

The list returned by findall() is empty :(

4 answers:

Answer 0 (score: 2)

The problem here is that your string is the Python representation of a Python bytes object, which is pretty much useless.

Most likely, you had a bytes object, like this:

b = b'@DerkGently @seanferg85 @Umbertobaggio @EL4JC and he already had Popular support.. most people know this already. A\xe2\x80\xa6 '

...and you converted it to a string, like this:

s = str(b)

Don't do that. Instead, decode it:

s = b.decode('utf-8')

That gives you the actual characters, which you can then match easily, instead of trying to match the characters in the string representation of the bytes representation and then laboriously reconstructing the actual characters from the results.

However, it's worth noting that \xe2\x80\xa6 is not an emoji; it is the horizontal ellipsis character, …. If that isn't what you wanted, your data was already corrupted before this point.
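As a minimal sketch of the difference between str(b) and b.decode('utf-8'), assuming a shortened stand-in for the tweet bytes (the sample value and the variable names as_repr, as_text and ellipsis_hits are illustrative, not from the question):

```python
import re

# Shortened stand-in for the tweet bytes in the question (an assumption).
b = b'most people know this already. A\xe2\x80\xa6 '

as_repr = str(b)              # text of the bytes literal, backslashes and all
as_text = b.decode('utf-8')   # the actual characters

# In the repr, the ellipsis survives only as the literal text \xe2\x80\xa6.
print(as_repr)

# After decoding, it is a single character that is easy to match.
ellipsis_hits = re.findall('\u2026', as_text)
print(ellipsis_hits)
```

The three bytes e2 80 a6 collapse into one character after decoding, which is why the decoded string is two characters shorter than the bytes object.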

Answer 1 (score: 0)

Not a regular expression as such, but this may help you.

def emojis(s):
    # Keep characters in the Emoticons block, U+1F600 through U+1F64F inclusive.
    return [c for c in s if 0x1F600 <= ord(c) <= 0x1F64F]

print(emojis("hello world "))  # sample usage
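The same range check can also be written as a regular expression character class. This is only a sketch, and it assumes that "emoji" means just the Unicode Emoticons block (U+1F600 to U+1F64F); the names EMOTICON_RE and find_emoticons and the sample string are made up for illustration:

```python
import re

# Assumption: only the Emoticons block, U+1F600..U+1F64F, counts as an emoji.
EMOTICON_RE = re.compile('[\U0001F600-\U0001F64F]')

def find_emoticons(text):
    """Return every character of text that falls in the Emoticons block."""
    return EMOTICON_RE.findall(text)

# Illustrative sample: one emoticon plus the ellipsis from the question.
sample = 'hello \U0001F600 world \u2026'
found = find_emoticons(sample)
print(found)
```

The ellipsis U+2026 lies outside that block, so it is not reported. Many emoji live in other blocks, so a real filter would likely need more ranges.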

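Answer 2 (score: 0)

You need re.compile(ur'A\xe2\x80\xa6', re.UNICODE)

Compile a Unicode regular expression and use that pattern for find, findall, sub, and so on. Note that the ur'' prefix is Python 2 syntax; Python 3 rejects it.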
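In Python 3, where str patterns are Unicode-aware by default, a roughly equivalent approach might look like the sketch below. It shows two options: a bytes pattern applied to the raw bytes, and a str pattern applied after decoding. The shortened sample value and the variable names are assumptions for illustration:

```python
import re

# Hypothetical shortened sample: 'A' followed by the UTF-8 bytes of U+2026.
raw = b'know this already. A\xe2\x80\xa6 '

# Option 1: a bytes pattern matches the raw bytes directly.
bytes_pattern = re.compile(rb'A\xe2\x80\xa6')
bytes_match = bytes_pattern.search(raw)

# Option 2: decode first, then use a str pattern with the real character.
text_pattern = re.compile('A\u2026')
text_match = text_pattern.search(raw.decode('utf-8'))

print(bytes_match.group(), text_match.group())
```

A bytes pattern can only be used on bytes and a str pattern only on str; mixing the two raises a TypeError.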

Answer 3 (score: 0)

Try this. I appended the string from your question title to your sample string to make the final search string.

import re

k = r"@DerkGently @seanferg85 @Umbertobaggio @EL4JC and he already had Popular support.. most people know this already. A\xe2\x80\xa6 for a string like \x60\xe2\x4b(indicating a emoticon) using regular expression in python"
print(k)
print()
p = re.findall(r"((\\x[a-z0-9]{1,}){1,})", k)
for each in p:
    print(each[0])

Output

@DerkGently @seanferg85 @Umbertobaggio @EL4JC and he already had Popular support.. most people know this already. A\xe2\x80\xa6 for a string like \x60\xe2\x4b(indicating a emoticon) using regular expression in python

\xe2\x80\xa6
\x60\xe2\x4b
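If the text really does contain literal backslash escapes like these, the matched runs can be turned back into characters. The following is a sketch, not part of the original answer: the helper name decode_escapes, the stricter two-hex-digit pattern, the shortened search string, and the choice of errors='replace' are all assumptions.

```python
import re

def decode_escapes(text):
    """Find runs of literal \\xNN escapes and decode each run as UTF-8."""
    results = []
    for run in re.findall(r'(?:\\x[0-9a-fA-F]{2})+', text):
        # Strip the "\x" markers and parse the remaining hex digits as bytes.
        raw = bytes.fromhex(run.replace('\\x', ''))
        # errors='replace' is an assumption: invalid bytes become U+FFFD.
        results.append((run, raw.decode('utf-8', errors='replace')))
    return results

# Shortened version of the search string used above (an assumption).
k = r"already. A\xe2\x80\xa6 for a string like \x60\xe2\x4b(indicating a emoticon)"
decoded = decode_escapes(k)
for run, chars in decoded:
    print(run, '->', chars)
```

The first run decodes to the horizontal ellipsis. The second, \x60\xe2\x4b, is not valid UTF-8, so part of it comes back as the replacement character.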