I want to remove the text between a closing parenthesis and a string, but I keep getting this error: sre_constants.error: unbalanced parenthesis at position 8
import re
s = 'TEXT1 ) something something TEXT2'
test= re.sub(r'(?<=^)\b).*(?=\bTEXT2)',' ',s)
print(test)
I want to keep the identifiers, so that my output looks like this:
"TEXT1 ) TEXT2"
Answer 0 (score: 3)
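The main problem is that you did not escape the ), which is what causes the error. You can do almost all of the work with a lookbehind and a lookahead: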
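import re
s = 'TEXT1 ) something something TEXT2'
# (?<=\)) requires a literal ")" just before the match;
# (?=\bTEXT2) requires TEXT2 just after it. Neither is consumed.
test = re.sub(r'(?<=\)).*(?=\bTEXT2)', ' ', s)
print(test)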
Result:
TEXT1 ) TEXT2
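To make the cause of the error concrete, here is a minimal sketch that compiles both patterns and reports what happens. The helper name compile_error is hypothetical and only used for illustration; re.error is the public name of the exception shown in the traceback.

```python
import re

# Pattern from the question: the ")" at index 8 has no matching "(".
broken = r'(?<=^)\b).*(?=\bTEXT2)'
# Escaped version from the answer: "\)" matches a literal ")".
fixed = r'(?<=\)).*(?=\bTEXT2)'


def compile_error(pattern):
    """Return the re.error raised while compiling pattern, or None."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return exc
    return None


err = compile_error(broken)
print(err)                   # the message names the offending position
print(compile_error(fixed))  # None: the escaped pattern compiles
```

The exception object also exposes the offending index as err.pos, which can help when locating the stray character in a longer pattern.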
It does not matter what the initial text is:
s = 'any text whatsoever!! ) something something TEXT2'
print(re.sub(r'(?<=\)).*(?=\bTEXT2)', ' ', s))
# any text whatsoever!! ) TEXT2
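Since only the ")" and the trailing marker matter, the same idea can be wrapped in a small function. This is a sketch under assumptions: collapse_between and its parameters are hypothetical names, both delimiters are treated as literal text via re.escape, and the \b word boundary from the answer is left out so that non-word end markers also work.

```python
import re


def collapse_between(text, start, end, filler=' '):
    """Replace everything between literal start and end with filler.

    Hypothetical helper: start and end are escaped, so characters such
    as ")" need no manual backslash. The match is greedy, so it runs up
    to the LAST occurrence of end on the line.
    """
    pattern = r'(?<=' + re.escape(start) + r').*(?=' + re.escape(end) + r')'
    return re.sub(pattern, filler, text, count=1)


print(collapse_between('TEXT1 ) something something TEXT2', ')', 'TEXT2'))
print(collapse_between('any text whatsoever!! ) something something TEXT2',
                       ')', 'TEXT2'))
```

Escaping the delimiters programmatically avoids the unbalanced-parenthesis error altogether, at the cost of no longer being able to pass regex syntax as a delimiter.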
Answer 1 (score: 2)
This looks like what you need:
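import re
s = 'TEXT1 ) something something TEXT2'
# Alternative: anchor on the leading word and match lazily
# test = re.sub(r'(^\w+\s*\))(.*?)(?=\bTEXT2)', r'\1 ', s)
# Capture the ")" and put it back with \1, followed by a single space
test = re.sub(r'(\)).*(?=\bTEXT2)', r'\1 ', s)
print(test)
Output:
TEXT1 ) TEXT2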
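As a quick check, the sketch below compares the lookbehind version with the capture-group version, and then shows how greedy .* and lazy .*? differ. The second input string is a made-up example (not from the question) in which TEXT2 appears twice.

```python
import re

s = 'TEXT1 ) something something TEXT2'
lookaround = re.sub(r'(?<=\)).*(?=\bTEXT2)', ' ', s)
captured = re.sub(r'(\)).*(?=\bTEXT2)', r'\1 ', s)
print(lookaround == captured)  # both approaches give the same string here

# Hypothetical input with the end marker appearing twice.
s2 = 'TEXT1 ) a TEXT2 b TEXT2'
greedy = re.sub(r'(\)).*(?=\bTEXT2)', r'\1 ', s2)   # runs to the last TEXT2
lazy = re.sub(r'(\)).*?(?=\bTEXT2)', r'\1 ', s2)    # stops at the first TEXT2
print(greedy)
print(lazy)
```

Which quantifier is appropriate depends on whether the text up to the first or the last TEXT2 should be dropped; for the single-marker input in the question the two behave identically.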