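I want to replace some of the "," characters in my text. I don't want to replace all of them, because the text is a CSV file. So I wrote a regular expression that identifies the text containing the unwanted commas. My regex101 link is here: http://regex101.com/r/vF2iO5
It identifies my text correctly: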
"_id" : "Java code PMD Complains about Cyclomatic Complexity , of 20", "tags" : "java performance tuning pmd", "title" : "Java code PMD Complains about Cyclomatic Complexity , of 20", "results" : true, "value" : true, "processed" : true, "tokenGenerated" : [ "java", "code", "pmd", "complains" ]
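It finds the text in the keys "_id" and "title", each of which contains a comma. Now I want to replace these two commas in my text with some other symbol, e.g. "@@@". How can I do that?
My regex is: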
\"[(\w)(\s)]+ (\,) [(\w)(\s)]+\"
Edit
I tried the following with re.sub in Python, but what should I write for the replacement part?
re.sub(r'(\"[(\w)(\s)]+\,[(\w)(\s)]+\")',r'\0',str(text))
Answer 0: (score: 0)
You can do this with re.sub:
>>> import re
>>> s = '''"_id" : "Java code PMD Complains about Cyclomatic Complexity , of 20", "tags" : "java performance tuning pmd", "title" : "Java code PMD Complains about Cyclomatic Complexity , of 20", "results" : true, "value" : true, "processed" : true, "tokenGenerated" : [ "java", "code", "pmd", "complains" ]'''
>>> print(re.sub(r'(\"[(\w)(\s)]+ )(,)( [(\w)(\s)]+\")', '\\1@@@\\3', s))
"_id" : "Java code PMD Complains about Cyclomatic Complexity @@@ of 20", "tags" : "java performance tuning pmd", "title" : "Java code PMD Complains about Cyclomatic Complexity @@@ of 20", "results" : true, "value" : true, "processed" : true, "tokenGenerated" : [ "java", "code", "pmd", "complains" ]
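The replacement '\\1@@@\\3' works because the pattern splits each match into three capture groups. The following is a minimal sketch on a shortened sample string (the sample and the variable names are illustrative, not from the question) that prints nothing but the final result and keeps the groups in variables so they can be inspected:

```python
import re

# Shortened sample in the same shape as the question's data (illustrative only).
s = '"_id" : "Cyclomatic Complexity , of 20", "tags" : "java pmd"'

# Same pattern as in the answer: group 1 = quoted text before the comma,
# group 2 = the comma itself, group 3 = quoted text after the comma.
pattern = re.compile(r'(\"[(\w)(\s)]+ )(,)( [(\w)(\s)]+\")')

m = pattern.search(s)
before, comma, after = m.groups()

# \g<1> and \g<3> are the explicit spelling of \1 and \3 in a replacement.
result = pattern.sub(r'\g<1>@@@\g<3>', s)
print(result)
```

Groups 1 and 3 are written back unchanged, so only the comma in group 2 is swapped for the marker.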
Answer 1: (score: 0)
You can use sub:
>>> re.sub(r'(\"[(\w)(\s)]+)(,)([(\w)(\s)]+\")', '@@@', s)
'"_id" : @@@, "tags" : "java performance tuning pmd", "title" : @@@, "results" : true, "value" : true, "processed" : true, "tokenGenerated" : [ "java", "code", "pmd", "complains" ]'
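As the output shows, passing '@@@' alone replaces the whole matched quoted string, not just the comma. If only the commas should change, one alternative (a different technique from the answers, sketched here under the assumption that quoted values never contain escaped double quotes) is to match each quoted string and pass a function as the replacement. The helper name `mask_commas_in_quotes` is hypothetical:

```python
import re

def mask_commas_in_quotes(text, marker="@@@"):
    """Replace commas only inside double-quoted strings (hypothetical helper)."""
    # "[^"]*" matches one quoted string at a time, so commas between
    # quoted strings are never part of a match. Assumes no escaped quotes.
    return re.sub(r'"[^"]*"', lambda m: m.group(0).replace(",", marker), text)

# Shortened sample in the same shape as the question's data (illustrative only).
s = '"_id" : "Complexity , of 20", "results" : true, "tokenGenerated" : [ "java", "code" ]'
masked = mask_commas_in_quotes(s)
print(masked)
```

Unlike the single-comma patterns, this also handles a quoted value that contains several commas, and it leaves the separators between fields untouched.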