I have a string:
string= """"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,3)","name":"Finance","$type":"voyager.identity.profile.Skill"},{"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,22)","name":"Financial ["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,34)","name":"Due Diligence","name":"Strategy""""
Which regular expression can I use to retrieve the values after "name", so that I get Due Diligence, Finance and Financial?
I tried:
match = re.compile(r'"name"\:(.\w+)')
match.findall(string)
But it returns:
['"Finance', '"Financial', '"Due', '"Financial', '"Strategy']
Due Diligence
gets split, and I want the two words kept together as one value.
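Answer 0: (score: 1)
The regular expression stops at the space because \w
matches only word characters (letters, digits and underscore), not whitespace.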
"name"\:(.\w+\s*\w*)
allows for optional whitespace followed by one more word (it will not work for three-word names, but it works in your case).
"name"\:(.\w+\s*\w*"?)
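also captures the closing quote " at the end of each match,
but it will not capture a closing quote for Financial, because that value is not followed by one in your string.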
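To make the difference concrete, here is a minimal sketch that runs the question's pattern and the first pattern above against a shortened sample. The sample string is hypothetical and only mimics the shape of the data in the question.

```python
import re

# Hypothetical, shortened sample in the same shape as the question's data.
sample = '"name":"Finance","name":"Due Diligence","name":"Strategy"'

# \w+ stops at the first space, so "Due Diligence" is cut short.
one_word = re.findall(r'"name":(.\w+)', sample)
print(one_word)    # ['"Finance', '"Due', '"Strategy']

# Optional whitespace plus one more word keeps two-word names whole.
two_words = re.findall(r'"name":(.\w+\s*\w*)', sample)
print(two_words)   # ['"Finance', '"Due Diligence', '"Strategy']
```

The leading . in both patterns matches the opening quote, which is why every result starts with ".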
Edit: fixed the second regular expression for "Financial".
Answer 1: (score: 0)
I would use a non-greedy .*?
expression followed by a terminating quote (or opening bracket):
import re
string = """$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,3)","name":"Finance","$type":"voyager.identity.profile.Skill"},{"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,22)","name":"Financial ["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,34)","name":"Due Diligence","name":"Strategy"""
# With the leading double quote
match = re.compile(r'"name"\:(".*?)["\[]')
a = match.findall(string)
print(a)
# Stripping out the leading double quote
match = re.compile(r'"name"\:"(.*?)["\[]')
b = match.findall(string)
print(b)
The final output is:
['"Finance', '"Financial ', '"Due Diligence']
['Finance', 'Financial ', 'Due Diligence']
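As a possible variation (a sketch, not part of the answer above), a negated character class can replace the non-greedy match, and str.strip() can remove the trailing space left on 'Financial '. The sample string is a hypothetical, shortened stand-in for the original, and the names pattern and names are introduced here only for illustration.

```python
import re

# Hypothetical, shortened sample shaped like the string above; like the
# original, it is truncated and ends without a closing quote.
sample = ('"name":"Finance","$type":"voyager.identity.profile.Skill"},'
          '{"name":"Financial ["standardizedSkillUrn"],'
          '"name":"Due Diligence","name":"Strategy')

# [^"\[]* consumes characters up to (not including) the next " or [,
# or up to the end of the string if neither occurs.
pattern = re.compile(r'"name":"([^"\[]*)')

# strip() drops the trailing space left on 'Financial '.
names = [value.strip() for value in pattern.findall(sample)]
print(names)   # ['Finance', 'Financial', 'Due Diligence', 'Strategy']
```

Because this pattern does not require a terminating character, it also returns the final, unterminated value (Strategy), which the non-greedy version skips. Whether that is wanted depends on the data.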