Question

I have a string:

string= """"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,3)","name":"Finance","$type":"voyager.identity.profile.Skill"},{"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,22)","name":"Financial ["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,34)","name":"Due Diligence","name":"Strategy""""

Which regular expression can I use to retrieve the values that follow "name", i.e. to get Due Diligence, Finance and Financial?

I tried:

match = re.compile(r'"name"\:(.\w+)')
match.findall(string)

but it returns:

['"Finance', '"Financial', '"Due', '"Financial', '"Strategy']

"Due Diligence" gets split; I want the two words kept together as one value.

Answer 1

The regex does not match your space because \w only matches word characters (letters, digits and underscore), not whitespace.

"name"\:(.\w+\s*\w*) allows for optional whitespace followed by one more word (it will not work for three words, but it works in your case).

"name"\:(.\w+\s*\w*"?) also captures the closing quote " at the end of each value, but it will not get a closing quote for "Financial".

Edit: fixed the second regex for "Financial".
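
To see the difference between the asker's original pattern and the first pattern in this answer, here is a minimal sketch. The shortened sample string and the variable names are assumptions made for illustration; they are not taken from the question.

```python
import re

# Hypothetical, shortened sample with one-, two- and three-word values.
sample = '"name":"Finance","name":"Due Diligence","name":"Mergers and Acquisitions"'

# Original attempt: \w+ stops at the first space, so multi-word values are cut.
first_try = re.findall(r'"name":(.\w+)', sample)
print(first_try)   # ['"Finance', '"Due', '"Mergers']

# Optional whitespace plus one more word keeps two-word values whole,
# but a three-word value is still truncated after the second word.
two_words = re.findall(r'"name":(.\w+\s*\w*)', sample)
print(two_words)   # ['"Finance', '"Due Diligence', '"Mergers and']
```

The leading quote is part of each result because the `.` inside the group consumes it.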

Answer 2

I would use a non-greedy .*? expression followed by the trailing quote:

import re

string = """$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,3)","name":"Finance","$type":"voyager.identity.profile.Skill"},{"$deletedFields":["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,22)","name":"Financial ["standardizedSkillUrn","standardizedSkill"],"entityUrn":"urn:li:fs_skill:(ACoAAAIv9SQBMzclPm3CZzL1QceTH5W0VrsdxbE,34)","name":"Due Diligence","name":"Strategy"""

# With the leading double quote
match = re.compile(r'"name"\:(".*?)["\[]')
a = match.findall(string)
print(a)

# Stripping out the leading double quote
match = re.compile(r'"name"\:"(.*?)["\[]')
b = match.findall(string)
print(b)

The final output is:

['"Finance', '"Financial ', '"Due Diligence']
['Finance', 'Financial ', 'Due Diligence']
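
Note that "Strategy" is missing from both lists: in the string above the last value has no closing quote, so the non-greedy pattern never reaches a terminator for it. One possible way to cover that case is sketched below, assuming it is acceptable to treat the end of the string as a terminator as well. The shortened sample and the names `pattern` and `values` are illustrative and not part of the original answer.

```python
import re

# Hypothetical, shortened sample: one value is cut off by "[" and the
# last value has no closing quote, mirroring the shape of the real string.
sample = '"name":"Finance","name":"Financial ["x"],"name":"Due Diligence","name":"Strategy'

# Lazily capture up to a quote, an opening bracket, or the end of the string.
pattern = re.compile(r'"name":"(.*?)(?:["\[]|$)')

# strip() removes the trailing space left on the truncated "Financial " value.
values = [v.strip() for v in pattern.findall(sample)]
print(values)   # ['Finance', 'Financial', 'Due Diligence', 'Strategy']
```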

Python RegEx: get the words after a specific string
