Question

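I cannot get the correct named captures to display in my dictionary function. My program reads a .txt file and then converts the text in that file into a dictionary. I already have the correct regex to capture the values.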

This is my File.txt:

file Science/Chemistry/Quantum 444 1
file Marvel/CaptainAmerica 342 0
file DC/JusticeLeague/Superman 300 0
file Math 333 0
file Biology 224 1

Here is a regex link that captures what I want:

In the link, the groups I want to display are highlighted in green and orange.

This part of my code works:

import re

rx = re.compile(r'file (?P<path>.*?)( |\/.*?)? (?P<views>\d+).+')
i = rx.match(data)  # 'data' is from the .txt file
x = (i.group(1), i.group(3))
print(x)

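But since I turn the .txt into a dictionary, I do not know how to use .group(1) or .group(3) as keys specifically for a display function. I do not know how to display those groups when I use print("Title: %s | Number: %s" % (key[1], key[3])). I hope someone can help me implement this in my dictionary function.

This is my dictionary function: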

def create_dict(data):
    dictionary = {}
    for line in data:
      line_pattern = re.findall(r'file (?P<path>.*?)( |\/.*?)? (?P<views>\d+).+', line)
      dictionary[line] = line_pattern
      content = dictionary[line]
      print(content)
    return dictionary

I am trying to make the output from the text file look like this:

Science 444
Marvel 342
DC 300
Math 333
Biology 224

Answer 1

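You already use named groups in 'line_pattern'; simply put them into your dictionary. re.findall will not work here, because it returns a list of tuples rather than a match object, so the groups cannot be looked up by name. Also, the escape character '\' before '/' is redundant. Thus, your dictionary function would be: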

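def create_dict(data):
    dictionary = {}
    for line in data:
        line_pattern = re.search(r'file (?P<path>.*?)( |/.*?)? (?P<views>\d+).+', line)
        if line_pattern is None:
            continue  # skip lines that do not match the pattern
        # key: first path component, value: view count (as a string)
        dictionary[line_pattern.group('path')] = line_pattern.group('views')
        content = dictionary[line_pattern.group('path')]
        print(content)
    return dictionary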
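
As a minimal, self-contained sketch of the named-group approach, the same pattern can be run against the sample lines from File.txt and fed into the print format from the question. The names LINE_RE, SAMPLE_LINES and build_views_by_title are illustrative assumptions, not part of the original code.

```python
import re

# Same pattern as in this answer, compiled once. The unnamed middle group
# swallows any "/sub/path" that follows the first path component.
LINE_RE = re.compile(r'file (?P<path>.*?)( |/.*?)? (?P<views>\d+).+')

# Sample lines copied from File.txt in the question.
SAMPLE_LINES = [
    "file Science/Chemistry/Quantum 444 1",
    "file Marvel/CaptainAmerica 342 0",
    "file DC/JusticeLeague/Superman 300 0",
    "file Math 333 0",
    "file Biology 224 1",
]


def build_views_by_title(lines):
    """Map the first path component of each line to its view count."""
    result = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # ignore lines that do not look like "file <path> <n> <flag>"
        result[match.group('path')] = match.group('views')
    return result


views_by_title = build_views_by_title(SAMPLE_LINES)
for title, views in views_by_title.items():
    print("Title: %s | Number: %s" % (title, views))
```

Looking the groups up by name keeps the code independent of how many unnamed groups sit between them.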

Answer 2

You can use the following function to create a dictionary and populate it with the file data:

def create_dict(data):
    dictionary = {}
    for line in data:
        m = re.search(r'file\s+([^/\s]*)\D*(\d+)', line)
        if m:
            dictionary[m.group(1)] = m.group(2)
    return dictionary

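Basically, it does the following:

Defines the dictionary dictionary
Reads data line by line
Searches for a file\s+([^/\s]*)\D*(\d+) match and, if there is one, forms a dictionary key-value pair from the two capture group values.

The regex I suggest is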

file\s+([^/\s]*)\D*(\d+)

See the Regulex graph for an explanation of it.

Then, you can use it like this:
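
The same breakdown can also be sketched directly in code with re.VERBOSE, which allows a comment on each part of the pattern. The name PATTERN and the single test line are assumptions made for illustration.

```python
import re

# The suggested regex, split across lines with one comment per component.
PATTERN = re.compile(r"""
    file        # the literal word "file"
    \s+         # one or more whitespace characters
    ([^/\s]*)   # group 1: everything up to the first "/" or whitespace
    \D*         # skip non-digit characters (rest of the path, spaces)
    (\d+)       # group 2: the first run of digits, i.e. the view count
""", re.VERBOSE)

match = PATTERN.search("file Science/Chemistry/Quantum 444 1")
title, views = match.groups()
print(title, views)  # Science 444
```

Because \D* stops at the first digit it meets, a digit inside a later path component (for example a/b2) would be captured as the count, so this pattern assumes that only the first path component may contain digits.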

res = {}
with open(filepath, 'r') as f:
    res = create_dict(f)
print(res)

See the Python demo.
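
For a run that does not need a file on disk, here is a rough sketch that feeds the sample data through create_dict and prints it in the format asked for in the question. io.StringIO stands in for the opened file, and fake_file and output_lines are hypothetical names.

```python
import io
import re


def create_dict(data):
    # Same function as in this answer, repeated so the sketch runs on its own.
    dictionary = {}
    for line in data:
        m = re.search(r'file\s+([^/\s]*)\D*(\d+)', line)
        if m:
            dictionary[m.group(1)] = m.group(2)
    return dictionary


# io.StringIO iterates line by line, just like the object returned by open().
fake_file = io.StringIO(
    "file Science/Chemistry/Quantum 444 1\n"
    "file Marvel/CaptainAmerica 342 0\n"
    "file DC/JusticeLeague/Superman 300 0\n"
    "file Math 333 0\n"
    "file Biology 224 1\n"
)

res = create_dict(fake_file)

# One "<title> <views>" line per entry, in the order the lines were read.
output_lines = ["%s %s" % (title, views) for title, views in res.items()]
print("\n".join(output_lines))
```

Since the dictionary is keyed by the first path component, two lines that share that component would overwrite each other; the sample data has no such collision.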

Answer 3

This RegEx may help you divide your input into four groups, where groups 2 and 4 are the target groups; these can simply be extracted and joined with a space:

 (file\s)([A-Za-z]+(?=\/|\s))(.*)(\d{3})
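
A short sketch of how groups 2 and 4 might be pulled out in Python. The names FOUR_GROUP_RE and summary are assumptions for illustration.

```python
import re

# The four-group pattern from this answer, unchanged.
FOUR_GROUP_RE = re.compile(r'(file\s)([A-Za-z]+(?=\/|\s))(.*)(\d{3})')

line = "file Marvel/CaptainAmerica 342 0"
match = FOUR_GROUP_RE.search(line)
if match:
    # Group 2 is the first path component, group 4 the three-digit count.
    summary = " ".join([match.group(2), match.group(4)])
    print(summary)  # Marvel 342
```

The \d{3} part assumes the count always has exactly three digits, as in the sample file; with a longer number, the greedy (.*) would leave only the last three digits for group 4.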

Regex to capture groups with dictionary keys
