Question

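Because attribute names vary, I need to match the keys of key-value pairs against a regular expression.

The possible names are defined in a dictionary: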

MyAttr  = [
    ('ref_nr', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe'),
]

The imported attributes of an item, held in another dictionary:

ImportAttr  = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]

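Now, if a key is a known attribute (defined in my first dictionary, MyAttr) and matches one of the alternative spellings of that attribute, I want to return the value of the imported attribute.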

for key, value in ImportAttr:
    if key == "Referenz-Nr" : ref      = value
    if key == "Farbe"       : color    = value

The goal is to return the values of the possible attributes (if known).

print(ref)
print(color)

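If "Referenz-Nr" and "Farbe" are known attributes, their values should be returned.

Obviously this pseudocode does not work; I just can't figure out how to write a function that uses a regular expression to search the keys.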

Answer 1

It's not entirely clear to me what you need, but perhaps this is what you want:

#!/usr/bin/python3

MyAttr  = [
    ('ref_nr', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe')
]

ImportAttr  = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]

ref, color = None, None

for key, value in ImportAttr:
    if key in MyAttr[0][1].split('|'): 
        ref = value
    if key in MyAttr[1][1].split('|'): 
        color = value

print("ref: ", ref)
print("color: ", color)

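split breaks a string into a list of strings at a separator (here the "|" character); you can then check whether the key is one of the strings in that list.

The following solution is a little tricky. If you don't want to hard-code the positions in the source, you can use locals().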
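
Since the question asks for regular-expression matching, the same check can also be written with re.fullmatch, treating each "|"-separated string as a pattern. This is only a minimal sketch: it assumes the spellings contain no regex metacharacters other than "|", and the helper name match_attrs is hypothetical, not part of the original answer.

```python
import re

MyAttr = [
    ('ref_nr', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe'),
]

ImportAttr = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]


def match_attrs(known, imported):
    """Hypothetical helper: map each canonical name to the value of the
    first imported key that fully matches its spelling pattern."""
    result = {}
    for key, value in imported:
        for name, pattern in known:
            # fullmatch requires the whole key to equal one alternative,
            # so a key such as 'color2' is rejected.
            if re.fullmatch(pattern, key):
                result[name] = value
                break
    return result


found = match_attrs(MyAttr, ImportAttr)
print("ref: ", found.get('ref_nr'))
print("color: ", found.get('color'))
```

Using .get() means an attribute that was not found simply comes back as None instead of raising an error.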

#!/usr/bin/python3

MyAttr  = [
    ('ref', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe')
]

ImportAttr  = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]

ref, color = None, None

for var, names in MyAttr:
    for key, value in ImportAttr:
        if key in names.split('|'):
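            # Note: writing through locals() only takes effect at module
            # level, where locals() is the same dict as globals().
            locals()[var] = value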
            break

print("ref: ", ref)
print("color: ", color)
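
The same loop can also collect its results in an ordinary dictionary, which behaves identically at module level and inside a function. A minimal sketch under that assumption; the function name collect and the result name found are hypothetical and chosen only for illustration.

```python
MyAttr = [
    ('ref', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe'),
]

ImportAttr = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]


def collect(known, imported):
    """Hypothetical helper: return {name: value or None} for every known
    attribute, without touching locals()."""
    found = {var: None for var, _ in known}
    for var, names in known:
        spellings = names.split('|')
        for key, value in imported:
            if key in spellings:
                found[var] = value
                break  # first matching spelling wins
    return found


found = collect(MyAttr, ImportAttr)
print("ref: ", found['ref'])
print("color: ", found['color'])
```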

Answer 2

If needed, you can also solve this with pandas, which suits large datasets.

get_references_and_colors.py

import pandas as pd
import re
import json

def get_references_and_colors(lookups, attrs):
    responses = []

    refs = pd.Series(re.split(r"\|", lookups[0][0]))
    colors = pd.Series(re.split(r"\|", lookups[1][0]))
    d = {"ref": refs, "color": colors}
    # refs and colors may differ in length; replace the NaN padding with ''
    df = pd.DataFrame(d).fillna('')

    #               ref  color
    # 0       Reference  Color
    # 1        Referenz  color
    # 2     Referenz-Nr  tinta
    # 3  Referenznummer  farbe
    # 4                  Farbe

    for key, value in attrs:
        response = {}
        response["for_attr"] = key

        df2 = df.loc[df["ref"] == key]  # find in 'ref' column

        if not df2.empty:
            response["ref"] = value
        else:
            df3 = df.loc[df["color"] == key]  # find in 'color' column
            if not df3.empty:
                response["color"] = value
            else:
                response["color"] = None # Not Available
                response["ref"] = None

        responses.append(response)

    return responses


if __name__ == "__main__":

    LOOKUPS  = [
         ('Reference|Referenz|Referenz-Nr|Referenznummer', 'a'),
         ('Color|color|tinta|farbe|Farbe', 'b'),
    ]

    ATTR  = [
        ('Referenz', 'Ref-Val'),
        ('color', 'red'),
        ('color2', 'orange'), # improper
        ('tinta', 'Tinta-col')
    ]

    responses = get_references_and_colors(LOOKUPS, ATTR) # list of dicts
    pretty_response = json.dumps(responses, indent=4) # for pretty printing
    print(pretty_response)

Output

[
    {
        "for_attr": "Referenz",
        "ref": "Ref-Val"
    },
    {
        "for_attr": "color",
        "color": "red"
    },
    {
        "for_attr": "color2",
        "color": null,
        "ref": null
    },
    {
        "for_attr": "tinta",
        "color": "Tinta-col"
    }
]
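
The column-by-column search above can also be expressed as one reverse lookup from each spelling to its attribute name, which avoids hard-coding the positions of the "ref" and "color" entries. A sketch in plain Python, with loud assumptions: the second slot of each LOOKUPS tuple is taken to hold the attribute name (so 'ref' and 'color' replace the original 'a' and 'b' placeholders), and build_reverse_map and resolve are hypothetical names.

```python
LOOKUPS = [
    # Assumption: second element is the attribute name, not 'a'/'b'.
    ('Reference|Referenz|Referenz-Nr|Referenznummer', 'ref'),
    ('Color|color|tinta|farbe|Farbe', 'color'),
]

ATTR = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
    ('color2', 'orange'),  # unknown spelling
    ('tinta', 'Tinta-col'),
]


def build_reverse_map(lookups):
    """Map every spelling to its attribute name."""
    reverse = {}
    for spellings, name in lookups:
        for spelling in spellings.split('|'):
            reverse[spelling] = name
    return reverse


def resolve(lookups, attrs):
    """Build one response dict per imported attribute."""
    reverse = build_reverse_map(lookups)
    responses = []
    for key, value in attrs:
        response = {"for_attr": key}
        name = reverse.get(key)
        if name is None:
            # Unknown key: report every known attribute as not available.
            response.update({n: None for _, n in lookups})
        else:
            response[name] = value
        responses.append(response)
    return responses


responses = resolve(LOOKUPS, ATTR)
for response in responses:
    print(response)
```

Each lookup is a single dictionary access, so the cost per imported attribute does not grow with the number of spellings.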

Matching keys against different pairs in Python
