Question

I'm new to Python and haven't quite figured out how to improve this (even though I know it isn't the right way to do it).

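I have a list of elements in this format: HIGH-CPU, MID-GPU, HIGH-RAM. There are 27 possible combinations, and I want to add the family of each combination to a pandas DataFrame as a new value.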

I.e.: HIGH-CPU, MID-GPU, HIGH-RAM would be Family 1.

This is what I have, and it works, but it should be simpler in case I have to add more possible combinations tomorrow:

def match_list(row):
    fam_list = ""
    tomatch = row
    ##This CSV contains the possible 27 combinations said above.
    result = pd.read_csv("family_list.csv")

    for row in result :
        if tomatch == "HIGH-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 1"
        elif tomatch == "HIGH-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 2"           
        elif tomatch == "HIGH-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 3"         
        elif tomatch == "HIGH-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 4"            
        elif tomatch == "HIGH-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 5"
        elif tomatch == "HIGH-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 6"
        elif tomatch == "HIGH-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 7"
        elif tomatch == "HIGH-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 8"
        elif tomatch == "HIGH-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 9"
        elif tomatch == "MID-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 10"
        elif tomatch == "MID-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 11"
        elif tomatch == "MID-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 12"
        elif tomatch == "MID-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 13"
        elif tomatch == "MID-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 14"
        elif tomatch == "MID-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 15"
        elif tomatch == "MID-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 16"
        elif tomatch == "MID-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 17"
        elif tomatch == "MID-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 18"
        elif tomatch == "LOW-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 19"
        elif tomatch == "LOW-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 20"
        elif tomatch == "LOW-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 21"
        elif tomatch == "LOW-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 22"
        elif tomatch == "LOW-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 23"
        elif tomatch == "LOW-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 24"
        elif tomatch == "LOW-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 25"
        elif tomatch == "LOW-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 26"
        elif tomatch == "LOW-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 27" 
        else:
            fam_list = np.nan

    return fam_list


df['family_class'] = df['merged_cells'].apply(match_list)

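So, how can I turn this into smaller code that actually iterates over an array? I was thinking of just checking whether the value is in the array and going from there. But how can I make sure I don't create more than 27 families?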

Answer 1

Make a dict outside the function that maps the test values to the results, and use it inside the function:

combo_to_family = {
    "HIGH-CPU, MID-GPU, HIGH-RAM": "Family 1",
    "HIGH-CPU, MID-GPU, MID-RAM": "Family 2",
    ...,
    "LOW-CPU, LOW-GPU, LOW-RAM": "Family 27",
}

Inside the function, your loop simplifies to:

for row in result:
    fam_list = combo_to_family.get(tomatch, np.nan)

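(Note: your loop iterates over result, but its body only uses tomatch, which is passed in as the argument, and ignores the loop's row. I'm not sure what the goal is, but the logic needs double-checking.)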
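As a rough sketch of how the whole function could look once the lookup is a dict: the loop and the per-call `pd.read_csv` are dropped here, on the assumption that neither is needed to classify a single value (the question's code never uses the loop variable). Only the first three combinations from the question are written out, and `math.nan` stands in for `np.nan` so the snippet runs without NumPy.

```python
import math

# Built once, outside the function. Only three of the 27 entries are
# written out here to keep the sketch short.
combo_to_family = {
    "HIGH-CPU, MID-GPU, HIGH-RAM": "Family 1",
    "HIGH-CPU, MID-GPU, MID-RAM": "Family 2",
    "HIGH-CPU, MID-GPU, LOW-RAM": "Family 3",
}


def match_list(tomatch):
    # A single dict lookup replaces the elif chain; unknown
    # combinations fall back to NaN, like the original else branch.
    return combo_to_family.get(tomatch, math.nan)
```

With the full 27-entry dict in place, the `df['merged_cells'].apply(match_list)` call from the question should work unchanged.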

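Obviously, typing out the whole dict is a pain. If you're free to define the family numbering yourself, you can generate the dict programmatically with itertools: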

from itertools import product

combo_to_family = {'{}-CPU, {}-GPU, {}-RAM'.format(c, g, r): 'Family {}'.format(i)
                   for i, (c, g, r) in enumerate(product(('LOW', 'MID', 'HIGH'), repeat=3), 1)}

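The family numbers won't line up with yours, but if the exact numbering doesn't matter and you only need the families to be distinct, this generates all 27 combinations with very little effort.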
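If the numbering from the question does have to be preserved, one option (a sketch, not something the answer itself proposes) is to pass `product` three separate tuples, ordered the way the question's `elif` chain walks them: CPU as HIGH, MID, LOW; GPU as MID, HIGH, LOW; RAM as HIGH, MID, LOW. The names `CPU_LEVELS`, `GPU_LEVELS` and `RAM_LEVELS` are made up for illustration.

```python
from itertools import product

# Level orders copied from the question's elif chain (hypothetical names).
CPU_LEVELS = ("HIGH", "MID", "LOW")
GPU_LEVELS = ("MID", "HIGH", "LOW")
RAM_LEVELS = ("HIGH", "MID", "LOW")

# product() varies the rightmost tuple fastest, so RAM cycles first,
# then GPU, then CPU, which matches the order of the original chain.
combo_to_family = {
    "{}-CPU, {}-GPU, {}-RAM".format(c, g, r): "Family {}".format(i)
    for i, (c, g, r) in enumerate(product(CPU_LEVELS, GPU_LEVELS, RAM_LEVELS), 1)
}

# Three levels on each of three axes give exactly 3 * 3 * 3 keys.
assert len(combo_to_family) == 27
```

Because the dict is the only place families are defined, there is no way to end up with more than 27 of them: any value that is not one of the generated keys simply falls through to the default in `.get`.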

Python: improving an array search
