Python: improving an array search

Asked: 2018-05-31 03:59:06

Tags: python arrays list

I'm new to Python and can't quite find a way to improve this (even though I know it isn't the right way to do it).

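I have a list of elements in this format: HIGH-CPU, MID-GPU, HIGH-RAM. There are 27 possible combinations, and I want to add the family of each combination to a pandas DataFrame as a new value.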

I.e., HIGH-CPU, MID-GPU, HIGH-RAM would be Family 1.

This is what I have working, but it should be simpler in case I have to add more possible combinations tomorrow:

import numpy as np
import pandas as pd

def match_list(row):
    fam_list = ""
    tomatch = row
    # This CSV contains the 27 possible combinations described above.
    result = pd.read_csv("family_list.csv")

    for row in result :
        if tomatch == "HIGH-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 1"
        elif tomatch == "HIGH-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 2"           
        elif tomatch == "HIGH-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 3"         
        elif tomatch == "HIGH-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 4"            
        elif tomatch == "HIGH-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 5"
        elif tomatch == "HIGH-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 6"
        elif tomatch == "HIGH-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 7"
        elif tomatch == "HIGH-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 8"
        elif tomatch == "HIGH-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 9"
        elif tomatch == "MID-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 10"
        elif tomatch == "MID-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 11"
        elif tomatch == "MID-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 12"
        elif tomatch == "MID-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 13"
        elif tomatch == "MID-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 14"
        elif tomatch == "MID-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 15"
        elif tomatch == "MID-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 16"
        elif tomatch == "MID-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 17"
        elif tomatch == "MID-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 18"
        elif tomatch == "LOW-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 19"
        elif tomatch == "LOW-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 20"
        elif tomatch == "LOW-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 21"
        elif tomatch == "LOW-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 22"
        elif tomatch == "LOW-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 23"
        elif tomatch == "LOW-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 24"
        elif tomatch == "LOW-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 25"
        elif tomatch == "LOW-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 26"
        elif tomatch == "LOW-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 27" 
        else:
            fam_list = np.nan

    return fam_list


df['family_class'] = df['merged_cells'].apply(match_list)

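So, how can I turn this into shorter code that actually iterates over an array? I was thinking of just checking whether the value is in the array and going from there. But how can I make sure I don't create more than 27 families?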

1 Answer:

Answer 0: (score: 1)

Make a dict outside the function that maps the test values to their results, and use it inside the function:

combo_to_family = {"HIGH-CPU, MID-GPU, HIGH-RAM": "Family 1",
                   "HIGH-CPU, MID-GPU, MID-RAM": "Family 2",
                   ...,
                   "LOW-CPU, LOW-GPU, LOW-RAM": "Family 27",
                   }

Inside the function, your loop simplifies to:

for row in result :
    fam_list = combo_to_family.get(tomatch, np.nan)

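(Note: you loop over result, but only ever use tomatch, which was passed in as the argument, and ignore the loop's row. I'm not sure what the goal is, but the logic needs double-checking.)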
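If the loop and the CSV read turn out to be unnecessary, the whole function could collapse to a single lookup. The following is only a minimal sketch under that assumption: it lists just three of the 27 entries, and it uses math.nan (the same float NaN value as np.nan) so that it runs without third-party packages.

```python
import math

# Only three of the 27 entries are listed to keep the sketch short.
combo_to_family = {
    "HIGH-CPU, MID-GPU, HIGH-RAM": "Family 1",
    "HIGH-CPU, MID-GPU, MID-RAM": "Family 2",
    "LOW-CPU, LOW-GPU, LOW-RAM": "Family 27",
}

def match_list(tomatch):
    # One dict lookup replaces both the loop and the if/elif chain.
    # Unknown combinations fall back to NaN, like the original else branch.
    return combo_to_family.get(tomatch, math.nan)

# Example inputs; "NO-SUCH-COMBO" is a made-up value that is not in the dict.
labels = [match_list(c) for c in ("HIGH-CPU, MID-GPU, MID-RAM", "NO-SUCH-COMBO")]
```

With this version, the df['merged_cells'].apply(match_list) call from the question would work unchanged. Since a pandas Series.map call also accepts a dict and yields NaN for keys that are missing, df['merged_cells'].map(combo_to_family) should give the same result without the helper function.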

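Obviously, typing out the whole dict is a pain. If you are free to define the family numbers yourself, you can use itertools to generate the dict programmatically: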

from itertools import product

combo_to_family = {'{}-CPU, {}-GPU, {}-RAM'.format(c, g, r): 'Family {}'.format(i)
                   for i, (c, g, r) in enumerate(product(('LOW', 'MID', 'HIGH'), repeat=3), 1)}

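The family numbers won't match yours, but if the exact numbering doesn't matter and the families only need to be distinct, this generates all 27 combinations with little effort.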
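If the original numbering does matter, one possible variation is to give product a separately ordered tuple for each component. The sketch below assumes the ordering that can be read off the question's if/elif chain (CPU: HIGH, MID, LOW; GPU: MID, HIGH, LOW; RAM: HIGH, MID, LOW). The names CPU_LEVELS, GPU_LEVELS and RAM_LEVELS are made up for illustration.

```python
from itertools import product

# Level orderings inferred from the question's if/elif chain (an assumption).
CPU_LEVELS = ("HIGH", "MID", "LOW")
GPU_LEVELS = ("MID", "HIGH", "LOW")
RAM_LEVELS = ("HIGH", "MID", "LOW")

# product varies the last component fastest, so RAM cycles within each GPU
# level, and GPU cycles within each CPU level.
combo_to_family = {
    "{}-CPU, {}-GPU, {}-RAM".format(c, g, r): "Family {}".format(i)
    for i, (c, g, r) in enumerate(product(CPU_LEVELS, GPU_LEVELS, RAM_LEVELS), 1)
}

# Guard against accidentally creating more (or fewer) than 27 families.
assert len(combo_to_family) == 27
```

Because the dict is built from three levels per component, it can never hold more than 3 × 3 × 3 = 27 keys. Adding a level to any tuple would extend the families automatically, at which point the assertion would need updating.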