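I'm new to Python and haven't quite figured out how to improve this (even though I know it isn't the right way to do it).
I have a list of elements in this format: HIGH-CPU, MID-GPU, HIGH-RAM. There are 27 possible combinations, and for each combination I want to add a family label to a pandas DataFrame as a new value.
I.e., HIGH-CPU, MID-GPU, HIGH-RAM would be Family 1.
This is what I have working, but it should be simpler in case I have to add more possible combinations tomorrow: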
def match_list(row):
    fam_list = ""
    tomatch = row
    ## This CSV contains the 27 possible combinations mentioned above.
    result = pd.read_csv("family_list.csv")
    for row in result:
        if tomatch == "HIGH-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 1"
        elif tomatch == "HIGH-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 2"
        elif tomatch == "HIGH-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 3"
        elif tomatch == "HIGH-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 4"
        elif tomatch == "HIGH-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 5"
        elif tomatch == "HIGH-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 6"
        elif tomatch == "HIGH-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 7"
        elif tomatch == "HIGH-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 8"
        elif tomatch == "HIGH-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 9"
        elif tomatch == "MID-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 10"
        elif tomatch == "MID-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 11"
        elif tomatch == "MID-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 12"
        elif tomatch == "MID-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 13"
        elif tomatch == "MID-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 14"
        elif tomatch == "MID-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 15"
        elif tomatch == "MID-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 16"
        elif tomatch == "MID-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 17"
        elif tomatch == "MID-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 18"
        elif tomatch == "LOW-CPU, MID-GPU, HIGH-RAM":
            fam_list = "Family 19"
        elif tomatch == "LOW-CPU, MID-GPU, MID-RAM":
            fam_list = "Family 20"
        elif tomatch == "LOW-CPU, MID-GPU, LOW-RAM":
            fam_list = "Family 21"
        elif tomatch == "LOW-CPU, HIGH-GPU, HIGH-RAM":
            fam_list = "Family 22"
        elif tomatch == "LOW-CPU, HIGH-GPU, MID-RAM":
            fam_list = "Family 23"
        elif tomatch == "LOW-CPU, HIGH-GPU, LOW-RAM":
            fam_list = "Family 24"
        elif tomatch == "LOW-CPU, LOW-GPU, HIGH-RAM":
            fam_list = "Family 25"
        elif tomatch == "LOW-CPU, LOW-GPU, MID-RAM":
            fam_list = "Family 26"
        elif tomatch == "LOW-CPU, LOW-GPU, LOW-RAM":
            fam_list = "Family 27"
        else:
            fam_list = np.nan
    return fam_list

df['family_class'] = df['merged_cells'].apply(match_list)
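So, how can I turn this into smaller code where I can actually iterate over an array? I was thinking of just checking whether the value is in the array and going from there. But how can I make sure I don't create more than 27 families?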
Answer 0 (score: 1)
Make a dict outside the function that maps the test values to their families, and use it inside the function:
combo_to_family = {"HIGH-CPU, MID-GPU, HIGH-RAM": "Family 1",
                   "HIGH-CPU, MID-GPU, MID-RAM": "Family 2",
                   ...,
                   "LOW-CPU, LOW-GPU, LOW-RAM": "Family 27",
                   }
Inside the function, your loop simplifies to:
for row in result:
    fam_list = combo_to_family.get(tomatch, np.nan)
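(Note: you loop over result, but only use tomatch, which was passed in as the argument, and ignore the loop's row. I'm not sure what the goal is, but the logic needs a careful check.)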
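Since the loop never uses its own row, the lookup arguably does not need the loop or the CSV read at all. A minimal sketch under that assumption follows; the three-entry COMBO_TO_FAMILY is only an illustrative subset of the full 27-entry dict, and math.nan stands in for np.nan so the snippet runs without NumPy:

```python
import math

# Illustrative subset only (an assumption for this sketch);
# the real dict would hold all 27 combinations.
COMBO_TO_FAMILY = {
    "HIGH-CPU, MID-GPU, HIGH-RAM": "Family 1",
    "HIGH-CPU, MID-GPU, MID-RAM": "Family 2",
    "LOW-CPU, LOW-GPU, LOW-RAM": "Family 27",
}


def match_list(combo, mapping=COMBO_TO_FAMILY):
    """Return the family for a combination string, or NaN if it is unknown."""
    # dict.get returns the fallback when the key is missing,
    # which replaces the whole if/elif/else chain.
    return mapping.get(combo, math.nan)


known = match_list("HIGH-CPU, MID-GPU, HIGH-RAM")
unknown = match_list("ULTRA-CPU, MID-GPU, HIGH-RAM")
```

With pandas, df['merged_cells'].map(COMBO_TO_FAMILY) should give the same column without a per-row function, because Series.map with a dict yields NaN for values that are not keys of the dict.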
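Obviously, typing out the whole dict is a pain. If you're free to define the family numbering yourself, you can use itertools to generate the dict programmatically: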
from itertools import product
combo_to_family = {'{}-CPU, {}-GPU, {}-RAM'.format(c, g, r): 'Family {}'.format(i)
                   for i, (c, g, r) in enumerate(product(('LOW', 'MID', 'HIGH'), repeat=3), 1)}
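The family numbers won't line up with yours, but if the exact numbering doesn't matter and the families only need to be distinct, this generates all 27 combinations with little effort.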
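If the original numbering does matter, it looks reproducible by giving each component its own level order, read off the if/elif chain in the question: CPU and RAM run HIGH, MID, LOW, while GPU runs MID, HIGH, LOW. The sketch below rests on that reading; the names CPU_LEVELS, GPU_LEVELS and RAM_LEVELS are made up for illustration:

```python
from itertools import product

# Level orders inferred from the question's if/elif chain (an assumption).
CPU_LEVELS = ("HIGH", "MID", "LOW")
GPU_LEVELS = ("MID", "HIGH", "LOW")
RAM_LEVELS = ("HIGH", "MID", "LOW")

# product varies the rightmost component fastest, so RAM cycles first,
# then GPU, then CPU, matching the order of the original chain.
combo_to_family = {
    "{}-CPU, {}-GPU, {}-RAM".format(c, g, r): "Family {}".format(i)
    for i, (c, g, r) in enumerate(product(CPU_LEVELS, GPU_LEVELS, RAM_LEVELS), 1)
}

# The dict can never hold more families than there are level combinations.
expected_size = len(CPU_LEVELS) * len(GPU_LEVELS) * len(RAM_LEVELS)
assert len(combo_to_family) == expected_size
```

Because every key comes from product, the number of families is fixed by the level tuples, which also answers the worry about creating more than 27: adding a level or a component changes the count in one place only.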