Passing multiple dictionary observations to a function?

Time: 2019-06-05 18:45:19

Tags: python pandas dataframe

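How can I pass multiple dictionary observations (rows) to a function for model prediction?

This is what I have so far... It accepts 1 dictionary row as input and returns the prediction plus the probabilities, but it fails as soon as I add more dictionaries.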

import json

import numpy as np
import pandas as pd

# `model` is a fitted classifier defined elsewhere

# func
def preds(dict):
    df = pd.DataFrame([dict])
    result = model.predict(df)
    result = np.where(result==0,"CLASS_0","CLASS_1").astype('str')
    probas_c0 = model.predict_proba(df)[0][0]
    probas_c1 = model.predict_proba(df)[0][1]
    data={"prediction": result[0],
                      "CLASS_0_PROB": probas_c0,
                      "CLASS_1_PROB": probas_c1}
    data = {"parameters": [data]}
    j = json.dumps(data)
    j = json.loads(j)
    return j

# call func
preds({"feature0": "value",
  "feature1": "value",
  "feature2": "value"})

# result
{'parameters': [{'prediction': 'CLASS_0',
   'CLASS_0_PROB': 0.9556067383610446,
   'CLASS_1_PROB': 0.0443932616389555}]}
# Tried with more than 1 row but it fails with arguments error
{'parameters': [{'prediction': 'CLASS_0',
   'CLASS_0_PROB': 0.9556067383610446,
   'CLASS_1_PROB': 0.0443932616389555},
 {'parameters': [{'prediction': 'CLASS_0',
   'CLASS_0_PROB': 0.9556067383610446,
   'CLASS_1_PROB': 0.0443932616389555}]}

TypeError: preds() takes 1 positional argument but 2 were given

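New update

The end user's source data will most likely be a DataFrame, so I want to convert it to the [{...},{...}] format so that it can be fed into the preds() function at this point: df = pd.DataFrame([rows])

This is what I have tried so far...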

rows = [
{"c1": "value1",
  "c2": "value2",
  "c3": 0,
},
{"c1": "value1",
  "c2": "value2",
  "c3": 0}
]

df = pd.DataFrame(rows)
json_rows = df.to_json(orient='records',  lines=True)
l = [json_rows]
preds(l)

KeyError: "None of [['c1', 'c2', 'c3']] are in the [columns]"

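1 answer:

Answer 0: (score: 2)

Updated

OK, based on your comments, what you need is for the DataFrame to take all the rows. You can then use one of the following approaches.

Using *args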

def preds(*args):
    # args is a tuple, so convert it to a list
    dict_rows = list(args)
    df = pd.DataFrame(dict_rows)
    result = model.predict(df)
    ...

# when calling the function, unpack the list of rows
preds(*rows)
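
To make the unpacking step concrete, here is a small sketch in plain Python. `collect_rows` is a hypothetical helper (not part of the question or of pandas) that only shows what `*args` receives; the `rows` values are placeholders.

```python
def collect_rows(*args):
    # Hypothetical helper: *args arrives as a tuple of the positional arguments
    return list(args)


rows = [
    {"c1": "value1", "c2": "value2", "c3": 0},
    {"c1": "value1", "c2": "value2", "c3": 0},
]

unpacked = collect_rows(*rows)  # same as collect_rows(rows[0], rows[1])
single = collect_rows(rows[0])  # a single dict still works
nested = collect_rows(rows)     # without the *, the whole list is ONE argument
```

Here `unpacked` is a flat list of dicts, which is what `pd.DataFrame` expects, while `nested` is a list wrapped in another list, so forgetting the `*` at the call site quietly produces the wrong shape.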

Checking the element beforehand

def preds(dict_rows):
    # check whether dict_rows is a single dict or a list of dicts
    if isinstance(dict_rows, dict):
        dict_rows = [dict_rows]
    df = pd.DataFrame(dict_rows)
    result = model.predict(df)
    ...

# call it with either a single dict or a list of dicts
preds(rows)

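Note that the call is pd.DataFrame(dict_rows), not pd.DataFrame([dict_rows]): the list of dicts is passed as it is. Wrapping it in another list produces a single row whose cells are the dicts themselves, so the feature columns are missing.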
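
For reference, a minimal end-to-end sketch of the second approach, returning one entry per row. `StubModel` and its rule on `c3` are hypothetical stand-ins for the fitted `model` from the question, included only so the example runs; the output keys follow the question's format.

```python
import numpy as np
import pandas as pd


class StubModel:
    """Hypothetical stand-in for a fitted binary classifier."""

    def predict_proba(self, df):
        # Made-up rule: the class-1 probability depends only on column "c3"
        p1 = np.where(df["c3"].to_numpy() > 0, 0.9, 0.1)
        return np.column_stack([1.0 - p1, p1])

    def predict(self, df):
        return self.predict_proba(df).argmax(axis=1)


model = StubModel()


def preds(dict_rows):
    # Accept a single dict or a list of dicts
    if isinstance(dict_rows, dict):
        dict_rows = [dict_rows]
    df = pd.DataFrame(dict_rows)
    labels = np.where(model.predict(df) == 0, "CLASS_0", "CLASS_1")
    probas = model.predict_proba(df)  # computed once, shape (n_rows, 2)
    return {
        "parameters": [
            {
                "prediction": str(label),
                "CLASS_0_PROB": float(p0),
                "CLASS_1_PROB": float(p1),
            }
            for label, (p0, p1) in zip(labels, probas)
        ]
    }


rows = [
    {"c1": "value1", "c2": "value2", "c3": 0},
    {"c1": "value1", "c2": "value2", "c3": 1},
]
out = preds(rows)

# If the source data is already a DataFrame, to_dict(orient="records")
# gives the [{...}, {...}] list directly, without going through JSON text.
source_df = pd.DataFrame(rows)
out_from_df = preds(source_df.to_dict(orient="records"))
```

Because the values are converted to plain `str` and `float`, the returned dict can be passed to `json.dumps` if a JSON string is needed. The attempt in the question with `to_json(orient='records', lines=True)` returns one string rather than a list of dicts, which is why the resulting frame had none of the expected columns.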

Old answer

If preds() cannot handle multiple rows, you can call it once per row:

pred_rows = [
    {"feature0": "value", "feature1": "value", "feature2": "value"},
    {"feature3": "value", "feature4": "value", "feature5": "value"},
]
# List comprehension: one preds() call per row
result = [preds(row) for row in pred_rows]
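
With this approach `result` is a list of separate `{'parameters': [...]}` dicts, one per row. If a single combined dict is preferred, the per-row lists can be flattened. A rough sketch, where `preds_one` is a hypothetical stand-in that only mimics the return shape of the question's single-row function:

```python
def preds_one(row):
    # Hypothetical stand-in: returns the same structure as the question's preds()
    return {"parameters": [{"prediction": "CLASS_0", "n_features": len(row)}]}


pred_rows = [
    {"feature0": "value", "feature1": "value", "feature2": "value"},
    {"feature0": "value", "feature1": "value"},
]

# One call per row, as in the list comprehension above
per_row = [preds_one(row) for row in pred_rows]

# Flatten the per-row "parameters" lists into one combined result
combined = {
    "parameters": [item for res in per_row for item in res["parameters"]]
}
```

Note that this calls the model once per row, which is simple but likely slower than a single vectorised call on a DataFrame holding all rows.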

PS: Also, don't use dict as a variable name. It is a built-in mapping type, the constructor/class of dictionaries.
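
A short illustration of why the name matters. The function names here are made up for the example; the point is only that a parameter called `dict` hides the built-in inside the function body.

```python
def shadowed(dict):
    # Inside this function, `dict` is the argument, not the built-in type
    try:
        return dict(a=1)
    except TypeError as exc:
        return f"TypeError: {exc}"


def renamed(row):
    # With a different parameter name, the built-in `dict` is still available
    return dict(row, extra=1)


message = shadowed({"feature0": "value"})
merged = renamed({"feature0": "value"})
```

In `shadowed`, the call fails because a dictionary instance is not callable, whereas `renamed` can still use `dict(...)` to build a new dictionary.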