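I have a dataframe of model IDs and associated values. The columns are date, client_id, model_id, category1, category2, color, and price. I have a simple Flask app where a user can select a model ID and add it to their "purchase" history. Based on the model ID, I want to add a row to the dataframe and bring in the associated values of category1, category2, color, and price. What is the best way to do this with Pandas? I know that in Excel I would use VLOOKUP, but I'm not sure how to do it in Python. Assume that category1, category2, color, and price are unique for each model ID.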
client_id = input("ENTER Client ID: ")
model_id = input("ENTER Model ID: ")
def update_history(df, client_id, model_id):
    today = pd.to_datetime('today')
    # 'tmp' is a placeholder; these values need to be looked up from the original dataframe by model_id
    # one value per column: date, client_id, model_id, category1, category2, color, price
    df.loc[len(df)] = [today, client_id, model_id, 'tmp', 'tmp', 'tmp', 'tmp']
    return df
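Answer 0 (score: 0)
The code below adds a new row with new values to an existing dataframe. The new values are passed to the function as arguments.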
Import the libraries
import pandas as pd
import numpy as np
import datetime
Create a sample dataframe
model_id = ['M1', 'M2', 'M3']
today = ['2018-01-01', '2018-01-02', '2018-01-01']
client_id = ['C1', 'C2', 'C3']
category1 = ['orange', 'apple', 'beans']
category2 = ['fruit', 'fruit', 'grains']
df = pd.DataFrame({'today': today, 'model_id': model_id, 'client_id': client_id,
                   'category1': category1, 'category2': category2})
df['today'] = pd.to_datetime(df['today'])
df
Function (call it to append a row with the new values to the existing dataframe)
def update_history(df, client_id, model_id, category1, category2):
    today = pd.to_datetime('today')
    # Create a temp dataframe with the new values.
    # Column names in this dataframe should match the existing dataframe.
    temp = pd.DataFrame({'today': [today], 'model_id': [model_id], 'client_id': [client_id],
                         'category1': [category1], 'category2': [category2]})
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
    df = pd.concat([df, temp], ignore_index=True)
    return df
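The call to the function is not shown in this answer. As a rough sketch only, a call might look like the following; the client ID 'C4' and the category values passed in are made-up illustration values, and the sample frame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Sample frame with the same columns as the example in this answer
df = pd.DataFrame({
    'today': pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-01']),
    'model_id': ['M1', 'M2', 'M3'],
    'client_id': ['C1', 'C2', 'C3'],
    'category1': ['orange', 'apple', 'beans'],
    'category2': ['fruit', 'fruit', 'grains'],
})

def update_history(df, client_id, model_id, category1, category2):
    # Build a one-row frame whose column names match the existing frame
    temp = pd.DataFrame({'today': [pd.to_datetime('today')],
                         'model_id': [model_id], 'client_id': [client_id],
                         'category1': [category1], 'category2': [category2]})
    return pd.concat([df, temp], ignore_index=True)

# Hypothetical new purchase: client 'C4' buys model 'M2'
df = update_history(df, 'C4', 'M2', 'apple', 'fruit')
print(df.tail(1))
```

Note that this version still expects the caller to supply the category values; it does not look them up from model_id.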
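Answer 1 (score: 0)
You can try this. If you want to add several rows at once, it is faster to collect the dictionaries in a list and then add them to the dataframe in a single operation.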
modelid = ['MOD1', 'MOD2', 'MOD3']
today = ['2018-07-15', '2018-07-18', '2018-07-20']
clients = ['CLA', 'CLA', 'CLB']
cat_1 = ['CAT1', 'CAT2', 'CAT3']
cat_2 = ['CAT11', 'CAT12', 'CAT13']
mdf = pd.DataFrame({"model_id": modelid, "today": today, "client_id": clients, "cat_1":cat_1, "cat_2":cat_2})
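def update_history(df, client_id, model_id):
    today = pd.to_datetime('today')
    # Look up the first existing row for this model_id (the VLOOKUP step)
    row = df[df.model_id == model_id].iloc[0]
    rows_list = []
    # named new_row rather than dict, to avoid shadowing the built-in
    new_row = {"today": today, "client_id": client_id,
               "model_id": model_id, "cat_1": row["cat_1"],
               "cat_2": row["cat_2"]}
    rows_list.append(new_row)
    df2 = pd.DataFrame(rows_list)
    # DataFrame.append was removed in pandas 2.0; use pd.concat instead
    df = pd.concat([df, df2], ignore_index=True)
    return df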
mdf = update_history(mdf,"CLC","MOD1")
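The point about adding several rows at once could be sketched roughly as below. The function name `update_history_batch` and its `purchases` argument (a list of `(client_id, model_id)` pairs) are assumptions made for illustration, not part of the answer above; the idea is simply to build all the row dictionaries first and concatenate once.

```python
import pandas as pd

mdf = pd.DataFrame({
    "model_id": ["MOD1", "MOD2", "MOD3"],
    "today": pd.to_datetime(["2018-07-15", "2018-07-18", "2018-07-20"]),
    "client_id": ["CLA", "CLA", "CLB"],
    "cat_1": ["CAT1", "CAT2", "CAT3"],
    "cat_2": ["CAT11", "CAT12", "CAT13"],
})

def update_history_batch(df, purchases):
    """Append one row per (client_id, model_id) pair using a single concat."""
    today = pd.to_datetime("today")
    # One row per model_id, indexed by model_id, to serve as the lookup table
    lookup = df.drop_duplicates("model_id").set_index("model_id")
    rows_list = []
    for client_id, model_id in purchases:
        attrs = lookup.loc[model_id]  # raises KeyError for an unknown model_id
        rows_list.append({"model_id": model_id, "today": today,
                          "client_id": client_id,
                          "cat_1": attrs["cat_1"], "cat_2": attrs["cat_2"]})
    # Build the new rows as one frame and concatenate once
    return pd.concat([df, pd.DataFrame(rows_list)], ignore_index=True)

mdf = update_history_batch(mdf, [("CLC", "MOD1"), ("CLD", "MOD3")])
```

Concatenating once avoids copying the whole dataframe for every added row, which is what makes the list-of-dictionaries approach faster for larger batches.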
Answer 2 (score: 0)
This is what I ended up doing. I still think there is a more elegant solution, so please let me know!
#create dataframe
modelid = ['MOD1', 'MOD2', 'MOD3']
today = ['2018-07-15', '2018-07-18', '2018-07-20']
clients = ['CLA', 'CLA', 'CLB']
cat_1 = ['CAT1', 'CAT2', 'CAT3']
cat_2 = ['CAT11', 'CAT12', 'CAT13']
mdf = pd.DataFrame({"model_id": modelid, "today": today, "client_id": clients, "cat_1":cat_1, "cat_2":cat_2})
#reorder columns
mdf = mdf[['cat_1', 'cat_2', 'model_id', 'client_id', 'today']]
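#create lookup table (one row per distinct cat_1, cat_2, model_id combination)
#drop_duplicates is chained here rather than called with inplace=True on a slice of mdf
lookup = mdf[['cat_1', 'cat_2', 'model_id']].drop_duplicates()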
#get values
client_id = input("ENTER Client ID: ")
model_id = input("ENTER Model ID: ")
#append model id to list
model_id_lst=[]
model_id_lst.append(model_id)
today=pd.to_datetime('today')
#grab associated cat_1, and cat_2 from lookup table
temp=lookup[lookup['model_id'].isin(model_id_lst)]
out=temp.values.tolist()
out[0].extend([client_id, today])
#add this as a row to the df
mdf.loc[len(mdf)]=out[0]
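One possibly tidier alternative, offered only as a sketch, is to keep the per-model attributes in a separate table and join them in with `DataFrame.merge`, which plays the role of VLOOKUP. The names `catalog`, `history`, and `add_purchase` below are hypothetical, and the split into two tables is an assumption about how the data could be organized, not something stated in the question.

```python
import pandas as pd

# Hypothetical catalog: one row per model_id with its fixed attributes
catalog = pd.DataFrame({
    "model_id": ["MOD1", "MOD2", "MOD3"],
    "cat_1": ["CAT1", "CAT2", "CAT3"],
    "cat_2": ["CAT11", "CAT12", "CAT13"],
})

# Purchase history stores only what is specific to each purchase
history = pd.DataFrame({
    "today": pd.to_datetime(["2018-07-15", "2018-07-18", "2018-07-20"]),
    "client_id": ["CLA", "CLA", "CLB"],
    "model_id": ["MOD1", "MOD2", "MOD3"],
})

def add_purchase(history, client_id, model_id):
    """Append a purchase row; the attributes are joined in later."""
    new_row = pd.DataFrame({"today": [pd.to_datetime("today")],
                            "client_id": [client_id],
                            "model_id": [model_id]})
    return pd.concat([history, new_row], ignore_index=True)

history = add_purchase(history, "CLC", "MOD1")

# Left join on model_id pulls cat_1 and cat_2 in for every row;
# validate="many_to_one" raises if catalog contains duplicate model_ids
full = history.merge(catalog, on="model_id", how="left", validate="many_to_one")
print(full)
```

With this layout the lookup happens for all rows in one step, and an unknown model_id shows up as missing values in cat_1 and cat_2 instead of raising an error.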