I have:
public List<string> listGhe(string ten, int movie_id)
{
    slotList.Clear();
    dbConnection.Open();
    try
    {
        // Seat numbers (Ghe_num) booked by the given customer for the given movie.
        // Parameters replace string concatenation to avoid SQL injection.
        string sql = "SELECT b.Ghe_num FROM Customers c " +
                     "INNER JOIN bookings b ON b.cus_id = c.id AND movie_id = @movieId " +
                     "WHERE c.`name` = @name";
        using (MySqlCommand cmd = new MySqlCommand(sql, dbConnection))
        {
            cmd.Parameters.AddWithValue("@movieId", movie_id);
            cmd.Parameters.AddWithValue("@name", ten);
            using (DbDataReader reader = cmd.ExecuteReader())
            {
                int gheNumIndex = reader.GetOrdinal("Ghe_num");
                while (reader.Read())
                {
                    slotList.Add(reader.GetString(gheNumIndex));
                }
            }
        }
    }
    catch (Exception ex)
    {
        // An empty catch would silently swallow errors; at least report them.
        Console.WriteLine(ex.Message);
    }
    finally
    {
        dbConnection.Close();
        dbConnection.Dispose();
        //dbConnection = null;
    }
    return slotList;
}
and df1 like this:
material   plant   Order
24990      89952   4568789,5098710
24990      89952   9448609,1007081
166621     3062    18364103
166621     3062    78309139
240758     3062    55146035
276009     3062    38501581,857542
I want to iterate over the Order values in df1 and, wherever an Order has a match in df2, take the mean of m1 through m5. df2 looks like this:
material   plant   Order      m1     m2     m3   m4   m5
24990      89952   4568789    0.123  0.214  0.0  0.0  0.0
24990      89952   5098710    1.000  0.363  0.0  0.0  0.0
24990      89952   9448609    0.0    0.345  0.0  1.0  0.0
24990      89952   1007081    0.0    0.756  0.0  1.0  0.0
166621     3062    18364103   0.0    0.0    0.0  0.0  0.0
166621     3062    78309139   0.0    1.0    0.0  0.0  0.0
240758     3062    55146035   1.0    1.0    1.0  0.0  0.0
276009     3062    38501581   1.0    1.0    1.0  0.0  0.0
276009     3062    38575428   1.0    1.0    1.0  0.0  0.0
I am trying different ways to get a result like the following:
material   plant   Order             avg_m1   avg_m2   avg_m3   avg_m4   avg_m5
24990      89952   4568789,5098710   0.5615   0.2885   0.0      0.0      0.0
24990      89952   9448609,1007081
166621     3062    18364103
166621     3062    78309139
240758     3062    55146035
276009     3062    38501581,857542
Second, in code:
df2 = (df.groupby(df1, sort=False)['Order'].apply(lambda x: ','.split(x.astype(str)))
         .mean()
         .reset_index()
         .reindex(columns=df.columns))
print (df2)
But I am not sure whether this is the right approach.
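For reference (not part of the original question), the two tables above could be built as pandas DataFrames roughly like this, following the question's naming: df1 holds the comma-separated Orders, df2 holds one row per Order with the m1..m5 values. Dtypes are assumed.

import pandas as pd

# df1: comma-separated Orders per material/plant (Order kept as a string)
df1 = pd.DataFrame({
    'material': [24990, 24990, 166621, 166621, 240758, 276009],
    'plant': [89952, 89952, 3062, 3062, 3062, 3062],
    'Order': ['4568789,5098710', '9448609,1007081', '18364103',
              '78309139', '55146035', '38501581,857542'],
})

# df2: one row per Order with the values to average
df2 = pd.DataFrame({
    'material': [24990, 24990, 24990, 24990, 166621, 166621, 240758, 276009, 276009],
    'plant': [89952, 89952, 89952, 89952, 3062, 3062, 3062, 3062, 3062],
    'Order': [4568789, 5098710, 9448609, 1007081, 18364103, 78309139,
              55146035, 38501581, 38575428],
    'm1': [0.123, 1.000, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    'm2': [0.214, 0.363, 0.345, 0.756, 0.0, 1.0, 1.0, 1.0, 1.0],
    'm3': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    'm4': [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    'm5': [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
})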
Answer 0 (score: 2)
Here is one way, using numpy and a mapping dictionary. (Note that, relative to the tables in the question, this answer swaps the names: here df1 is the per-Order table with the m columns and df2 holds the comma-separated Orders.)
import numpy as np
import pandas as pd

# map each Order to its m1..m5 values with a dictionary
mapper = dict(zip(df1['Order'], df1[['m'+str(i) for i in range(1, 6)]].values))

# map comma-separated Order strings to lists of integers
df2_orders = [list(map(int, i)) for i in df2['Order'].str.split(',')]

# calculate the mean over each group of Orders (missing Orders default to zeros)
res = [np.mean([mapper.get(o, [0]*5) for o in order], axis=0).tolist()
       for order in df2_orders]

# join the results to the dataframe
df2 = df2.join(pd.DataFrame(res, columns=['avg_m'+str(i) for i in range(1, 6)]))
Note that when data is missing (e.g. order 857542), you can choose the value to substitute; here I use 0.
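A small variation, not from the original answer: if missing Orders should be skipped rather than counted as zeros, NaN can be substituted and np.nanmean used instead of np.mean:

import numpy as np

# Substitute NaN for missing Orders so they are ignored by the average;
# np.nanmean warns and returns NaN if every Order in a group is missing.
res = [np.nanmean([mapper.get(o, [np.nan] * 5) for o in order], axis=0).tolist()
       for order in df2_orders]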
Result:
   material  plant            Order  avg_m1  avg_m2  avg_m3  avg_m4  avg_m5
0     24990  89952  4568789,5098710  0.5615  0.2885     0.0     0.0     0.0
1     24990  89952  9448609,1007081  0.0000  0.5505     0.0     1.0     0.0
2    166621   3062         18364103  0.0000  0.0000     0.0     0.0     0.0
3    166621   3062         78309139  0.0000  1.0000     0.0     0.0     0.0
4    240758   3062         55146035  1.0000  1.0000     1.0     0.0     0.0
5    276009   3062  38501581,857542  0.5000  0.5000     0.5     0.0     0.0
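As a quick sanity check on the first row: Order 4568789,5098710 maps to m1 values 0.123 and 1.000 in the per-Order table, and (0.123 + 1.000) / 2 = 0.5615, which matches avg_m1 above; likewise avg_m2 is (0.214 + 0.363) / 2 = 0.2885.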
Answer 1 (score: 1)
You can use:
df = (df1.join(df1.set_index(['material','plant'], append=True)['Order']
                  .str.split(',', expand=True)
                  .stack()
                  .astype(int)
                  .reset_index(name='Order')
                  .merge(df2, on=['material','plant','Order'], how='left')
                  .drop(['material','plant','Order','level_3'], axis=1)
                  .groupby('level_0')
                  .mean())
     )
print (df)
   material  plant            Order      m1      m2   m3   m4   m5
0     24990  89952  4568789,5098710  0.5615  0.2885  0.0  0.0  0.0
1     24990  89952  9448609,1007081  0.0000  0.5505  0.0  1.0  0.0
2    166621   3062         18364103  0.0000  0.0000  0.0  0.0  0.0
3    166621   3062         78309139  0.0000  1.0000  0.0  0.0  0.0
4    240758   3062         55146035  1.0000  1.0000  1.0  0.0  0.0
5    276009   3062  38501581,857542  1.0000  1.0000  1.0  0.0  0.0
Explanation:
- split the comma-separated Order values and stack them so each Order gets its own row,
- merge with the second DataFrame using a left join,
- drop the helper columns,
- aggregate with mean per original row,
- join the result back to the first DataFrame.
Detail:
df3 = (df1.set_index(['material','plant'], append=True)['Order']
          .str.split(',', expand=True)
          .stack()
          .astype(int)
          .reset_index(name='Order')
          .merge(df2, on=['material','plant','Order'], how='left'))
print (df3)
   level_0  material  plant  level_3     Order     m1     m2   m3   m4   m5
0        0     24990  89952        0   4568789  0.123  0.214  0.0  0.0  0.0
1        0     24990  89952        1   5098710  1.000  0.363  0.0  0.0  0.0
2        1     24990  89952        0   9448609  0.000  0.345  0.0  1.0  0.0
3        1     24990  89952        1   1007081  0.000  0.756  0.0  1.0  0.0
4        2    166621   3062        0  18364103  0.000  0.000  0.0  0.0  0.0
5        3    166621   3062        0  78309139  0.000  1.000  0.0  0.0  0.0
6        4    240758   3062        0  55146035  1.000  1.000  1.0  0.0  0.0
7        5    276009   3062        0  38501581  1.000  1.000  1.0  0.0  0.0
8        5    276009   3062        1    857542    NaN    NaN  NaN  NaN  NaN
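A side note, not from either answer: on pandas 0.25 or newer, DataFrame.explode can replace the split/stack reshaping step. A rough sketch, assuming the question's df1/df2 naming:

import pandas as pd

# One row per Order via explode, then merge the m1..m5 values and
# average them back onto each original df1 row.
m_cols = ['m' + str(i) for i in range(1, 6)]
tmp = (df1.assign(Order=df1['Order'].str.split(','))
          .explode('Order')
          .reset_index())               # keep the original df1 row id
tmp['Order'] = tmp['Order'].astype(int)
means = (tmp.merge(df2, on=['material', 'plant', 'Order'], how='left')
            .groupby('index')[m_cols].mean())
result = df1.join(means)
print(result)

Missing Orders (such as 857542) become NaN after the left merge and are skipped by mean, matching the behaviour of Answer 1; filling them with 0 before the groupby would reproduce Answer 0's result instead.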