Associating column names with data from other DataFrames

Time: 2020-05-29 19:51:37

Tags: python pandas

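So, I have three DataFrames:

  1. One holds the sales of each item (rows) for each day (columns).
  2. One has a column listing every day and a column with the week that day belongs to.
  3. One has a column with every item, a column with the week, and the item's price for that week.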

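What I want is to modify the first table so that, instead of the number of units sold, it shows that number multiplied by the item's price on that day. That is (pseudocode):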

for all items (rows) and days (columns) in table1:

   table1(item, day) = table1(item, day) * table3.prices(item, table2.weeks(day))

I would like to avoid loops and use pandas operations.

Edit:

Table 1 looks like this:

+-------+------+------+------+
|  item | day1 | day2 | day3 |
+-------+------+------+------+
| item1 |   0  |   2  |   3  |
+-------+------+------+------+
| item2 |   1  |   5  |   3  |
+-------+------+------+------+
| item3 |  12  |   7  |   8  |
+-------+------+------+------+

Table 2:

+------+-------+
|  day |  week |
+------+-------+
| day1 | week1 |
+------+-------+
| day2 | week2 |
+------+-------+
| day3 | week2 |
+------+-------+

Table 3:

+-------+-------+-------+
|  item |  week | price |
+-------+-------+-------+
| item1 | week1 |   3   |
+-------+-------+-------+
| item1 | week2 |   4   |
+-------+-------+-------+
| item2 | week1 |   7   |
+-------+-------+-------+
| item2 | week2 |   9   |
+-------+-------+-------+
| item3 | week1 |   2   |
+-------+-------+-------+
| item3 | week2 |   3   |
+-------+-------+-------+

So the expected output would be (if I calculated correctly):

+-------+------+------+------+
|  item | day1 | day2 | day3 |
+-------+------+------+------+
| item1 |   0  |   8  |  12  |
+-------+------+------+------+
| item2 |   7  |  45  |  27  |
+-------+------+------+------+
| item3 |  24  |  21  |  24  |
+-------+------+------+------+
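
As a reference check only (not the loop-free solution being asked for), the pseudocode can be spelled out as a plain loop over the sample data. This is a minimal sketch: the names tbl1, tbl2, tbl3, week_of and price_of are illustrative, and table 3 is entered with one price per item per week (week1 and week2 for every item), which is what the expected output implies.

```python
import pandas as pd

# Sample data from the three tables (names are illustrative).
tbl1 = pd.DataFrame({
    "item": ["item1", "item2", "item3"],
    "day1": [0, 1, 12],
    "day2": [2, 5, 7],
    "day3": [3, 3, 8],
})
tbl2 = pd.DataFrame({
    "day": ["day1", "day2", "day3"],
    "week": ["week1", "week2", "week2"],
})
# Assumption: exactly one price per (item, week) pair.
tbl3 = pd.DataFrame({
    "item": ["item1", "item1", "item2", "item2", "item3", "item3"],
    "week": ["week1", "week2", "week1", "week2", "week1", "week2"],
    "price": [3, 4, 7, 9, 2, 3],
})

# Plain lookup tables: day -> week and (item, week) -> price.
week_of = dict(zip(tbl2["day"], tbl2["week"]))
price_of = {
    (item, week): price
    for item, week, price in zip(tbl3["item"], tbl3["week"], tbl3["price"])
}

# Direct transcription of the pseudocode: units sold times that day's price.
expected = tbl1.copy()
for row in expected.index:
    item = expected.at[row, "item"]
    for day, week in week_of.items():
        expected.at[row, day] = expected.at[row, day] * price_of[(item, week)]

print(expected)
```

The loop is slow on large tables, but it gives a baseline to compare any vectorized version against.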

1 answer:

Answer 0 (score: 2)

I would do a double merge:

# stack the first table so each row is one (item, day) pair
s = tbl1.melt('item', value_name='unit', var_name='day')

# attach a day to every weekly price, then join on (item, day)
(s.merge(tbl3.merge(tbl2, on='week', how='inner'),
         on=['item', 'day'], how='outer')
  .assign(total=lambda x: x['unit'] * x['price'])
  .pivot(index='item', columns='day', values='total')
)

Output:

day    day1  day2  day3
item                   
item1     0     8    12
item2     7    45    27
item3    24    21    24
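
For reference, here is a self-contained sketch of the double merge on the sample data. It assumes the frames are named tbl1, tbl2 and tbl3 as in the snippet above, and that table 3 holds exactly one price per item per week (week1 and week2 for every item), which is what the expected output implies. The intermediate name prices and the validate argument are additions for illustration.

```python
import pandas as pd

tbl1 = pd.DataFrame({
    "item": ["item1", "item2", "item3"],
    "day1": [0, 1, 12],
    "day2": [2, 5, 7],
    "day3": [3, 3, 8],
})
tbl2 = pd.DataFrame({
    "day": ["day1", "day2", "day3"],
    "week": ["week1", "week2", "week2"],
})
# Assumption: one price per (item, week) pair.
tbl3 = pd.DataFrame({
    "item": ["item1", "item1", "item2", "item2", "item3", "item3"],
    "week": ["week1", "week2", "week1", "week2", "week1", "week2"],
    "price": [3, 4, 7, 9, 2, 3],
})

# Long format: one row per (item, day) with the units sold.
s = tbl1.melt(id_vars="item", value_name="unit", var_name="day")

# First merge: expand weekly prices to one row per (item, day).
prices = tbl3.merge(tbl2, on="week", how="inner")

# Second merge: attach the price to each sale, then reshape back to wide.
# validate="one_to_one" makes the merge fail early if (item, day) is duplicated.
result = (
    s.merge(prices, on=["item", "day"], how="outer", validate="one_to_one")
     .assign(total=lambda x: x["unit"] * x["price"])
     .pivot(index="item", columns="day", values="total")
)

print(result)
```

Note that pivot raises a ValueError when an (item, day) pair occurs more than once, so duplicate (item, week) rows in the price table have to be resolved before this step.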

Update

Another approach is to build a table of the price per item per day and multiply:

# one price per (item, day), reshaped to items x days
daily = (tbl3.merge(tbl2, on='week', how='inner')
             .set_index(['item', 'day'])['price']
             .unstack()
)

# label-aligned elementwise product
output = tbl1.set_index('item') * daily

Output:

       day1  day2  day3
item                   
item1     0     8    12
item2     7    45    27
item3    24    21    24
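
A self-contained sketch of this second approach on the sample data follows. As before, it assumes the frame names tbl1, tbl2 and tbl3, and a price table with exactly one row per item per week; unstack would fail on duplicated (item, day) index entries.

```python
import pandas as pd

tbl1 = pd.DataFrame({
    "item": ["item1", "item2", "item3"],
    "day1": [0, 1, 12],
    "day2": [2, 5, 7],
    "day3": [3, 3, 8],
})
tbl2 = pd.DataFrame({
    "day": ["day1", "day2", "day3"],
    "week": ["week1", "week2", "week2"],
})
# Assumption: one price per (item, week) pair.
tbl3 = pd.DataFrame({
    "item": ["item1", "item1", "item2", "item2", "item3", "item3"],
    "week": ["week1", "week2", "week1", "week2", "week1", "week2"],
    "price": [3, 4, 7, 9, 2, 3],
})

# Price of every item on every day, laid out like tbl1 (items x days).
daily = (
    tbl3.merge(tbl2, on="week", how="inner")
        .set_index(["item", "day"])["price"]
        .unstack()
)

# DataFrame multiplication aligns on row and column labels,
# so the order of items and days in the two frames does not matter.
output = tbl1.set_index("item") * daily

print(output)
```

Because the product is aligned on labels, any item or day that is present in one frame but missing from the other shows up as NaN in the result rather than raising an error.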