Question

I have a sample DataFrame that looks like this:

 ID        Product    UPC    Units Sold 
 no link   cereal    3463    12
 2211      cereal    2211    13
 2211      cereal    8900    11
 2211      cereal    6754    14
 no link   cereal    9012    13 
 3340      cereal    3340    12
 3340      cereal    5436    15

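The 'ID' column identifies similar products by a product family ID. The ID is created from the first UPC number of that family. 'no link' marks products that are the only member of their family. What I want is for the 'no link' values to default to the UPC number of the same row. This is what I would like my output to look like: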

 ID        Product    UPC    Units Sold 
 3463      cereal    3463    12
 2211      cereal    2211    13
 2211      cereal    8900    11
 2211      cereal    6754    14
 9012      cereal    9012    13 
 3340      cereal    3340    12
 3340      cereal    5436    15

This is what I have so far:

 for row in product_families:
     if product_families.loc['Product Family Number'] == 'no link':

Answer 1

Use loc with boolean indexing and let pandas align the data on the index during assignment:

df.loc[df.ID.eq('no link'),'ID'] = df.UPC

Output:

     ID Product   UPC  Units Sold
0  3463  cereal  3463          12
1  2211  cereal  2211          13
2  2211  cereal  8900          11
3  2211  cereal  6754          14
4  9012  cereal  9012          13
5  3340  cereal  3340          12
6  3340  cereal  5436          15
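
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the sample data from the question, with ID and UPC both stored as strings so the resulting column has a single type; the real data may well hold UPC as integers, in which case the assignment still works but the ID column ends up with mixed types.

```python
import pandas as pd

# Sample data from the question; storing UPC as strings is an assumption.
df = pd.DataFrame({
    "ID": ["no link", "2211", "2211", "2211", "no link", "3340", "3340"],
    "Product": ["cereal"] * 7,
    "UPC": ["3463", "2211", "8900", "6754", "9012", "3340", "5436"],
    "Units Sold": [12, 13, 11, 14, 13, 12, 15],
})

# Boolean mask: True for rows whose ID is the placeholder 'no link'.
mask = df["ID"].eq("no link")

# Only the masked rows are written; pandas aligns df["UPC"] on the index,
# so each row receives the UPC value from that same row.
df.loc[mask, "ID"] = df["UPC"]

print(df)
```

Because the right-hand side is aligned on the index, there is no need to filter df["UPC"] with the same mask first.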

Answer 2

Scott Boston's solution should work.

It may also be worth trying a different approach: a row-wise apply.

df['ID']=df.apply(lambda x: x.UPC if x.ID=='no link' else x.ID, axis=1)
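
As a rough check, the row-wise version can be run on the sample data and compared against a vectorized equivalent. The sketch below assumes string-typed ID and UPC columns (not stated in the question) and uses Series.mask as the vectorized comparison, which is my choice for illustration rather than something from either answer.

```python
import pandas as pd

# Sample data from the question; string-typed UPC is an assumption.
df = pd.DataFrame({
    "ID": ["no link", "2211", "2211", "2211", "no link", "3340", "3340"],
    "Product": ["cereal"] * 7,
    "UPC": ["3463", "2211", "8900", "6754", "9012", "3340", "5436"],
    "Units Sold": [12, 13, 11, 14, 13, 12, 15],
})

# Row-wise: axis=1 passes each row to the lambda as a Series.
by_apply = df.apply(
    lambda row: row["UPC"] if row["ID"] == "no link" else row["ID"],
    axis=1,
)

# Vectorized: replace values where the condition holds with the UPC column.
by_mask = df["ID"].mask(df["ID"].eq("no link"), df["UPC"])

print(by_apply.equals(by_mask))
df["ID"] = by_apply
```

The apply version calls a Python function once per row, so on large frames the boolean-indexing or mask form is usually the faster option.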

How do I make some pandas column values default to the value in another column of the same row?
