I have a column of barcodes in a DataFrame, and I built a dictionary that maps each barcode to an item ID.
I am creating a new column:
df['item_id'] = df['bar_code']
The dictionary (built from a second DataFrame, imdb):
keys = (int(i) for i in imdb['bar_code'])
values = (int(i) for i in imdb['item_id'])
map_barcode = dict(zip(keys, values))
map_barcode (the first 5 entries, for example):
{0: 1000159, 9000000017515: 11, 7792690324216: 16, 7792690324209: 20, 70942503334: 33}
Then I map the item IDs through the dictionary:
df = df.replace({'item_id':map_barcode})
I expected this to put the item IDs in the column
(going back to the dictionary example):
df['item_id'][0] = 1000159
df['item_id'][1] = 11
df['item_id'][2] = 16
df['item_id'][3] = 20
df['item_id'][4] = 33
But I end up with this error:
Cannot compare types 'ndarray(dtype=int64)' and 'int64'
I tried changing the type of the dictionary entries to np.int64:
keys = (np.int64(i) for i in imdb['bar_code'])
values = (np.int64(i) for i in imdb['item_id'])
map_barcode = dict(zip(keys, values))
But I get the same error.
Is there something I am missing here?
Answer 0: (score: 3)
First, I cannot reproduce your error. This example works fine:
import pandas as pd
map_dict = {0: 1000159, 9000000017515: 11, 7792690324216: 16, 7792690324209: 20, 70942503334: 33}
df = pd.DataFrame({'item_id': [0, 7792690324216, 70942503334, 9000000017515, -1, 7792690324209]})
df = df.replace({'item_id': map_dict})
Result:
   item_id
0  1000159
1       16
2       33
3       11
4       -1
5       20
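Second, manually iterating over a pandas Series inside a generator expression is relatively expensive. In addition, replace is inefficient when mapping through a dictionary.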
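As a rough illustration of the point about replace, the sketch below maps the same sample data twice: once with replace and once with map followed by fillna. The names via_replace and via_map are introduced here for illustration only. The cast back to int64 is needed because map leaves NaN for unmatched keys, which turns the column into floats until the gaps are filled.

```python
import pandas as pd

# Sample data taken from the example in this answer.
map_dict = {0: 1000159, 9000000017515: 11, 7792690324216: 16,
            7792690324209: 20, 70942503334: 33}
df = pd.DataFrame({'item_id': [0, 7792690324216, 70942503334,
                               9000000017515, -1, 7792690324209]})

# Dictionary-based replace: keys found in the column are swapped for their values.
via_replace = df['item_id'].replace(map_dict)

# map returns NaN for keys missing from the dictionary (here, -1),
# so fillna restores the original value before casting back to integers.
via_map = df['item_id'].map(map_dict).fillna(df['item_id']).astype('int64')

print(via_map.equals(via_replace))
```

Both routes should give the same column on this data; which one is faster on a large frame is worth timing on your own data rather than taking on faith.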
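In fact, there is no need to create a dictionary at all. pandas has Series-based methods optimized for this kind of task, namely map combined with fillna.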
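A minimal sketch of that dictionary-free approach follows. The two small frames are hypothetical stand-ins for df and imdb from the question, and lookup is a name introduced here. It assumes each bar_code appears only once in imdb, since mapping through a Series needs a unique index.

```python
import pandas as pd

# Hypothetical stand-ins for the two frames described in the question.
imdb = pd.DataFrame({
    'bar_code': [0, 9000000017515, 7792690324216, 7792690324209, 70942503334],
    'item_id': [1000159, 11, 16, 20, 33],
})
df = pd.DataFrame({
    'bar_code': [0, 7792690324216, 70942503334, 9000000017515, -1, 7792690324209],
})

# A Series indexed by bar_code acts as the lookup table, so no dict is built.
# Assumption: bar_code values are unique in imdb.
lookup = imdb.set_index('bar_code')['item_id']

# Barcodes without a match come back as NaN; fillna keeps the original barcode
# for those rows, and astype restores the integer dtype.
df['item_id'] = df['bar_code'].map(lookup).fillna(df['bar_code']).astype('int64')

print(df)
```

Whether unmatched barcodes should keep their original value, as here, or be flagged in some other way depends on what the column is used for downstream.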
See also: