I have two huge DataFrames. The first one is called limdata:
SACC_ID OPPLINE_LINE_ID OPP_CREATION_DATE
0 001A000000qqefQIAQ a0W1200000F5TWOEA3 2015-01-09
1 001A000000siuo7IAA a0W1200000JEmTdEAL 2017-01-02
2 001A000000qqCDcIAM a0W1200000H3FYTEA3 2016-01-15
3 001A0000014MJgpIAG a0W1200000F5TW9EAN 2015-01-09
4 001A000000ZdyuMIAR a0W1200000H11lHEAR 2015-12-10
5 001A000000aOmo4IAC a0W1200000H11n3EAB 2015-12-10
6 001A000000v6diCIAQ a0W1200000HkwfzEAB 2016-05-02
.....
151185 001A000000skyIMIAY a0WA000000EMTouMAH 2014-09-12
And the second DataFrame, called hist:
SACC_PS CASE_ID CREATION_DATE
0 0011200001K64ncAAB 5001200000eXVMvAAO 2017-01-25 05:00:07
1 001A000000iUrwSIAS 5001200000eX7FMAA0 2017-01-25 05:06:38
2 001A0000011lNmnIAE 5001200000Xyi38AAB 2016-03-04 13:02:19
3 001A000000aOlebIAC 5001200000XyE0TAAV 2016-03-04 13:02:09
5 001A0000013XIPoIAO 5001200000XyG0LAAV 2016-03-04 13:02:12
7 001A000000aOkIoIAK 5001200000XyLT3AAN 2016-03-04 13:02:12
9 001A000000m5pCAIAY 5001200000XyKhsAAF 2016-03-04 13:02:12
11 001A000000yLcL4IAK 5001200000Xyg2wAAB 2016-03-04 13:02:12
....
12473746 001A000000aOkumIAC 5001200000gXsWHAA0 2017-05-02 16:20:59
I tried to merge these two DataFrames with this line of code:
case = pd.merge(limdata, hist, left_on='SACC_ID',right_on='SACC_PS')
But I got a memory-related error:
MemoryError                               Traceback (most recent call last)
in ()
----> 1 case = pd.merge(limdata, hist, left_on='SACC_ID',right_on='SACC_PS')

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/reshape/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     56                          copy=copy, indicator=indicator,
     57                          validate=validate)
---> 58     return op.get_result()
     59
     60

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/reshape/merge.py in get_result()
    594             [(ldata, lindexers), (rdata, rindexers)],
    595             axes=[llabels.append(rlabels), join_index],
--> 596             concat_axis=0, copy=self.copy)
    597
    598         typ = self.left._constructor

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/internals.py in concatenate_block_managers(mgrs_indexers, axes, concat_axis, copy)
   5201         else:
   5202             b = make_block(
-> 5203                 concatenate_join_units(join_units, concat_axis, copy=copy),
   5204                 placement=placement)
   5205         blocks.append(b)

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/internals.py in concatenate_join_units(join_units, concat_axis, copy)
   5336         concat_values = to_concat[0]
   5337         if copy and concat_values.base is not None:
-> 5338             concat_values = concat_values.copy()
   5339     else:
   5340         concat_values = _concat._concat_compat(to_concat, axis=concat_axis)

MemoryError:
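In case it helps with the diagnosis, here is a minimal sketch of how the number of rows in the merged result could be estimated without building it. My guess is that repeated keys on both sides make the inner join much larger than either input, but I have not confirmed this. The helper name estimate_inner_join_rows and the two small demo frames are made up for illustration and are not my real data; the estimate also ignores missing keys, since value_counts drops NaN by default.

```python
import pandas as pd


def estimate_inner_join_rows(left, right, left_on, right_on):
    """Estimate the row count of an inner merge without materialising it."""
    # Count how often each key value occurs on each side (NaN keys are dropped).
    left_counts = left[left_on].value_counts()
    right_counts = right[right_on].value_counts()
    # Only keys present in both frames survive an inner join.
    common = left_counts.index.intersection(right_counts.index)
    # Each shared key contributes (left occurrences) * (right occurrences) rows.
    return int((left_counts.loc[common] * right_counts.loc[common]).sum())


# Hypothetical toy frames that reuse the column names from the question.
limdata_demo = pd.DataFrame({
    "SACC_ID": ["A", "A", "B", "C"],
    "OPPLINE_LINE_ID": ["o1", "o2", "o3", "o4"],
})
hist_demo = pd.DataFrame({
    "SACC_PS": ["A", "A", "A", "B", "D"],
    "CASE_ID": ["c1", "c2", "c3", "c4", "c5"],
})

estimated = estimate_inner_join_rows(limdata_demo, hist_demo, "SACC_ID", "SACC_PS")
actual = len(pd.merge(limdata_demo, hist_demo, left_on="SACC_ID", right_on="SACC_PS"))
print(estimated, actual)
```

On the real data the call would be estimate_inner_join_rows(limdata, hist, 'SACC_ID', 'SACC_PS'). If the estimate is far above the roughly 12 million rows of hist, that would suggest the size of the join result itself, rather than the inputs, is what exhausts memory.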
Can you help me solve this problem? Thanks in advance.
Best regards