MemoryError when performing an inner merge in pandas

Time: 2020-08-05 10:31:25

Tags: python pandas merge

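I am trying to merge two files with pandas, one of which is large (6 GB). Every attempt fails with a MemoryError, probably because my RAM (8 GB) is too small to handle it. Any ideas on how to solve this? My code is: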

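import pandas as pd

# FILE A is whitespace-delimited and has a header row
broad_matched = pd.read_csv("FILE A", delim_whitespace=True)
# Keep rows whose P is not >= 0.05 (rows with a missing P are kept as well)
broad_matched2 = broad_matched[~(broad_matched['P'] >= 0.05)]
# FILE B is tab-separated; column names are supplied explicitly
SNPs = pd.read_csv("FILE B",
                   sep='\t',
                   names=["#CHROM", "POS1", "POS", "rsID", "E", "F"])
# Drop the columns that are not needed in the merged table
broad_matched2 = broad_matched2.drop(columns=['LOG.OR._SE', 'ID', 'REF', 'ALT', 'ERRCODE', 'Z_STAT', 'OR', 'OBS_CT', 'TEST', 'FIRTH.', 'A1', '#CHROM'])
# Inner merge on POS, then discard rows that contain missing values
Table1 = pd.merge(broad_matched2, SNPs, on='POS', how='inner').dropna()
Table1.to_csv(r'D:/Table1', index=False)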

1 Answer:

Answer 0: (score: 0)

You should take a look at this post. The solution involves using dask dataframes.
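
If installing dask is not an option, a different approach that stays within pandas is to stream the large file with the `chunksize` argument of `pd.read_csv`, merge each chunk against the smaller table, and append the result to the output file. The following is only a minimal sketch, not the method from the linked post. It assumes the whitespace-delimited file is the large one and that the other table fits in memory; the function name `merge_in_chunks` and all of its parameters are hypothetical names chosen for illustration.

```python
import pandas as pd


def merge_in_chunks(big_path, small_df, out_path, key="POS", p_col="P",
                    p_threshold=0.05, drop_cols=(), chunksize=100_000):
    """Merge a large whitespace-delimited file against small_df chunk by chunk.

    Hypothetical helper: only one chunk of the large file is held in
    memory at a time, and matches are appended to out_path as they are found.
    Returns the number of rows written.
    """
    first = True
    rows_written = 0
    # With chunksize set, read_csv yields DataFrames instead of loading everything
    reader = pd.read_csv(big_path, sep=r"\s+", chunksize=chunksize)
    for chunk in reader:
        # Same filter as in the question: keep rows whose P is not >= threshold
        chunk = chunk[~(chunk[p_col] >= p_threshold)]
        # Drop unneeded columns early so the merge works on less data
        chunk = chunk.drop(columns=list(drop_cols))
        merged = chunk.merge(small_df, on=key, how="inner").dropna()
        # Write the header only once, then append the following chunks
        merged.to_csv(out_path, mode="w" if first else "a",
                      header=first, index=False)
        first = False
        rows_written += len(merged)
    return rows_written
```

With the names from the question, a call might look like `merge_in_chunks("FILE A", SNPs, r'D:/Table1', drop_cols=[...])`, where `drop_cols` is the list of columns you do not need. Because an inner merge on a single key can be computed independently for each slice of the left table, the concatenated chunk results contain the same rows as one large merge. A smaller `chunksize` lowers peak memory use but makes the run slower.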