Question

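I'm trying to read large text files containing thousands of lines, and to make it faster I tried using pandas. Below is the concept of what I want my code to do; I'm not sure how to use loops with pandas data. Let me know if it makes sense to have a program do it this way; I'm trying to cut down the run time. Thanks.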

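import pandas as pd

# on_bad_lines='skip' replaces error_bad_lines=False in pandas 1.3 and later
df1 = pd.read_csv('FILENAME1', sep=',', on_bad_lines='skip')
df2 = pd.read_csv('FILENAME2', sep=',', on_bad_lines='skip')
for index, row in df1.iterrows():
    for index2, row2 in df2.iterrows():
        # the original indexed `index` (the row label); comparing row values is presumably what was meant
        if row.iloc[1] == row2.iloc[2] and row.iloc[0] == row2.iloc[1]:
            print("this info matches")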

Answer 1

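In my opinion, if run time matters and you only need to perform the computation shown in your code, don't use pandas. Pandas will spend extra cycles setting itself up, cleaning the data, and so on.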
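
As a rough illustration of what a pandas-free version could look like, here is a minimal sketch using only the standard library's csv module. It assumes the intent of the question's code is to find rows of the first file whose columns 0 and 1 equal columns 1 and 2 of some row in the second file, and that both files have a header line. The helper names load_rows and find_matches are made up for this example. Building a set of keys from the second file means each row of the first file needs a single lookup rather than a full pass over the second file.

```python
import csv


def load_rows(path):
    """Read a comma-separated file into a list of rows, skipping the first line.

    Assumption: the first line is a header, which is also what
    pd.read_csv assumes by default.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # drop the header line if there is one
        return list(reader)


def find_matches(rows1, rows2):
    """Return the rows of rows1 whose columns (0, 1) equal columns (1, 2)
    of at least one row in rows2. Rows that are too short are ignored."""
    # One pass over rows2 to collect every (col1, col2) pair.
    keys = {(r[1], r[2]) for r in rows2 if len(r) > 2}
    # One pass over rows1; each membership test is a hash lookup.
    return [r for r in rows1 if len(r) > 1 and (r[0], r[1]) in keys]
```

Calling find_matches(load_rows('FILENAME1'), load_rows('FILENAME2')) would then give the matching rows of the first file. Two differences from the nested loop are worth keeping in mind: values are compared as strings (so "1" and "1.0" do not match, whereas pandas may parse both as numbers), and a row of the first file is reported once even if several rows of the second file match it.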
