Using Cygwin (CYGWIN_NT-6.1), I am comparing two files (file11.csv: 25.82 million rows; file22.csv: 4.1 million rows) by running this command:

awk -F "," 'NR==FNR{a[$2]=$0;next}{print (a[$1]?a[$1]:"NotFound,NotFound") "," $0}' file11.csv file22.csv > Op_file33.csv

I get this error:
awk: cmd. line:1: (FILENAME=- FNR= fatal: more_nodes: nextfree: can't allocate 4000 bytes of memory (Cannot allocate memory)

The error file (gawk.exe.stackdump) contains:

Stack trace:
Frame     Function  Args
002299A0  7710F003  (00000118,0000EA60,00000000,00229AD4)
002299B4  7710EFB2  (00000118,0000EA60,000000A4,00229AB0)
00229AD4  610DBE29  (00000000,00000000,00229AD0,00229BC4)
00229BC4  610D915E  (00000000,61102FA2,003B0023,00230000)
00229C24  610D962E  (20000038,00000000,00229C64,00000006)
00229CD4  610D9780  (00000500,00000006,00229D04,0022CE64)
00229CF4  610D97AC  (00000006,0022CE80,0022CE64,0042DA60)
00229D24  610D9A85  (0044F0F4,00000503,00000000,00000000)
00229D44  0042B773  (00000004,00000001,00229E64,6103118A)
00229D54  691013B2  (0000000B,00229DB8,00000000,00000000)
00229E64  6103118A  (00000118,0000EA60,000000A4,00229F60)
00229F84  610DBEE2  (00000004,0022A060,0000001C,00000000)
0022A094  610314F0  (0022A180,0022FF14,0022A19C,0022A154)
0022A0B8  77AE65F9  (0022A180,0022FF14,0022A19C,0022A154)
0022A168  77AE65CB  (0022A180,0022A19C,0022A180,0022A19C)
0022A498  77AE6457  (00000000,00000000,0022A4DC,77AF3B27)
End of stack trace (more stack frames may be present)
Answer 0 (score: 1)
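You are currently storing the larger file (file11.csv, about 25M rows) in memory instead of the smaller one (file22.csv, about 4M rows).
Just reverse the logic: store file22.csv in the array and compare it against the lines of file11.csv as you read them. The array then holds roughly 4M entries rather than 25M, and you will probably be fine.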
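
The reversed logic could look roughly like the sketch below. It is an illustration under stated assumptions, not a drop-in replacement: the function name join_small_in_memory is made up for this example, keys in column 1 of the smaller file are assumed to be unique, and the output order differs from the original command (matched rows come out in the order of the larger file, and unmatched rows of the smaller file are printed at the end in no particular order). It also tests membership with "in" instead of a[$1], because merely referencing a[$1] creates an empty array element for every key that is looked up.

```shell
# Hypothetical helper: $1 = smaller file (kept in memory), $2 = larger file (streamed).
join_small_in_memory() {
    awk -F "," '
    NR == FNR {                 # first file: the smaller one
        small[$1] = $0          # keyed on column 1; assumes keys are unique
        next
    }
    $2 in small {               # second file: the larger one, one line at a time
        print $0 "," small[$2]  # larger-file row, then the matching smaller-file row
        matched[$2] = 1         # remember which keys found a partner
    }
    END {                       # smaller-file rows that never matched
        for (k in small)
            if (!(k in matched))
                print "NotFound,NotFound," small[k]
    }
    ' "$1" "$2"
}
```

With the files from the question, it would be called as: join_small_in_memory file22.csv file11.csv > Op_file33.csv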