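I have been trying to combine two CSV files containing alphanumeric data, based on a column they share, so that I can do something like a database join from the terminal.
Here is what I tried (the first column is identical in both files):
join -t, -1 1 -2 1 file_1.csv file_2.csv > file_3.csv
The merge runs and the columns are combined, but not in the format I want.
Problem: file_3 contains the rows from both files, separated by a comma, but the two halves of each row end up on different lines.
Example: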
Columns from file_1
,Columns from file_2
Row1 from file_1
,Row1 from file_2
Row2 from file_1
,Row2 from file_2
How can I get each merged row in file_3 onto a single line? Any pointers would be appreciated.
file_1.csv (sample data):
Id,Age,Employment,Education,Marital,Occupation,Income,Gender,Deductions,Hours,Adjusted
1,38,Private,College,Unmarried,Service,81838,Female,0,72,0
2,35,Private,Associate,Absent,Transport,72099,Male,0,30,0
3,32,Private,HSgrad,Divorced,Clerical,154676.74,Male,0,40,0
file_2.csv (sample data):
Id,Adjusted,Predicted_Adjusted,Probability_0,Probability_1
1,0,0,0.952957896225136,0.0470421037748636
2,0,0,0.973664421132328,0.0263355788676716
3,0,0,0.966224074718457,0.0337759252815426
Incorrect join output:
Id,Age,Employment,Education,Marital,Occupation,Income,Gender,Deductions,Hours,Adjusted
,Adjusted,Predicted_Adjusted,Probability_0,Probability_1
1,38,Private,College,Unmarried,Service,81838,Female,0,72,0
,0,0,0.952957896225136,0.0470421037748636
2,35,Private,Associate,Absent,Transport,72099,Male,0,30,0
,0,0,0.973664421132328,0.0263355788676716
3,32,Private,HSgrad,Divorced,Clerical,154676.74,Male,0,40,0
,0,0,0.966224074718457,0.0337759252815426
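Expected output: each pair of lines above is really one row, so the output should not split a row across two lines. It should be a single, uniform merge of the two CSV files, file_1 and file_2.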
Answer 0 (score: 2)
Do the files have Windows line endings (\r\n)?
Could you try dos2unix file_1.csv and dos2unix file_2.csv?
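If the files do have Windows line endings, the carriage return at the end of each file_1 line lands in the middle of the joined line, which many editors and terminals display as a line break. The following is a minimal sketch of how one might check for that and strip the carriage returns without dos2unix. It assumes a POSIX shell, builds two tiny stand-in files in a scratch directory instead of touching the real data, and the .unix.csv names are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Stand-in CSVs with Windows (CRLF) line endings, cut down from the question.
printf 'Id,Age\r\n1,38\r\n2,35\r\n' > file_1.csv
printf 'Id,Adjusted\r\n1,0\r\n2,0\r\n' > file_2.csv

# Count lines that end in a carriage return; a non-zero count suggests CRLF.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" file_1.csv)
echo "file_1.csv lines ending in CR: $crlf_lines"

# Strip every carriage return into cleaned copies (illustrative names).
tr -d '\r' < file_1.csv > file_1.unix.csv
tr -d '\r' < file_2.csv > file_2.unix.csv

# Same join command as in the question, now run on the cleaned copies.
join -t, -1 1 -2 1 file_1.unix.csv file_2.unix.csv > file_3.csv
cat file_3.csv
```

With the carriage returns gone, each joined record should come out on one line. Running dos2unix on the original files would achieve the same thing in place.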
Answer 1 (score: 1)
This should work:
join -t , -1 1 -2 1 file_1.csv file_2.csv|paste -d' ' - - > file_3.csv
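To illustrate what the paste step does, here is a small sketch on made-up input in which every record really is split across two newline-terminated lines. It also shows a variation, not part of the answer, that uses the POSIX '\0' delimiter (meaning "no delimiter") so that no space is left before the comma. Note the assumption: if the split is caused by stray carriage returns rather than real newlines, paste would glue unrelated records together instead.

```shell
#!/bin/sh
# Made-up join output where each record is split over two real lines.
split_output=$(printf '%s\n' 'Id,Age' ',Adjusted' '1,38' ',0')

# The answer's approach: glue each pair of lines together with a space.
with_space=$(printf '%s\n' "$split_output" | paste -d' ' - -)
echo "$with_space"

# Variation: '\0' in a paste delimiter list stands for the empty string.
no_gap=$(printf '%s\n' "$split_output" | paste -d'\0' - -)
echo "$no_gap"
```

The first form yields lines such as "1,38 ,0", while the second yields "1,38,0", which is usually what a CSV consumer expects.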