Hi, I have 3 CSV files, shown below one after another (each starts with its own header row):
datetime, forecast
2016-02-02 00:00:00, 23.34
2016-02-02 00:10:00, 29.23
timestamp, forecast, v1, v2
2016-02-02 00:00:00, 68.56, 012, .23
2016-02-02 00:10:00, 23.24, .25, .32
timestamp, forecast[ma], v1
2016-02-02 00:00:00, 56.32, 32
2016-02-02 00:10:00, 25.21, 56
I want my output to look like this:
Time, Forecast, forecast1, forecast2
2016-02-02 00:00:00, 23.34, 68.56, 56.32
2016-02-02 00:10:00, 29.23, 23.24, 25.21
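I wrote Python code that combines these files into an xlsx file. I now plan to process the data further in the shell, so I would like the combined result saved as a CSV instead.
I tried something like this: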
join -j 2 -o 1.1,1.2,2.2 <(sort -k2 $path_DMS/$file_name) <(sort -k2 $path_ISRO/$file_name)
Thanks.
Answer 0 (score: 1)
Please try the following (this should work in most awk implementations).
awk '
BEGIN{
  FS=OFS=", "
  print "Time, Forecast, forecast1, forecast2"
}
FNR==1{
  ++count
  next
}
count==1{
  a[$1]=$2
  next
}
count==2{
  a[$1]=a[$1] OFS $2
  next
}
count==3{
  print $1,a[$1],$2
}' file1.csv file2.csv file3.csv
The output is as follows.
Time, Forecast, forecast1, forecast2
2016-02-02 00:00:00, 23.34, 68.56, 56.32
2016-02-02 00:10:00, 29.23, 23.24, 25.21
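The same output can also be sketched with `join`, which is what the question attempted. With the default whitespace separator, `join` sees `2016-02-02` as field 1 and `00:00:00,` as field 2, so `-j 2 -o 1.1,1.2,2.2` never reaches the forecast column, and the header rows are treated as data. The sketch below uses `-t,` instead, so the whole timestamp is field 1, and strips the headers first. The function name `join_forecasts` and the file names `a.csv`, `b.csv`, `c.csv` are made up for illustration; it assumes exactly three files, each with one header row, and that `mktemp -d` is available.

```shell
#!/bin/sh
# Hypothetical helper: join column 2 of three CSV files on column 1 (the timestamp).
join_forecasts() {
    tmp=$(mktemp -d) || return 1
    i=0
    for f in "$1" "$2" "$3"; do
        i=$((i + 1))
        # Drop the header row, then sort on the timestamp, as join requires.
        tail -n +2 "$f" | LC_ALL=C sort -t, -k1,1 > "$tmp/$i"
    done
    echo "Time, Forecast, forecast1, forecast2"
    # With -t, each value keeps its leading space, so the output stays ", "-separated.
    LC_ALL=C join -t, -o 0,1.2,2.2 "$tmp/1" "$tmp/2" |
        LC_ALL=C join -t, -o 0,1.2,1.3,2.2 - "$tmp/3"
    rm -rf "$tmp"
}

# Sample data copied from the question (file names are assumptions).
work=$(mktemp -d)
printf '%s\n' 'datetime, forecast' \
    '2016-02-02 00:00:00, 23.34' \
    '2016-02-02 00:10:00, 29.23' > "$work/a.csv"
printf '%s\n' 'timestamp, forecast, v1, v2' \
    '2016-02-02 00:00:00, 68.56, 012, .23' \
    '2016-02-02 00:10:00, 23.24, .25, .32' > "$work/b.csv"
printf '%s\n' 'timestamp, forecast[ma], v1' \
    '2016-02-02 00:00:00, 56.32, 32' \
    '2016-02-02 00:10:00, 25.21, 56' > "$work/c.csv"

joined=$(join_forecasts "$work/a.csv" "$work/b.csv" "$work/c.csv")
printf '%s\n' "$joined"
rm -rf "$work"
```

Because `join` only emits keys present in both inputs, a timestamp missing from any of the three files is dropped from the result.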
Explanation: the same awk program, with a detailed comment on each line.
awk ' ##Start of the awk program.
BEGIN{ ##The BEGIN block runs once, before any input file is read.
  FS=OFS=", " ##Set the input (FS) and output (OFS) field separators to ", "; see man awk.
  print "Time, Forecast, forecast1, forecast2" ##Print the header row of the output.
} ##End of the BEGIN block.
FNR==1{ ##True on the first line (the header row) of each input file.
  ++count ##Increment count, which tracks which file is being read.
  next ##Skip the remaining rules and move on to the next line.
} ##End of the FNR==1 block.
count==1{ ##While reading file1.csv:
  a[$1]=$2 ##Store $2 in array a, indexed by the timestamp $1.
  next ##Skip the remaining rules and move on to the next line.
} ##End of the count==1 block.
count==2{ ##While reading file2.csv:
  a[$1]=a[$1] OFS $2 ##Append $2 to the value stored from file1.csv, separated by OFS.
  next ##Skip the remaining rules and move on to the next line.
} ##End of the count==2 block.
count==3{ ##While reading file3.csv:
  print $1,a[$1],$2 ##Print the timestamp, the stored values from file1.csv and file2.csv, and $2 of the current line.
}' file1.csv file2.csv file3.csv ##The input files, in the order they are read.
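The program explained above is tied to exactly three files and prints rows in the order of file3.csv. A minimal sketch of a generalization follows, assuming every file has one header row, uses `", "` as its separator, holds the value of interest in column 2, and has no repeated timestamps within a file. The function name `merge_forecasts` and the sample file names are hypothetical. It keeps the row order of the first file and only prints timestamps that occur in every file.

```shell
#!/bin/sh
# Hypothetical helper: merge column 2 of any number of CSV files on column 1.
merge_forecasts() {
    awk '
    BEGIN { FS = OFS = ", " }
    FNR == 1 { ++count; next }          # header row: count the file, skip the line
    count == 1 { order[++n] = $1 }      # remember the row order of the first file
    {
        # Start the row with the first file, append one value per later file.
        row[$1] = (count == 1 ? $2 : row[$1] OFS $2)
        seen[$1]++                      # number of files containing this timestamp
    }
    END {
        header = "Time" OFS "Forecast"
        for (i = 1; i < count; i++) header = header OFS "forecast" i
        print header
        for (i = 1; i <= n; i++)
            if (seen[order[i]] == count)    # only timestamps found in every file
                print order[i], row[order[i]]
    }' "$@"
}

# Sample data copied from the question (file names are assumptions).
work=$(mktemp -d)
printf '%s\n' 'datetime, forecast' \
    '2016-02-02 00:00:00, 23.34' \
    '2016-02-02 00:10:00, 29.23' > "$work/file1.csv"
printf '%s\n' 'timestamp, forecast, v1, v2' \
    '2016-02-02 00:00:00, 68.56, 012, .23' \
    '2016-02-02 00:10:00, 23.24, .25, .32' > "$work/file2.csv"
printf '%s\n' 'timestamp, forecast[ma], v1' \
    '2016-02-02 00:00:00, 56.32, 32' \
    '2016-02-02 00:10:00, 25.21, 56' > "$work/file3.csv"

merged=$(merge_forecasts "$work/file1.csv" "$work/file2.csv" "$work/file3.csv")
printf '%s\n' "$merged"
rm -rf "$work"
```

The header is printed in `END` rather than `BEGIN` because the number of forecast columns is only known once all files have been counted.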