I have the following:
file1.csv
"Id","clientName1","clientName2"
file2.csv
"Id","Name1","Name2"
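I want to read file1 sequentially. For each record, I want to check whether a matching Id exists in file2. There may be multiple matches. For each match, I want to append Name1, Name2 to the record from file1.csv.
So, if a record has multiple matches in file2, the result might look like: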
"Id","clientName1","clientName2","Name1","Name2","Name1","Name2"
Answer 0 (score: 0)
I'm afraid bash may not be an efficient solution, but the following bash script will work:
#!/bin/bash
# Collect the fields that follow the Id, keyed by Id, from both files.
declare -A id_hash

# Pass 1: file1.csv
while IFS= read -r line; do
    id=${line%%,*}      # first field
    name=${line#*,}     # everything after the first comma
    if [ -z "${id_hash[$id]}" ]; then
        id_hash[$id]=$name
    else
        id_hash[$id]=${id_hash[$id]},$name
    fi
done < file1.csv

# Pass 2: file2.csv, appended to whatever file1.csv contributed
while IFS= read -r line; do
    id=${line%%,*}
    name=${line#*,}
    if [ -z "${id_hash[$id]}" ]; then
        id_hash[$id]=$name
    else
        id_hash[$id]=${id_hash[$id]},$name
    fi
done < file2.csv

# Print one merged record per Id (associative arrays have no guaranteed order).
for id in "${!id_hash[@]}"; do
    printf '%s,%s\n' "$id" "${id_hash[$id]}"
done
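The script above merges every Id from both files and prints them in whatever order bash iterates the associative array. If the goal is what the question describes (walk file1.csv in order and append only the matching name fields from file2.csv), a variant along these lines may be closer. It is only a sketch: the function name append_matches is made up for illustration, it assumes bash 4 or later for associative arrays, and it assumes no field contains an embedded comma.

```shell
#!/bin/bash
# Hypothetical helper: append_matches FILE1 FILE2
# Prints each FILE1 record in its original order, followed by the
# name fields of every FILE2 record that has the same Id.
append_matches() {
    local file1=$1 file2=$2 line id
    local -A names=()

    # Pass 1: index FILE2 by Id; names[$id] grows by ",Name1,Name2" per match.
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        id=${line%%,*}
        names[$id]+=",${line#*,}"
    done < "$file2"

    # Pass 2: stream FILE1 and append whatever was collected for its Id.
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        id=${line%%,*}
        printf '%s%s\n' "$line" "${names[$id]}"
    done < "$file1"
}
```

It could be called as append_matches file1.csv file2.csv > merged.csv. Since both header rows start with "Id", the file2 header fields get appended to the file1 header once, regardless of how many matches a data row has.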
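Answer 1 (score: 0)
In reply to the OP's clarification in his/her comment, here is a revised version of the single awk command. It merges records from both files when file1 or file2 contains duplicate IDs, even if the records have different numbers of fields. An older version of this answer works for the OP's question as currently stated.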
awk -F',' '{one=$1;$1="";a[one]=a[one]$0} END{for (i in a) print i""a[i]}' OFS=, file[12]
Input:
file1
"Id1","clientN1","clientN2"
"Id2","Name3","Name4"
"Id3","client00","client01","client02"
"Id1","client1","client2","client3"
file2
"Id1","Name1","Name2"
"Id1","Name3","Name4"
"Id2","Name0","Name1"
"Id2","Name00","Name11","Name22"
Output, with file1 and file2 merged on the same ID:
"Id1","clientN1","clientN2","client1","client2","client3","Name1","Name2","Name3","Name4"
"Id2","Name3","Name4","Name0","Name1","Name00","Name11","Name22"
"Id3","client00","client01","client02"
Answer 2 (score: 0)
Using join and GNU sed:
join -t , -a 1 file[12].csv | sed -r '$!N;/^(.*,)(.*)\n\1/!P;s//\n\1\2,/;D'
This assumes file1.csv and file2.csv are both sorted by id and have no header row.
file1.csv
1,c11,c12
2,c21,c22
3,c31,c32
file2.csv
1,n11,n12
1,n21,n22
1,n31,n32
2,n41,n42
gives the result:
1,c11,c12,n11,n12,n21,n22,n31,n32
2,c21,c22,n41,n42
3,c31,c32
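join requires both inputs to be sorted on the join field, and the sample files here have no header. If the real files carry a header row and are not sorted, something like the following could prepare them first. It is a sketch under assumptions: the function names sorted_body and join_merge are invented here, process substitution needs bash, sed -r needs GNU sed, and sorting reorders file1, so the original record order is not kept.

```shell
#!/bin/bash
# Hypothetical helper: drop the header row, then sort on the first
# comma-separated field (-s keeps the file order for records with equal ids).
sorted_body() {
    tail -n +2 "$1" | LC_ALL=C sort -s -t, -k1,1
}

# Left join on field 1, keeping file1 records that have no match (-a 1),
# then fold consecutive lines sharing the file1 prefix with the sed script above.
join_merge() {
    LC_ALL=C join -t, -a 1 <(sorted_body "$1") <(sorted_body "$2") |
        sed -r '$!N;/^(.*,)(.*)\n\1/!P;s//\n\1\2,/;D'
}
```

Usage would be join_merge file1.csv file2.csv. The header rows are discarded rather than merged, so a header would have to be written separately if one is needed.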
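Update
If file1.csv may contain duplicate IDs and records with varying numbers of fields, I suggest preprocessing file1.csv to make sure it is clean (one record per ID) before joining it with file2.csv: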
awk -F, '{for(i=2;i<=NF;i++) print $1 FS $i}' file1.csv |\
sort -u |\
sed -r '$!N;/^(.*,)(.*)\n\1/!P;s//\n\1\2,/;D'
Here sort -u sorts the (id, field) pairs and removes duplicates.
Input:
1,c11,c12
1,c12,c14,c13
1,c15,c12
2,c21,c22
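To see what the preprocessing does, the pipeline can be wrapped in a small function and run on the input above. The function name dedupe_by_id is made up for this sketch, sed -r assumes GNU sed as in the answer, and LC_ALL=C is added only to make the sort order predictable.

```shell
# Hypothetical wrapper around the preprocessing pipeline shown above.
dedupe_by_id() {
    # 1. Emit one "id,field" pair per field of every record.
    # 2. Sort the pairs and drop duplicates.
    # 3. Fold consecutive pairs that share an id back into one record.
    awk -F, '{for(i=2;i<=NF;i++) print $1 FS $i}' "$1" |
        LC_ALL=C sort -u |
        sed -r '$!N;/^(.*,)(.*)\n\1/!P;s//\n\1\2,/;D'
}
```

For the four input lines above, dedupe_by_id file1.csv should print 1,c11,c12,c13,c14,c15 followed by 2,c21,c22. The fields of each record come out in sorted order, not in their original order.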
Answer 3 (score: 0)
Thanks, everyone, but I've already got it done. The code I wrote is as follows:
#!/bin/bash
echo
echo 'Merging files into one'
# Outer loop: one record of file1.csv per iteration, split on commas.
while IFS=, read -r id lname fname dnaid status type program startdt enddt ref email dob age add1 add2 city postal phone1 phone2
do
    var="$dnaid,$lname,$fname,$status,$type,$program,$startdt,$enddt,$ref,$email,$dob,$age,$add1,$add2,$city,$postal,$phone1,$phone2"
    # Inner loop: rescan file2.csv and append the names from every record with the same id.
    while IFS=, read -r id2 cwlname cwfname
    do
        if [ "$id" = "$id2" ]
        then
            var="$var,$cwlname,$cwfname"
        fi
    done < file2.csv
    echo "$var" >> /root/scijoinedfile.csv
done < file1.csv
echo
echo "Merging completed"