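I searched the questions that have already been asked but couldn't find one that matches what I'm trying to solve.
I'm on a Mac, using Terminal. I'd like this to run as part of another script written in bash.
I have a CSV file containing a single column. Under each "header" there will be a varying number of devices, depending on the output. The headers (SerialNumbers, DeviceName, PurchaseDate) will always stay the same.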
SerialNumbers
A1B2C3D4E5F6
SASIUWOI9828
I3I6K36H78SK
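DeviceName
This one has a short name
This one has a long name
This one has a medium name
PurchaseDate
2016-02-19
2016-02-01
2016-02-12
Desired output
SerialNumbers,DeviceName,PurchaseDate
A1B2C3D4E5F6,This one has a short name,2016-02-19
SASIUWOI9828,This one has a long name,2016-02-01
I3I6K36H78SK,This one has a medium name,2016-02-12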
Here is my source file, in case it helps:
https://www.dropbox.com/s/wapjqbi1v3oah3p/tobecorrected.csv?dl=0
Answer 0: (score: 1)
I'm not sure whether pr exists on your OS, but this is the simplest way:
$ pr -3ts, file
SerialNumbers,DeviceName,PurchaseDate
A1B2C3D4E5F6,This one has a short name,2016-02-19
SASIUWOI9828,This one has a long name,2016-02-01
I3I6K36H78SK,This one has a medium name,2016-02-12
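As a rough sketch of why this works: `-3` asks `pr` for three columns filled top to bottom, `-t` suppresses the page header and trailer, and `-s,` joins the columns with a comma. The catch is that `pr` only counts lines, so the result lines up only when every header is followed by the same number of entries. The temporary file below is made up for the demonstration and simply recreates the sample from the question.

```shell
# Recreate the question's sample in a temporary file (hypothetical test data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
SerialNumbers
A1B2C3D4E5F6
SASIUWOI9828
I3I6K36H78SK
DeviceName
This one has a short name
This one has a long name
This one has a medium name
PurchaseDate
2016-02-19
2016-02-01
2016-02-12
EOF

# 12 lines spread over 3 columns gives 4 rows, so each header tops a column.
out=$(pr -3 -t -s, "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

If one device were missing a purchase date, the 11 remaining lines would no longer split on the header boundaries, and one of the other approaches would be a better fit.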
Answer 1: (score: 0)
Assuming the headers always appear in the same order, you can use the following script, convert.sh:
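#!/bin/bash
# Each awk call extracts one column: its header plus the lines up to the next header.
C1="$(awk '/SerialNumbers/{flag=1}/DeviceName/{flag=0}flag' "$1")"
C2="$(awk '/DeviceName/{flag=1}/PurchaseDate/{flag=0}flag' "$1")"
C3="$(awk '/PurchaseDate/,0' "$1")"
# Use -d before the operands; the long --delimiters form is GNU-only and fails with the paste on macOS.
paste -d ',' <(echo "$C1") <(echo "$C2") <(echo "$C3")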
Example:
./convert.sh test.txt
Output:
SerialNumbers,DeviceName,PurchaseDate
A1B2C3D4E5F6,This one has a short name,2016-02-19
SASIUWOI9828,This one has a long name,2016-02-01
I3I6K36H78SK,This one has a medium name,2016-02-12
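To see what the `flag` idiom in each awk call is doing, here is a small illustrative run on inline data (not the asker's file). A matching start pattern switches printing on, the next header switches it off, and the bare `flag` at the end prints the line whenever it is non-zero. Because the stop pattern is tested before `flag`, the next header itself is not printed.

```shell
# Illustrative input only: two headers with made-up values under them.
out=$(printf '%s\n' DeviceName alpha beta PurchaseDate 2016-02-19 |
      awk '/DeviceName/{flag=1} /PurchaseDate/{flag=0} flag')
printf '%s\n' "$out"
# prints DeviceName, alpha and beta, one per line
```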
Answer 2: (score: 0)
This awk will handle the headers in any order, with variable-length data following each header:
awk '
/SerialNumbers/ {sn=1; dn=0; pd=0}
/DeviceName/    {sn=0; dn=1; pd=0}
/PurchaseDate/  {sn=0; dn=0; pd=1}
sn==1 {snl[++snc]=$0}
dn==1 {dnl[++dnc]=$0}
pd==1 {pdl[++pdc]=$0}
END {
    max=snc>dnc?snc:dnc;
    max=pdc>max?pdc:max;
    for (i=1;i<=max;i++)
        print snl[i]","dnl[i]","pdl[i]
}' file
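A quick way to check the variable-length claim is to feed the same program a column that is shorter than the others, with the headers in a different order. The input below is invented for illustration. Note that the output columns always come out in the fixed order serial, name, date, whatever order the headers arrive in.

```shell
# Made-up input: headers out of order, and DeviceName has only one entry.
out=$(printf '%s\n' PurchaseDate 2016-02-19 2016-02-01 \
                    SerialNumbers A1B2C3D4E5F6 SASIUWOI9828 \
                    DeviceName 'This one has a short name' |
awk '
/SerialNumbers/ {sn=1; dn=0; pd=0}
/DeviceName/    {sn=0; dn=1; pd=0}
/PurchaseDate/  {sn=0; dn=0; pd=1}
sn==1 {snl[++snc]=$0}
dn==1 {dnl[++dnc]=$0}
pd==1 {pdl[++pdc]=$0}
END {
    max = (snc > dnc) ? snc : dnc
    max = (pdc > max) ? pdc : max
    for (i = 1; i <= max; i++)
        print snl[i] "," dnl[i] "," pdl[i]
}')
printf '%s\n' "$out"
```

The missing device name simply leaves an empty field, so the last row reads SASIUWOI9828,,2016-02-01.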
Edit
Given your example file, you can do:
awk '/^[[:alnum:]]+:/ {sub(/:/,""); idx=$0; arr[idx]=$0; next}
{arr[idx]=arr[idx]","$1}
END{
for (id in arr) print arr[id]}' file.txt | rs -c',' -C',' -T | sed 's/,$//'
Prints:
serialNumber,bluetoothAddress,wifiAddress,enclosureColor,totalDiskCapacity
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQF,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQG,0.214583,0.214583,#b4b5b9,1585
DMPQG,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQG,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
If your fields contain spaces, replace {arr[idx]=arr[idx]","$1} with:
{
sub(/^[[:space:]]+/,"")
sub(/[[:space:]]+$/,"")
arr[idx]=arr[idx]","$0
}
Which then prints:
serialNumber,bluetoothAddress,wifiAddress,enclosureColor,totalDiskCapacity
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQF,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQG,0.214583,0.214583,#b4b5b9,1585
DMPQG,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583 B59,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQG,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
DMPQD,0.214583,0.214583,#b4b5b9,1585
(Note the longer line with B59 added.)
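On systems where `rs` is not installed (it ships with macOS and the BSDs but is often missing on Linux), the transpose can be done inside awk itself. The following is a minimal sketch rather than the method used above: it assumes each header sits on its own line in the form `name:` and that every line after it is one value, and the sample data is made up. Storing cells by row and column also keeps the columns in input order, which `for (id in arr)` does not guarantee.

```shell
# Hypothetical input in the "name:" layout assumed by this sketch.
out=$(printf '%s\n' 'serialNumber:' '  DMPQD' '  DMPQF' \
                    'enclosureColor:' '  #b4b5b9' '  #b4b5b9' |
awk '
/^[A-Za-z0-9]+:$/ {                 # header line such as "serialNumber:"
    sub(/:$/, "")
    cell[1, ++ncol] = $0            # row 1 holds the header names
    row = 1
    if (maxrow < 1) maxrow = 1
    next
}
ncol {                              # data line under the current header
    sub(/^[ \t]+/, "")              # trim leading whitespace
    sub(/[ \t]+$/, "")              # trim trailing whitespace
    cell[++row, ncol] = $0
    if (row > maxrow) maxrow = row
}
END {
    for (r = 1; r <= maxrow; r++) {
        line = cell[r, 1]
        for (c = 2; c <= ncol; c++) line = line "," cell[r, c]
        print line
    }
}')
printf '%s\n' "$out"
```

Columns with fewer values than the longest one are padded with empty fields rather than shifting the rows.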
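Answer 3: (score: 0)
Just for variety, here is a solution that doesn't use awk. Note that the input file needs a trailing newline for the output to be correct, and I assume the headers and their order are known in advance (otherwise the first if statement would need to change).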
#!/bin/bash
filename="$1"
declare -a arr=("SerialNumbers" "DeviceName" "PurchaseDate")
# Associative arrays need bash 4+; the /bin/bash shipped with macOS is 3.2.
declare -A output
col=0
while read -r line
do
    if [[ "${arr[$col]}" == "$line" ]]; then # header
        col=$((col+1))
        row=1
        output[$((row-1)),$((col-1))]=$line
    else
        output[$row,$((col-1))]=$line
        row=$((row+1))
    fi
done < "$filename"
# print results
for ((i=0;i<row;i++)); do
    for ((j=0;j<col;j++)); do
        # Explicit format string, so a % in the data is not read as a directive.
        printf '%s' "${output[$i,$j]}"
        if (( j < col-1 )); then
            printf ","
        fi
    done
    echo
done
Output:
$ ./script.sh example.txt
SerialNumbers,DeviceName,PurchaseDate
A1B2C3D4E5F6,This one has a short name,2016-02-19
SASIUWOI9828,This one has a long name,2016-02-01
I3I6K36H78SK,This one has a medium name,2016-02-12