How do I surround the first 10 commas with double quotes?

Asked: 2019-03-27 21:31:30

Tags: bash csv sed

This is my input.csv file:

dealerid,address,city,state,zip,vin,stocknumber,type,color,year,make,model,trim,bodystyle,fueltype,mileage,transmission,interiorcolor,interiorfabric,price,titlestatus,warranty,options_text,cylinders,engine,engineaspiration,enginetext,drivetrain,transmissiontext,mpgcity,mpghighway,features_text,vdc_url,images
TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922218113,298,Used,Red,2002,OLDSMOBILE,BRAVADA,,,,136000,AUTOMATIC,,,2200,Clear,Available,"This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.",,,,,,,,,,https://www.example.com/listings/298,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922123453,307,Used,Brown,2008,HONDA,599,,,,217538,AUTOMATIC,,,3500,Clear,Available,"This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.",,,,,,,,,,https://www.example.com/listings/211,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"  

I need to wrap every column in double quotes, so that I end up with a file like this:

"dealerid","address","city","state","zip","vin","stocknumber","type","color","year","make","model","trim","bodystyle","fueltype","mileage","transmission","interiorcolor","interiorfabric","price","titlestatus","warranty","options_text","cylinders","engine","engineaspiration","enginetext","drivetrain","transmissiontext","mpgcity","mpghighway","features_text","vdc_url","images"
"TS06095298","999 wanna Road,Windsor","CT","06095","22HDT13S922218113","298","Used","Red","2002","OLDSMOBILE","BRAVADA","","","","136000,AUTOMATIC","","","2200","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/298","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
"TS06095298","999 wanna Road,Windsor","CT","06095","22HDT13S922123453","307","Used","Brown","2008","HONDA","599","","","","217538","AUTOMATIC","","","3500","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/211","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"

The whole file is very consistent: the same columns are missing data on every row.

The images column and the features text column are already wrapped in quotes.

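Since the same information is missing on every row, I decided to add a double quote at the start of each line and then replace commas with quoted commas, but I started running into problems.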

This is what I have so far. I know the code is not very efficient, but it is a start.

#!/bin/bash

#- Temp Directories
tmp_dir="$(mktemp -d -t 'csv.XXXXX' || mktemp -d 2>/dev/null)"
tmp_input1="${tmp_dir}/temp_input1.csv"
tmp_input2="${tmp_dir}/temp_input2.csv"
tmp_input3="${tmp_dir}/temp_input3.csv"

#- Variables
client="00000"
wDir="$(pwd)"
ftpDir="${wDir}/.clientftp"
clientDir="${ftpDir}/${client}"
csvFile="${clientDir}/final.csv"
inputCsv="${wDir}/input.csv"

#  Let's begin
cd "$wDir" || exit

      cp "$inputCsv" "$tmp_input1"
      dos2unix "$tmp_input1"

      #  write the header line to a temp file: turn every comma into "," and add a double quote at the start and end of the line
      head -1 "$tmp_input1" | sed -e 's/,/","/g;s/.*/"&"/' > "$tmp_input2"

      #  write the remaining lines to a temp file, each prefixed with a double quote
      sed 1,1d "$tmp_input1" | sed "s/^/\"/" > "$tmp_input3"
      sed -i 's/",,,,,,,,,,https/","","","","","","","","","","https/g' "$tmp_input3"
      sed -i 's/,Clear,Available,"/","Clear","Available","/g' "$tmp_input3"
      sed -i 's/,,,,/","","","","/g' "$tmp_input3"
      sed -i 's/,,,/","","","/g' "$tmp_input3"

      #  Create final file
      cat "$tmp_input2" > "$csvFile"
      cat "$tmp_input3" >> "$csvFile"

      rm -rf "$tmp_dir"

      { clear; echo ""; echo "";  echo "nano $csvFile"; echo ""; }

nano "$csvFile"

This script produces:

"dealerid","address","city","state","zip","vin","stocknumber","type","color","year","make","model","trim","bodystyle","fueltype","mileage","transmission","interiorcolor","interiorfabric","price","titlestatus","warranty","options_text","cylinders","engine","engineaspiration","enginetext","drivetrain","transmissiontext","mpgcity","mpghighway","features_text","vdc_url","images"
"TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922218113,298,Used,Red,2002,OLDSMOBILE,BRAVADA","","","","136000,AUTOMATIC","","","2200","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/298,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
"TS06095298,999 wanna Road,Windsor,CT,06095,22HDT13S922123453,307,Used,Brown,2008,HONDA,599","","","","217538,AUTOMATIC","","","3500","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/211,"https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"

So now I have a couple of problems:
1- The vdc_url column has no closing quote
2- The first 10 commas need to be wrapped in double quotes

The last column can contain more than 3 images.

Any help would be greatly appreciated.

2 answers:

Answer 0 (score: 1)

I like ruby for quick CSV conversions:

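ruby -rcsv -e '
    # CSV writer on stdout that quotes every field it emits
    out = CSV.instance($stdout, force_quotes: true)
    # parse each row of the file named on the command line and re-emit it
    CSV.foreach(ARGV.shift) {|row| out << row}
' input.csv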

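Make sure no line has trailing whitespace: characters after a closing quote (such as the spaces at the end of the second data row of input.csv above) can make the CSV parser reject the line as malformed.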
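
One way to remove trailing whitespace before parsing is a small sed filter. This is only a sketch: the function name strip_trailing_ws and the output file name cleaned.csv are made up for illustration and are not part of the answer above.

```shell
# Hypothetical helper: delete whitespace at the end of every line
# (spaces, tabs, and a stray carriage return from DOS line endings).
# Reads the files given as arguments, or standard input if there are none.
strip_trailing_ws() {
    sed -e 's/[[:space:]]*$//' "$@"
}

# Assumed usage: write a cleaned copy for the CSV parser to read.
# strip_trailing_ws input.csv > cleaned.csv
```

Whitespace inside a field, such as the spaces in "999 wanna Road", is left untouched because the pattern is anchored to the end of the line.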

csvkit is also a good solution.
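
As a rough sketch of the csvkit route, assuming csvkit is installed and its csvformat tool is on the PATH, the -U option selects the output quoting style and the value 1 means "quote all fields". The wrapper name quote_all_csvkit is hypothetical.

```shell
# Hypothetical wrapper around csvkit's csvformat (assumes csvkit is installed).
# -U 1 asks for every field to be quoted on output.
quote_all_csvkit() {
    csvformat -U 1 "$@"
}

# Assumed usage:
# quote_all_csvkit input.csv > final.csv
```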

Answer 1 (score: 0)

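Using GNU awk for FPAT, which describes what a field looks like (here, either a run of non-comma characters or a double-quoted string) instead of what separates fields: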

$ awk -v FPAT='[^,]*|"[^"]*"' -v OFS=',' '
    { for (i=1;i<=NF;i++) {gsub(/^"|"$/,"",$i); $i="\"" $i "\""} }
1' file
"dealerid","address","city","state","zip","vin","stocknumber","type","color","year","make","model","trim","bodystyle","fueltype","mileage","transmission","interiorcolor","interiorfabric","price","titlestatus","warranty","options_text","cylinders","engine","engineaspiration","enginetext","drivetrain","transmissiontext","mpgcity","mpghighway","features_text","vdc_url","images"
"TS06095298","999 wanna Road","Windsor","CT","06095","22HDT13S922218113","298","Used","Red","2002","OLDSMOBILE","BRAVADA","","","","136000","AUTOMATIC","","","2200","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/298","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"
"TS06095298","999 wanna Road","Windsor","CT","06095","22HDT13S922123453","307","Used","Brown","2008","HONDA","599","","","","217538","AUTOMATIC","","","3500","Clear","Available","This vehicle is offered for sale by a verified private seller and features: FREE vehicle history & title report. Original window sticker available. Seller`s identity, email and phone verified. Secure cashless transactions. No cash needed. Pay securely by debit card or ACH. Bill of Sale and receipt issued for completed transactions. Vehicle financing options may be available.","","","","","","","","","","https://www.example.com/listings/211","https://www.example.com/rails/00008.jpg,https://www.example.com/rails/AM00010.jpg"