Add a double quote to a string if it doesn't start with a double quote

Time: 2018-06-04 17:42:25

Tags: awk text-processing command-line-tool

I have a text file like this:

1,a,"some strings in a pair of double quotes"
2,b,"more strings in a pair of double quotes"
3,c,some messy strings with only right half double quotes"
4,d,"more strings in a pair of double quotes"

I tried to use awk with sed to add the missing left double quote to the 3rd line:

function addQuote(input) {
 return '"' + input
}    
BEGIN{
   FS=","
}

{
  if ($3~/^"/) s = $3
  else s = addQuote($3)

  print $1,$2,s
}

It seems that the addQuote function is not working but I don't know how to fix it.

I know that in sed I can easily add a double quote to the beginning of a line by doing sed 's/^/"/' line, but I don't know how to make it work together with awk. Please kindly help. Thanks!

2 answers:

Answer 0: (score: 3)

The following awk may help you here.

awk 'BEGIN{FS=OFS=","} $3 !~ /^"/{$3="\"" $3} 1'  Input_file

OR

awk 'BEGIN{FS=OFS=","} {$3=$3 !~ /^"/?"\"" $3:$3} 1'  Input_file

EDIT: As per Jonathan's comments in the comment section, I am adding the following code, which handles 3 cases: it wraps the 3rd field in double quotes if it has none at all, adds a leading " if only that one is missing, and adds a trailing " if only that one is missing.

Let's say we have the following Input_file:

cat Input_file
1,a,"some strings in a pair of double quotes"
2,b,"more strings in a pair of double quotes"
3,c,some messy strings with only right half double quotes"
4,d,"more strings in a pair of double quotes
4,d,more strings in a pair of double

The following code should cover all 3 cases mentioned above:

awk 'BEGIN{FS=OFS=","} {$3=$3 !~ /\"/?"\"" $3 "\"":($3 !~ /^\"/?"\"" $3:($3 !~ /\"$/?$3 "\"":$3))} 1'  Input_file
1,a,"some strings in a pair of double quotes"
2,b,"more strings in a pair of double quotes"
3,c,"some messy strings with only right half double quotes"
4,d,"more strings in a pair of double quotes"
4,d,"more strings in a pair of double"
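
The nested ternary above can be hard to read. As a minimal sketch, the same three cases could be written as an if/else chain. It assumes the same comma-separated input with no commas inside the 3rd field; the shell wrapper fix_quotes is a hypothetical name used only for illustration.

```shell
# fix_quotes is a hypothetical helper name, not part of the answer above.
# It reads comma-separated lines on stdin and repairs quotes on field 3.
fix_quotes() {
  awk 'BEGIN { FS = OFS = "," }
  {
    if ($3 !~ /"/)        $3 = "\"" $3 "\""   # no quote anywhere: wrap both sides
    else if ($3 !~ /^"/)  $3 = "\"" $3        # opening quote missing
    else if ($3 !~ /"$/)  $3 = $3 "\""        # closing quote missing
  }
  1'
}

printf '%s\n' '3,c,some messy strings"' '4,d,"more strings' '4,d,more strings' | fix_quotes
```

A field that already has both quotes falls through all three branches and is printed unchanged.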

Answer 1: (score: 1)

The problems with the addQuote() function:

function addQuote(input) {
 return '"' + input
}

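are:

  1. The string delimiter is ", not ', so you should use "\"" instead of '"'.
  2. + is an arithmetic operator in awk, so "\"" + input tells awk to convert "\"" and the contents of input to numbers and then add them together. What you want is concatenation, and awk has no specific operator for that: two strings placed side by side are concatenated, e.g. "\"" input.
  3. So if you write your function as: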

    function addQuote(input) {
     return ("\"" input)
    }
    

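    it will do what you want. I added the parens for readability.

    That said, this is probably a better approach, since it covers quotes missing at the front and/or the back and ensures every line is rebuilt from its fields, which matters if you change the OFS value. Borrowing the input and answer from @RavinderSing13: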

    $ awk 'BEGIN{FS=OFS=","} {gsub(/^"|"$/,"",$3); $3="\"" $3 "\""} 1' file
    1,a,"some strings in a pair of double quotes"
    2,b,"more strings in a pair of double quotes"
    3,c,"some messy strings with only right half double quotes"
    4,d,"more strings in a pair of double quotes"
    4,d,"more strings in a pair of double"
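
Putting the points above together, here is a minimal sketch of the script from the question with the function fixed. It assumes the program is passed inline to awk, and that OFS is set to "," as well (the question only set FS, so print would join the fields with spaces). The shell wrapper add_left_quote is a hypothetical name used only for illustration.

```shell
# add_left_quote is a hypothetical wrapper; the awk body is the script
# from the question with two changes, both marked in comments.
add_left_quote() {
  awk '
  function addQuote(input) {
    return ("\"" input)          # changed: concatenation instead of + and single quotes
  }
  BEGIN { FS = OFS = "," }       # changed: OFS set so print keeps the commas
  {
    if ($3 ~ /^"/) s = $3
    else s = addQuote($3)
    print $1, $2, s
  }'
}

printf '%s\n' '2,b,"quoted"' '3,c,unquoted start"' | add_left_quote

# For comparison, + converts both strings to numbers before adding them:
awk 'BEGIN { print "\"" + "abc" }'   # prints 0
```

Like the original script, this only adds a missing leading quote; the gsub() version above also handles a missing trailing quote.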