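Need to print backslashes when preparing MySQL statements

Date: 2016-10-07 05:44:22

Tags: linux bash awk sed escaping

I have a file that needs to be converted into MySQL insert statements. Its format is as follows: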

1
'Jan'
'Jame'
'O\'Leary'
'Bill'
''
NULL
1
'Eddie'
'Eddie'
'Unknown'
NULL
NULL
'John'
NULL
'Joseph'

I tried to prepare the statements with the following code:

COUNTER=1;
echo "insert into database.table values " > full_statements.sql

while read LINE
do
    # There are 16 fields of information per insert statement.
    MODULUS=$(( $COUNTER % 16 ))
    # First piece of information needs to be in format "(value"
    if [ "$COUNTER" -eq 1 ]; then
        printf "("$LINE >> full_statements.sql
    # Last piece of information needs to be in format ",value),"        
    elif [ "$MODULUS" -eq 0 ]; then
        printf ","$LINE")," >> full_statements.sql
    # Interior pieces of information need to be in format ",value"
    else
        printf ","$LINE >> full_statements.sql
    fi

    # If we have a complete insert statement, reset the COUNTER.
    if [ "$COUNTER" -eq 16 ]; then
        COUNTER=1
    else
        ((COUNTER++))
    fi
done < binfile.new

Unfortunately, my result looks like this:

insert into database.table values
(1,'Jan','Jame','O'Leary','Bill','',NULL,1,'Eddie','Eddie','Unknown',NULL,NULL,'John',NULL,'Joseph');

Expected output:

 insert into database.table values
    (1,'Jan','Jame','O\'Leary','Bill','',NULL,1,'Eddie','Eddie','Unknown',NULL,NULL,'John',NULL,'Joseph');

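The backslash I use to escape the quote in the name O'Leary is not printed. I am pulling my hair out trying to get this "\" included in the output, but I have not found an answer yet.

Help! :-)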

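5 answers:

Answer 0 (score: 1)

In awk. Use modulo 16 to detect where a record starts and ends (so files with more than 16 lines are supported), and the ternary operator to insert text before and after each value: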

$ cat program.awk 
{
    printf "%s%s%s",(NR%16==1?"INSERT INTO DATABASE VALUES (":""),$0,(NR%16?",":");"ORS)
}

Run it:

$ awk -f program.awk file
INSERT INTO DATABASE VALUES (1,'Jan','Jame','O\'Leary','Bill','',NULL,1,'Eddie','Eddie','Unknown',NULL,NULL,'John',NULL,'Joseph');
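
A minimal sketch of the same idea with the group size passed in as a parameter, which makes the "more than 16 lines" behaviour easy to try on a small file. The function name group_inserts and its arguments are hypothetical, not part of the answer above. A trailing partial group would be left with a dangling comma, so the input is assumed to hold a whole number of records.

```shell
#!/usr/bin/env bash
# group_inserts FILE [N]: hypothetical wrapper around the awk one-liner above.
# Emits one INSERT per N input lines (default 16); awk copies each line
# verbatim, so backslashes such as the one in 'O\'Leary' are preserved.
group_inserts() {
    awk -v n="${2:-16}" '
    {
        first = (NR % n == 1 || n == 1)   # first line of a record
        last  = (NR % n == 0)             # last line of a record
        printf "%s%s%s", (first ? "INSERT INTO DATABASE VALUES (" : ""), $0, (last ? ");" ORS : ",")
    }' "$1"
}
```

For the file in the question, group_inserts binfile.new would be expected to behave like the program above.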

Answer 1 (score: 0)

From help read:

 -r       do not allow backslashes to escape any characters
$ read foo <<< '12\3'
$ echo "$foo"
123
$ read -r foo <<< '12\3'
$ echo "$foo"
12\3
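
Applied to the loop in the question, this could look like the sketch below. It is an illustration under assumptions rather than a drop-in fix: the function name build_insert is hypothetical, the insert header is repeated for every record so that each statement stands alone, and the values go through a fixed '%s' format so printf does not interpret backslashes or percent signs in the data either.

```shell
#!/usr/bin/env bash
# build_insert FILE [N]: hypothetical rewrite of the question's loop.
# Reads FILE line by line and emits one insert statement per N lines (default 16).
build_insert() {
    local infile=$1 per_row=${2:-16} pos=0 line
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    while IFS= read -r line; do
        pos=$(( pos + 1 ))
        if [ "$pos" -eq 1 ]; then
            # First value of a record: header plus "(value"
            printf 'insert into database.table values\n(%s' "$line"
        else
            # Remaining values: ",value"
            printf ',%s' "$line"
        fi
        if [ "$pos" -eq "$per_row" ]; then
            # Record complete: close it and start counting again.
            printf ');\n'
            pos=0
        fi
    done < "$infile"
}
```

Usage would be along the lines of build_insert binfile.new > full_statements.sql.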

Answer 2 (score: 0)

This feels like a job for pr.

Suppose the input file contains 5 lines and you want to group them 2 elements at a time:

$ seq 5 | pr -2ats, | sed 's/.*/\t(&);/'
    (1,2);
    (3,4);
    (5);
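  • seq 5 generates sample input of 5 lines
  • pr -2ats, prints at most 2 fields per line, separated by ,
  • sed 's/.*/\t(&);/' post-processes the pr output to add () around the text, ; at the end of the line, and a tab at the start of the line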

So, finally, for your use case it would be something like this:

$ ( echo 'insert into database.table values' ; pr -16ats, binfile.new | sed 's/.*/\t(&);/' ) > full_statements.sql
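
With more than one record, the command above writes a single header followed by several (...); rows. If each row should instead be a statement of its own, a variant along these lines might help. It is only a sketch: the function name pr_inserts is hypothetical, and it assumes a pr that accepts the column count and the -s, separator as separate options, as GNU pr does.

```shell
#!/usr/bin/env bash
# pr_inserts FILE [N]: hypothetical helper, one insert statement per N lines.
# pr joins N consecutive lines with commas; sed wraps each joined row.
# Neither tool interprets backslashes in the data, so 'O\'Leary' survives.
pr_inserts() {
    local n=${2:-16}
    pr "-${n}" -a -t -s, "$1" |
        sed 's/.*/insert into database.table values (&);/'
}
```

Usage would be pr_inserts binfile.new > full_statements.sql.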

Answer 3 (score: 0)

This should be a bit simpler. Aside from the refactoring, the actual fix is to use the -r option with read to prevent backslashes in the input from being interpreted. Using IFS= is good practice: it ensures each line is read literally, without stripping any leading or trailing whitespace.

# It's easier to just hard-code the 16 %s rather than build it dynamically.
fmt="insert into database.table values (
%s, %s, %s, %s,
%s, %s, %s, %s,
%s, %s, %s, %s,
%s, %s, %s, %s)\n"

# Read 16 lines, putting each line in a global array
# Return 1 if any read fails
read16 () {
    values=()
    for ((i=0; i<16; i++)); do
        IFS= read -r line || return 1
        values+=("$line")
    done
}

declare -a values
while read16; do
  printf "$fmt" "${values[@]}"
done < binfile.new > full_statements.sql

Note that if you put the single quotes in the format string:

fmt="insert ... ('%s', '%s', ...)\n"

then you don't need to include them in the data file.

Answer 4 (score: 0)

$ awk -v OFS=',' -v ORS=');\n' -v m=16 '{n=(NR-1)%m+1; f[n]=$0} n==m{printf "insert blah\n   ("; for (i=1;i<=m;i++) printf "%s%s", f[i], (i<m ? OFS : ORS)}' file
insert blah
   (1,'Jan','Jame','O\'Leary','Bill','',NULL,1,'Eddie','Eddie','Unknown',NULL,NULL,'John',NULL,'Joseph');