Question

I have hundreds of files containing lines like the one below. In my sqlldr setup, ' is the text qualifier, and my files are rejected because of text like Wegman's, which contains an apostrophe in the text itself.

Using sed/awk, is there a way to find such strings and replace 's with a ` (backtick) or something similar?

't2.txt';'20160707071500';'478251533';'TWN';'20160620160801';'1';'691891-2';'2';'0';'Employer';'1';'OMCProcessed';'Wegman's Food Market';'';'Wegman's Food Markets';'14411364807'

One solution I thought of is to find text that is not equal to '; but I'm not sure how to put that to use.

Answer 1

Perhaps sed is the better choice here:

$ sed -r 's/([^;])(\x27)([^;])/\1\2\2\3/g' file

't2.txt';'20160707071500';'478251533';'TWN';'20160620160801';'1';'691891-2';'2';'0';'Employer';'1';'OMCProcessed';'Wegman''s Food Market';'';'Wegman''s Food Markets';'14411364807'
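
As a rough sketch of how the command above might be wrapped for reuse, the functions below apply the same pattern: one doubles embedded apostrophes, the other swaps them for a backtick as the question suggested. The function names `double_inner_quotes` and `tick_inner_quotes` are made up for illustration, and GNU sed is assumed (for `-r` and `\x27`).

```shell
# Sketch only: assumes GNU sed; the function names are hypothetical.

# Double an apostrophe that has a non-';' character on both sides,
# i.e. one that does not sit next to a field delimiter.
double_inner_quotes() {
    sed -r 's/([^;])(\x27)([^;])/\1\2\2\3/g' "$@"
}

# Same match, but replace the embedded apostrophe with a backtick.
tick_inner_quotes() {
    sed -r 's/([^;])\x27([^;])/\1`\2/g' "$@"
}

# Prints: 'OMCProcessed';'Wegman''s Food Market';'';'14411364807'
printf "%s\n" "'OMCProcessed';'Wegman's Food Market';'';'14411364807'" | double_inner_quotes
```

Because sed matches do not overlap, two apostrophes separated by a single character (as in rock'n'roll) would only have the first one replaced, so it may be worth checking the data for such cases before relying on this pattern.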

Answer 2

The usual way to escape single quotes in SQL is to double them, but you can modify the call to gsub to replace them with whatever you like.

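There may be a better way to do this, but here I simply strip the enclosing quotes from each field, replace the inner quotes, and then assign the result back to the original field with the enclosing quotes added again.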

$ cat m.txt
't2.txt';'20160707071500';'478251533';'TWN';'20160620160801';'1';'691891-2';'2';'0';'Employer';'1';'OMCProcessed';'Wegman's Food Market';'';'Wegman's Food Markets';'14411364807'

$ cat m.awk
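BEGIN { FS=OFS=";" }    # fields are separated by semicolons
{
    for (i=1; i<=NF; ++i) {
        f = substr($i,2,(length($i) - 2))   # strip the enclosing quotes
        gsub("'", "''", f)                  # double any quote left inside
        $i = "'" f "'"                      # put the enclosing quotes back
    }
}1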

$ awk -f m.awk m.txt
't2.txt';'20160707071500';'478251533';'TWN';'20160620160801';'1';'691891-2';'2';'0';'Employer';'1';'OMCProcessed';'Wegman''s Food Market';'';'Wegman''s Food Markets';'14411364807'
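
Since the question mentions hundreds of files, here is a minimal sketch of the same per-field logic inlined into a shell function, so no separate m.awk file is needed. The quote character is passed in with `-v q="'"` to avoid escaping it inside the single-quoted awk program. The function name `escape_fields`, the `./data` directory, and the `.fixed` suffix are all assumptions for illustration.

```shell
# Sketch only: escape_fields is a hypothetical name; the logic mirrors m.awk.
escape_fields() {
    awk -v q="'" 'BEGIN { FS = OFS = ";" }
    {
        for (i = 1; i <= NF; ++i) {
            f = substr($i, 2, length($i) - 2)   # strip the enclosing quotes
            gsub(q, q q, f)                     # double any quote left inside
            $i = q f q                          # put the enclosing quotes back
        }
    } 1' "$@"
}

# Hypothetical batch run, writing each result next to its input:
# for f in ./data/*.txt; do escape_fields "$f" > "$f.fixed"; done

# Prints: 'rock''n''roll';'';'x'
printf "%s\n" "'rock'n'roll';'';'x'" | escape_fields
```

Because every field is processed on its own, quotes that sit close together inside a field are all doubled. The sketch does assume that no field contains a literal `;`, since that would be treated as a field separator.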

Find and replace a string that doesn't match a specific pattern
