I'm using Mac 10.9.5, the bash shell, and Perl 5, version 16, subversion 3 (v5.16.3). I have the following script:
#!/bin/bash
perl -pi -e "s/([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?),([^,]+?)/REPLACE INTO student (ID, SIS_ID, STUDENT_NUM, USER_ID, OTHER_USER_ID) VALUES (REPLACE(uuid(), '-', ''), '\$24', '\$26', '\$2', '\$27');/g" $1
However, when I run the script against a file:
sh myscript.sh ~/Downloads/myfile.csv
it only runs against the first line of the file rather than every line, even though the file has thousands of lines:
davea$ wc -l ~/Downloads/myfile.csv
91552 /Users/davea/Downloads/myfile.csv
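How can I adjust the above so that the search and replace is applied to every line of the file?
Edit: Here is a sample of the file I'm passing in as input: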
app.app.first_name,app.app.id,app.app.last_name,app.app.max_time,app.app.url,app.app.user_name,thirdparty.created,thirdparty.district,thirdparty.dob,thirdparty.ell_status,thirdparty.email,thirdparty.frl_status,thirdparty.gender,thirdparty.grade,thirdparty.hispanic_ethnicity,thirdparty.iep_status,thirdparty.last_modified,thirdparty.location.zip,thirdparty.name.first,thirdparty.name.last,thirdparty.name.middle,thirdparty.race,thirdparty.school,thirdparty.sis_id,thirdparty.state_id,thirdparty.student_number,thirdparty.id,matchmaker_result
FirstName,0040FBA053464647BD51141EECF4437F,LastName,2014-09-15 20:46:11,cityunifiedca.springboardonline.org,mlastname,2014-04-04T23:03:29.916Z,51e76ab1d93412f47b000c32,6/12/2000,,,Paid,F,10,Y,Y,2015-08-19T21:33:13.989Z,90033-1803,FIRSTNAME,LASTNAME,A,Caucasian,51f811478a86244d2900033f,061200F010,6124939964,061200F010,533f3a412a1f1fea24c8e164,match
Here is the output from running the above:
REPLACE INTO student (ID, SIS_ID, STUDENT_NUM, USER_ID, OTHER_USER_ID) VALUES (REPLACE(uuid(), '-', ''), 'thirdparty.sis_id', 'thirdparty.student_number', 'app.app.id', 'thirdparty.id');atchmaker_result
FirstName,0040FBA053464647BD51141EECF4437F,LastName,2014-09-15 20:46:11,cityunifiedca.springboardonline.org,mlastname,2014-04-04T23:03:29.916Z,51e76ab1d93412f47b000c32,6/12/2000,,,Paid,F,10,Y,Y,2015-08-19T21:33:13.989Z,90033-1803,FIRSTNAME,LASTNAME,A,Caucasian,51f811478a86244d2900033f,061200F010,6124939964,061200F010,533f3a412a1f1fea24c8e164,match
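Answer 0 (score: 2)
Supply the path to the input file as the first command-line argument.
Note: the array indices may be off, because I simply took your regex match variables and shifted them down by one (i.e., I have not tested this code).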
use strict;
use warnings;
use Text::CSV;
my $csv = Text::CSV->new({ binary => 1 }) or die Text::CSV->error_diag;
open(my $fh, '<', $ARGV[0]) or die $!;
while (my $row = $csv->getline($fh)) {
print "REPLACE INTO student (ID, SIS_ID, STUDENT_NUM, USER_ID, OTHER_USER_ID) VALUES (REPLACE(uuid(), '-', ''), '$row->[23]', '$row->[25]', '$row->[1]', '$row->[26]');\n";
}
$csv->eof or $csv->error_diag;
close($fh);
Answer 1 (score: 1)
Let's first fix the script up as a Perl script; one-liners are for the command line.
#!/usr/bin/perl
# example code from `man perlrun`
use warnings;
use strict;
my $extension = '.orig';
my $oldargv = ''; # start empty so the first "ne" comparison does not warn
my $backup;
my $subre = "([^,]+?)";
my $bigre = "$subre," x 27 . $subre;
my $presub = "REPLACE INTO student (ID, SIS_ID, STUDENT_NUM, USER_ID, OTHER_USER_ID) VALUES (REPLACE(uuid(), '-', '')";
LINE: while (<>) {
if ($ARGV ne $oldargv) {
if ($extension !~ /\*/) {
$backup = $ARGV . $extension;
} else {
($backup = $extension) =~ s/\*/$ARGV/g;
}
rename($ARGV, $backup);
open(ARGVOUT, ">$ARGV");
select(ARGVOUT);
$oldargv = $ARGV;
}
s/$bigre/$presub, '$24', '$26', '$2', '$27');/g; # no backslash before $: that was only needed inside bash double quotes
} continue {
print; # this prints to original filename
}
select(STDOUT);
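Then, looking at that regex, there is probably a line with an empty field (,,), so... you could fix the regex, but using one at all is the wrong approach. Let's change the substitution line from above to: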
my @f = split /,/;
# $_ is replaced entirely, so add the newline back explicitly
$_ = $presub . ", '$f[23]', '$f[25]', '$f[1]', '$f[26]');\n";
This assumes no field contains a comma as part of a quoted or escaped field. For all of that, use Text::CSV, as shown by Matt Jacob. I have similar caveats.
Or, if you must, you can stick with the regex, but drop the g modifier, anchor it to the line, and allow empty capture groups.
s/^([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?),([^,]*?)$/REPLACE INTO student (ID, SIS_ID, STUDENT_NUM, USER_ID, OTHER_USER_ID) VALUES (REPLACE(uuid(), '-', ''), '\$24', '\$26', '\$2', '\$27');/;
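On regex101.com this does not time out, and it captures the fields correctly for the sample input when given the mg flags, provided you remove the escaping from the $ in the replacement.
Or modify the first script above, changing these lines: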
my $subre = "([^,]*?)";
my $bigre = '^' . "$subre," x 27 . $subre . '$';
...
s/$bigre/$presub, '$24', '$26', '$2', '$27');/; # no backslash before $ inside a Perl script
Answer 2 (score: 1)
Your s/// only seems to match the first line. Not sure why. Still, that is a ridiculous regex. You want to split on commas into a list:
perl -F, -lane '
BEGIN { $t="REPLACE INTO student (ID, SIS_ID, STUDENT_NUM, USER_ID, OTHER_USER_ID) VALUES (REPLACE(uuid(), \047-\047, \047\047), \047%s\047, \047%s\047, \047%s\047, \047%s\047);\n"; }
printf $t, $F[23], $F[25], $F[1], $F[26];
' file
REPLACE INTO student (ID, SIS_ID, STUDENT_NUM, USER_ID, OTHER_USER_ID) VALUES (REPLACE(uuid(), '-', ''), 'thirdparty.sis_id', 'thirdparty.student_number', 'app.app.id', 'thirdparty.id');
REPLACE INTO student (ID, SIS_ID, STUDENT_NUM, USER_ID, OTHER_USER_ID) VALUES (REPLACE(uuid(), '-', ''), '061200F010', '061200F010', '0040FBA053464647BD51141EECF4437F', '533f3a412a1f1fea24c8e164');