I am using a Perl script to split a big XML file into small chunks. I have looked at this link: Split file by XML tag
My code looks like this:
if($line =~ /^</row>/)
{
$count++;
}
But I get this error:
works\filesplit.pl line 20.
Bareword found where operator expected at E:\Work\perl works\filesplit.pl line 20, near "/^</row"
(Missing operator before row?)
syntax error at E:\Work\perl works\filesplit.pl line 20, near "/^</row"
Search pattern not terminated at E:\Work\perl works\filesplit.pl line 20.
Can anyone help me?
Update: here is my sample data.
<row>
<date></date>
<ForeignpostingId />
<country>11</country>
<domain>http://www.xxxx.com</domain>
<domainid>20813</domainid>
</row>
<row>
<date></date>
<ForeignpostingId />
<country>11</country>
<domain>http://www.xxxx.com</domain>
<domainid>20813</domainid>
</row>
<row>
<date></date>
<ForeignpostingId />
<country>11</country>
<domain>http://www.xxxx.com</domain>
<domainid>20813</domainid>
</row>
Answer 0: (score: 3)
Answer 1: (score: 2)
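If you are trying to match </row> at the beginning of the line, you need ^<\/row>. The forward slash has to be escaped because / is also the delimiter of the pattern; unescaped, it ends the regex early, which is what causes the "Search pattern not terminated" error. Here is my test code.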
#!/usr/bin/perl
use strict;
use warnings;
my $line = "</row> something";
if ($line =~ /^<\/row>/)
{
print "found a match \n";
}
Output:
# perl test.pl
found a match
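As an alternative to escaping the slash, Perl lets you pick a different pattern delimiter with the m operator. The following is only a minimal sketch of that idea; the helper name starts_with_row_close is made up for illustration and is not part of the original answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the line starts with the closing
# </row> tag. m{...} uses braces as the pattern delimiters, so the "/"
# inside the tag needs no backslash.
sub starts_with_row_close {
    my ($line) = @_;
    return $line =~ m{^</row>} ? 1 : 0;
}

for my $line ("</row> something", "<row>") {
    my $result = starts_with_row_close($line) ? "found a match" : "no match";
    print "$result\n";
}
```

Both /^<\/row>/ and m{^</row>} describe the same pattern; the second form is often easier to read when the text being matched contains slashes.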
Update
Posting this update after the OP provided sample data.
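You need ^\s*<\/row> in your regex, because not every closing tag starts at the beginning of the line; some of them have one space in front of them. So we need to match zero or more whitespace characters at the start of the line before the actual match. (Note that \s+ would require at least one whitespace character and would miss tags that start in the first column.)
Code:
#!/usr/bin/perl -w
use strict;
use warnings;
while (my $line = <DATA>)
{
    # \s* allows optional leading whitespace before the closing tag
    if ($line =~ /^\s*<\/row>/)
    {
        print "found a match \n";
    }
}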
__DATA__
<row>
<date></date>
<ForeignpostingId />
<country>11</country>
<domain>http://www.xxxx.com</domain>
<domainid>20813</domainid>
</row>
<row>
<date></date>
<ForeignpostingId />
<country>11</country>
<domain>http://www.xxxx.com</domain>
<domainid>20813</domainid>
</row>
<row>
<date></date>
<ForeignpostingId />
<country>11</country>
<domain>http://www.xxxx.com</domain>
<domainid>20813</domainid>
</row>
Output:
# perl test.pl
found a match
found a match
found a match
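To connect this back to the question's $count++, here is a rough sketch of how counting closing tags could drive the actual splitting. It assumes every </row> sits on its own line with optional leading whitespace, as in the sample data. The function name group_rows and the rows-per-chunk parameter are hypothetical, and the sketch collects chunks in memory rather than writing files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: group lines into chunks that hold at most
# $rows_per_chunk <row> elements each. A chunk is closed whenever the
# running count of closing tags reaches a multiple of the limit.
sub group_rows {
    my ($rows_per_chunk, @lines) = @_;
    my @chunks;
    my $current = '';
    my $count   = 0;
    for my $line (@lines) {
        $current .= $line;
        if ($line =~ /^\s*<\/row>/) {
            $count++;
            if ($count % $rows_per_chunk == 0) {
                push @chunks, $current;
                $current = '';
            }
        }
    }
    push @chunks, $current if length $current;   # leftover rows, if any
    return @chunks;
}

# Build five small rows as sample input, then split them two per chunk.
my @lines;
push @lines, "<row>\n", " <id>$_</id>\n", " </row>\n" for 1 .. 5;
my @chunks = group_rows(2, @lines);
printf "%d chunks\n", scalar @chunks;
```

For a really big file you would print each chunk to its own output file as soon as it is complete instead of keeping all chunks in an array.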
Answer 2: (score: 2)
Perhaps the following will be helpful:
use strict;
use warnings;
my $i = 1;
local $/ = '<row>';
while (<>) {
chomp;
s!</row>!! or next;
open my $fh, '>', 'File_' . ( sprintf '%05d', $i++ ) . '.xml' or die $!;
print $fh $_;
}
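Usage: perl script.pl inFile.xml
This sets Perl's input record separator $/ to <row>, so that the XML file is read in "chunks" delimited by <row>. chomp strips the trailing <row> separator from each chunk, the substitution removes </row> (chunks without a </row> are skipped), and the chunk is then written out to a file using a "File_nnnnn.xml" naming scheme.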
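To see what the record-separator approach yields without creating any files, here is a small sketch that reads from an in-memory string instead of a file. The function name row_chunks and the sample XML are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the bodies of the <row> elements in $xml,
# read with '<row>' as the input record separator.
sub row_chunks {
    my ($xml) = @_;
    open my $in, '<', \$xml or die "Cannot open in-memory handle: $!";
    local $/ = '<row>';                  # each read now ends at the next "<row>"
    my @chunks;
    while (my $chunk = <$in>) {
        chomp $chunk;                    # removes the trailing "<row>", not "\n"
        $chunk =~ s!</row>!! or next;    # skip text that holds no complete row
        push @chunks, $chunk;
    }
    close $in;
    return @chunks;
}

my $xml = "<rows>\n<row><id>1</id></row>\n<row><id>2</id></row>\n</rows>\n";
my @chunks = row_chunks($xml);
print scalar(@chunks), " chunks\n";
```

One thing this makes visible: whatever follows the last </row> (here the closing </rows> tag) stays in the final chunk, so you may want to trim it if the output files must contain only row content.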
Answer 3: (score: 0)
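#!/bin/perl -w
## Extract the first <tag>...</tag> block from an XML file
use strict;
use warnings;
print "Input File ? ";
chomp(my $XmlFile = <STDIN>);
open my $XmlFileHandle, '<', $XmlFile or die "Cannot open $XmlFile: $!";
print "\nSplit By which Tag ? ";
chomp(my $splitby = <STDIN>);
open my $OutputHandle, '>', 'OutputFile_' . $splitby or die "Cannot open OutputFile_$splitby: $!";
## Skip ahead to the opening tag, e.g. <user>
while (<$XmlFileHandle>) {
    # \Q...\E quotes the tag name so it is matched literally
    if (/<\Q$splitby\E>/) {
        print $OutputHandle "<$splitby>\n";
        last;
    }
}
## Copy lines until the closing tag, e.g. </user>
while (my $line = <$XmlFileHandle>) {
    if ($line =~ m/<\/\Q$splitby\E>/) {
        print $OutputHandle "</$splitby>";
        last;
    }
    print $OutputHandle $line;
}
close $OutputHandle or die "Cannot close OutputFile_$splitby: $!";
print "\nOutput File is : OutputFile_$splitby\n";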