I am using XML::Twig with Perl to extract some attributes from an XML file.
This is my code:
use strict;
use warnings;
use XML::Twig;

my $file = $ARGV[0];
# Derive the output name by swapping the .xml extension for .snp
$file =~ /(.+)\.xml/ or die "Expected a .xml file name, got '$file'";
my $outfile = $1 . ".snp";
open my $out, '>', $outfile or die "Could not open file '$outfile' $!";

my $twig = XML::Twig->new(
    twig_handlers => {
        'Rs/MergeHistory' => \&MergeHistory,
    }
);
$twig->parsefile($file);

# Called once per MergeHistory element: prints "rs<rsId>,b<buildId>,"
sub MergeHistory {
    my ($twig, $elt) = @_;
    print $out "\t";
    print $out "rs";
    print $out $elt->att('rsId'), ",";
    print $out "b";
    print $out $elt->att('buildId'), ",";
}
This prints the following result:
rs56546490,b130, rs386588736,b142
What I want is to print the rsId values of every MergeHistory together, followed by the buildId values together, as follows:
rs56546490,rs386588736, b130,b142
The relevant part of the XML file contains two MergeHistory tags.
Answer 0 (score: -1)
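twig_handlers is useful for preprocessing XML, and especially for discarding parts of it.
That may not be what you want here. It looks like what you are trying to do is collect the rsId and buildId attributes of the MergeHistory elements under each Rs element and print them grouped by attribute.
With that in mind, I think what you probably want is findnodes and children.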
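use strict;
use warnings;
use XML::Twig;

my $file = $ARGV[0];
# Parse the whole document first, then walk the resulting tree
my $twig = XML::Twig->new->parsefile($file);
foreach my $rs ( $twig->findnodes('//Rs') ) {
    my @merges = $rs->children('MergeHistory');
    # rs-prefixed rsId values, comma-separated, followed by a tab
    print join( ",", map { "rs" . $_->att('rsId') } @merges ), "\t";
    # b-prefixed buildId values, comma-separated, followed by a newline
    print join( ",", map { "b" . $_->att('buildId') } @merges ), "\n";
}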
Based on your sample, this prints:
rs56546490,rs386588736	b130,b142
Which looks like roughly what you want?
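It uses findnodes to iterate over the Rs elements, children to get the MergeHistory elements under each one, map to extract each attribute and prepend the rs or b string, and join to combine the results with commas.
(If you prefer, you can still do the above with twig_handlers, by triggering a handler on "Rs".)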
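As a rough sketch of that twig_handlers variant: the handler below fires once per Rs element, after its children have been parsed, and builds the same grouped line. The inline XML is a hypothetical sample reconstructed from the attribute names and values quoted in the question (the Root wrapper element in particular is an assumption), so the real file will look different.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample: only the Rs/MergeHistory nesting and the
# rsId/buildId attributes come from the question; "Root" is made up.
my $xml = <<'XML';
<Root>
  <Rs>
    <MergeHistory rsId="56546490" buildId="130"/>
    <MergeHistory rsId="386588736" buildId="142"/>
  </Rs>
</Root>
XML

my @lines;    # one formatted line per Rs element

my $twig = XML::Twig->new(
    twig_handlers => {
        # Runs when the closing tag of each Rs element is reached,
        # so all of its MergeHistory children are already available.
        'Rs' => sub {
            my ( $t, $rs ) = @_;
            my @merges = $rs->children('MergeHistory');
            push @lines,
                join( ",", map { "rs" . $_->att('rsId') } @merges ) . "\t"
              . join( ",", map { "b" . $_->att('buildId') } @merges );
            $t->purge;    # free the part of the tree already processed
        },
    },
);
$twig->parse($xml);

print "$_\n" for @lines;
```

With this sample it prints a single line, rs56546490,rs386588736 and b130,b142 separated by a tab. The purge call is the main reason to prefer a handler over findnodes: for a large file, each Rs element is released as soon as it has been handled instead of the whole tree being held in memory.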