I have a text file (1.txt) that contains the following information, in this format:
{
"ip": "X.X.XX.8",
"hostname": "No Hostname",
"city": "Kuala Terengganu",
"region": "Terengganu",
"country": "MY",
"loc": "5.3302,103.1408",
"org": "AS4788 TM Net, Internet Service Provider"
}{
"ip": "X.X.XX.143",
"hostname": "No Hostname",
"city": "Kuantan",
"region": "Pahang",
"country": "MY",
"loc": "3.8077,103.3260",
"org": "AS4788 TM Net, Internet Service Provider"
}{
"ip": "X.X.XXX.76",
"hostname": "No Hostname",
"city": "Kuching",
"region": "Sarawak",
"country": "MY",
"loc": "1.5310,110.3442",
"org": "AS4788 TM Net, Internet Service Provider",
"postal": "93700"
}{
"ip": "X.X.XX.158",
"hostname": "No Hostname",
"city": "Seoul",
"region": "Seoul-t'ukpyolsi",
"country": "KR",
"loc": "37.5985,126.9783",
"org": "AS17839 DreamcityMedia"
}{
"ip": "XX.XXX.X.87",
"hostname": "No Hostname",
"city": "Surat",
"region": "Gujarat",
"country": "IN",
"loc": "20.9667,72.9000",
"org": "AS45528 Tikona Digital Networks Pvt Ltd."
}{
"ip": "XXX.XX.XXX.134",
"hostname": "No Hostname",
"city": "Bhandup",
"region": "Maharashtra",
"country": "IN",
"loc": "19.1500,72.9333",
"org": "AS45528 Tikona Digital Networks Pvt Ltd."
}{
I wrote the following Perl code so that I can write the data out to a comma-separated file:
use FileHandle;
use strict;
main();
sub main() {
    my $line_numbers = "";
    my $num_matches = 0;
    my $first_match = "";
    my $count = 0;
    my $resource_location = "1.txt";
    my $output_fh = FileHandle->new("> 2.txt");
    open(FILE, "<", $resource_location) or die "cannot open < $resource_location: $!";
    my $output_str = "";
    foreach my $line (<FILE>) {
        $count++;
        my ($ip) = $line =~ /"ip=([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})"/;
        my ($hostname) = $line =~ /"hostname:?([^"\s]+)"/;
        my ($city) = $line =~ /"city:?([^"\s]+)"/;
        my ($region) = $line =~ /"region:?([^"\s]+)"/;
        my ($country) = $line =~ /"country:?([^"\s]+)"/;
        my ($org) = $line =~ /"org:?([^"\s]+)"/;
        print $output_fh "$ip,$hostname,$city,$region,$country,$org\n";
    }
    print "$count rows processed\n";
    close FILE;
    $output_fh->close;
}
When I run the script, all I get is commas:
,,,,,
,,,,,
,,,,,
,,,,,
,,,,,
,,,,,
Expected output:
"X.X.XX.8","No Hostname","Kuala Terengganu","Terengganu","MY","AS4788 TM Net, Internet Service Provider"
"X.X.XX.143","No Hostname","Kuantan","Pahang","MY","AS4788 TM Net, Internet Service Provider"
"X.X.XXX.76","No Hostname","Kuching","Sarawak","MY","AS4788 TM Net, Internet Service Provider"
"X.X.XX.158","No Hostname","Seoul","Seoul-t'ukpyolsi","KR","AS17839 DreamcityMedia"
"XX.XXX.X.87","No Hostname","Surat","Gujarat","IN","AS45528 Tikona Digital Networks Pvt Ltd."
What am I missing?
Answer 0 (score: 2)
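Ack! Using an actual JSON parser to parse JSON is easier than trying to hack together a fragile, error-prone solution!
OK, you don't actually have a JSON file, but rather a bunch of JSON documents placed end to end. That's not a problem, though; the incremental parser of JSON::XS (incr_parse) can handle it.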
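use strict;
use warnings;
use open ':std', ':encoding(UTF-8)';
use JSON::XS qw( );
use Text::CSV_XS qw( );
my $json_parser = JSON::XS->new();
# eol => "\n" makes print() end every row with a newline.
my $csv_formatter = Text::CSV_XS->new({ binary => 1, auto_diag => 1, eol => "\n" });
# Slurp each input file whole, then extract every JSON object it contains.
while ( my $file = do { local $/; <> } ) {
    for my $obj ( $json_parser->incr_parse($file) ) {
        # Hash slice: pick the wanted fields in output order.
        my @row = @$obj{qw( ip hostname city region country org )};
        $csv_formatter->print(\*STDOUT, \@row);
    }
}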
Usage:
myparser.pl input.json >output.csv
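If JSON::XS and Text::CSV_XS are not installed, the same idea can be sketched with the core JSON::PP module, which also provides incr_parse. This is only a rough sketch, not a replacement for a real CSV writer: the helper name records_to_csv and the hand-rolled quoting (every field quoted, embedded quotes doubled, missing fields left empty) are assumptions made for illustration, chosen so the rows resemble the expected output in the question.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: turn concatenated JSON objects into quoted CSV lines.
sub records_to_csv {
    my ($text) = @_;
    my @fields = qw( ip hostname city region country org );
    my @lines;
    # In list context, incr_parse returns every complete object it can find.
    for my $obj ( JSON::PP->new->incr_parse($text) ) {
        my @row = map {
            my $v = defined $obj->{$_} ? $obj->{$_} : '';   # missing key -> empty field
            $v =~ s/"/""/g;                                  # CSV escapes a quote by doubling it
            qq("$v");
        } @fields;
        push @lines, join(',', @row);
    }
    return @lines;
}

# Two objects placed end to end, as in 1.txt (sample values only).
my $sample = '{"ip": "X.X.XX.8", "hostname": "No Hostname", "city": "Kuala Terengganu", '
           . '"region": "Terengganu", "country": "MY", "loc": "5.3302,103.1408", '
           . '"org": "AS4788 TM Net, Internet Service Provider"}'
           . '{"ip": "X.X.XX.158", "hostname": "No Hostname", "city": "Seoul", '
           . '"region": "Seoul-t\'ukpyolsi", "country": "KR", "org": "AS17839 DreamcityMedia"}';
print "$_\n" for records_to_csv($sample);
```

Fields that are not listed in @fields, such as loc and postal, are simply ignored, so records with extra keys need no special handling.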
Answer 1 (score: 0)
Try this. I don't know JSON or the JSON modules, but this simple code gives the output you expect.
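use warnings;
use strict;
# Open the input with a lexical filehandle; "file" stands for your input path.
open(my $data, '<', "file") or die "cannot open file: $!";
# Every record ends with "}", so use it as the input record separator.
$/ = "}";
my @ar = <$data>;
close $data;
foreach (@ar) {
    my @xz = split("\n", $_);
    my @ddta;
    foreach my $v (@xz) {
        # The pattern matches the whole line, so split returns an empty
        # leading field followed by the text captured after the last colon.
        my @xvz = split(/.*\:(.*)/, $v);
        next unless defined $xvz[1];    # skip lines without "key: value"
        (my $value = $xvz[1]) =~ s/^\s+|,\s*$//g;    # trim space and trailing comma
        push(@ddta, $value);
    }
    print join(',', @ddta), "\n" if @ddta;
}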
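Each record in your data ends with }, so I set $/ = "}". That splits the data into an array, one record per element. Try print $ar[0] to see a single record.
Then split(/.*\:(.*)/,$v) captures the content that follows the : on each line into @xvz. The values are collected in @ddta and printed after the inner foreach loop has finished.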
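To see the two steps described above in isolation, here is a small sketch that runs them on an in-memory string instead of a file. The helper name values_per_record and the sample text are made up for illustration. It assumes every key/value pair sits on its own line, as in 1.txt, and that the values contain no colon, because the greedy .* captures only what follows the last colon on a line.

```perl
use strict;
use warnings;

# Hypothetical helper: split concatenated records on "}" and pull out the values.
sub values_per_record {
    my ($text) = @_;
    # Reading from an in-memory filehandle keeps the sketch self-contained.
    open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
    local $/ = "}";               # each read from $fh now ends at a "}"
    my @records = <$fh>;
    close $fh;

    my @rows;
    for my $record (@records) {
        my @values;
        for my $line (split /\n/, $record) {
            # A pattern that matches the whole line makes split return
            # an empty leading field plus the captured text.
            my @parts = split /.*:(.*)/, $line;
            next unless defined $parts[1];                # no colon on this line
            (my $value = $parts[1]) =~ s/^\s+|,\s*$//g;   # trim space and trailing comma
            push @values, $value;
        }
        push @rows, \@values if @values;
    }
    return @rows;
}

my $sample = qq({\n"ip": "X.X.XX.8",\n"city": "Kuantan"\n}{\n"ip": "X.X.XX.143",\n"city": "Seoul"\n});
print join(',', @$_), "\n" for values_per_record($sample);
```

Unlike the JSON parser approach, this keeps every value it finds, including loc and postal, so unwanted columns would still have to be filtered out afterwards.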