Extract primary and secondary numbers for each ID and generate a JSON string

Time: 2018-04-24 06:45:49

Tags: bash shell perl sed jq

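I have two files, shown below. In general I will have more than three client IDs, with roughly 50-60 primary and secondary numbers for each ID.

primary.txt

{1=[0, 273, 546, 819], 2=[274, 1, 820], 3=[1016, 275, 821]}

This means client ID 1 has the primary numbers [0, 273, 546, 819], and similarly for the other IDs.

secondary.txt

{1=[342, 1102, 608, 684], 2=[115, 191, 837, 1559], 3=[1256, 116]}

This means client ID 1 has the secondary numbers [342, 1102, 608, 684], and similarly for the other IDs.

I need to read both files, combine the primary and secondary numbers for each ID, and create a JSON string like the following for each ID.

For client ID 1

{"text":"for client id one.","pri":[0, 273, 546, 819],"sec":[342, 1102, 608, 684]}

For client ID 2

{"text":"for client id two.","pri":[274, 1, 820],"sec":[115, 191, 837, 1559]}

For client ID 3

{"text":"for client id three.","pri":[1016, 275, 821],"sec":[1256, 116]}

Is it possible to do this in a shell script or in Perl? I just want to print the result to the console. I made an attempt, but I could not figure out how to build the correct JSON for each client ID.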

5 Answers:

Answer 0 (score: 5)

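Use Perl. A regex extracts the data into a hash of clientid => string of numbers, and each value is then overwritten with an arrayref obtained by splitting that string. Two such hashes are combined into a hashref, which is converted to JSON using JSON::XS.
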
use warnings;
use strict;
use feature 'say';

use JSON::XS;

my ($f1, $f2) = ('primary.txt', 'secondary.txt');

# Capture (id, "n, n, ...") pairs from the single line in each file
open my $fh, '<', $f1 or die "Can't open $f1: $!";
my %h1 = <$fh> =~ /(\d+)=\[(.*?)\]/g;
# 0 + $_ numifies each value so JSON::XS emits numbers rather than strings
$h1{$_} = [ map { 0 + $_ } split /,\s*/, $h1{$_} ]  for keys %h1;
close $fh;

open $fh, '<', $f2 or die "Can't open $f2: $!";
my %h2 = <$fh> =~ /(\d+)=\[(.*?)\]/g;
$h2{$_} = [ map { 0 + $_ } split /,\s*/, $h2{$_} ]  for keys %h2;
close $fh;

my %client_id = ( 1 => 'one', 2 => 'two', 3 => 'three' );

my $coder = JSON::XS->new;

# Sort numerically so the ids are printed in a stable order
for (sort { $a <=> $b } keys %client_id) {
    my $for_json = {
        "text" => "for client id $client_id{$_}.",
        "pri" => $h1{$_},
        "sec" => $h2{$_},
    };
    my $json = $coder->encode($for_json);
    say $json;
}

If the processing of both files really is always the same, put it in a subroutine:

my ($f1, $f2) = ('primary.txt', 'secondary.txt');

my %h1 = %{ clientID_nums($f1) };
my %h2 = %{ clientID_nums($f2) };

...

sub clientID_nums {
    my ($file) = @_;

    open my $fh, '<', $file or die "Can't open $file: $!";
    my %h = <$fh> =~ /(\d+)=\[(.*?)\]/g;
    # 0 + $_ numifies each value so the encoder emits numbers rather than strings
    $h{$_} = [ map { 0 + $_ } split /,\s*/, $h{$_} ]  for keys %h;

    return \%h;
}

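The JSON string can also be generated (and printed) directly:

say JSON::XS->new->encode($for_json);

or, better, with the module's functional interface:

say encode_json $for_json;

encode_json is exported by default, and it produces UTF-8 encoded JSON text.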

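Answer 1 (score: 1)

Here is another Perl solution. It uses the non-core Lingua::EN::Numbers module to convert the numeric client IDs to English words. If you would rather not install this module, you can use a simple array instead, or leave the numbers as decimal digits.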

use strict;
use warnings 'all';
use autodie;

use JSON 'to_json';
use Lingua::EN::Numbers 'num2en';

my %data;

for my $file ( qw/ primary.txt secondary.txt / ) {

    open my $fh, '<', $file;
    local $/;

    for my $item ( split /\]\s*,/, <$fh> ) {
        my ( $key, @values ) = $item =~ /\d+/g;
        push @{ $data{$key} }, \@values;
    }
}

for my $n ( sort { $a <=> $b } keys %data ) {

    my $num = num2en($n);

    my %json = (
        text => "for client id $num",
        pri  => $data{$n}[0],
        sec  => $data{$n}[1],
    );

    print to_json( \%json, { canonical => 1 } ), "\n";
}

Output

{"pri":["0","273","546","819"],"sec":["342","1102","608","684"],"text":"for client id one"}
{"pri":["274","1","820"],"sec":["115","191","837","1559"],"text":"for client id two"}
{"pri":["1016","275","821"],"sec":["1256","116"],"text":"for client id three"}
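
The "simple array" alternative mentioned in this answer amounts to a fixed lookup table indexed by client ID. As a rough illustration in shell rather than Perl, the sketch below keeps the table inside awk; the helper name id_to_word, the ten-entry table, and the fallback to plain digits are all assumptions made for the example, not part of the answer.

```shell
#!/bin/sh
# id_to_word is a hypothetical helper: it maps a client id to an English word
# through a fixed lookup table and falls back to the input for anything else.
id_to_word() {
    awk -v n="$1" 'BEGIN {
        # split() fills words[1..count] from the space-separated list
        count = split("one two three four five six seven eight nine ten", words, " ")
        if (n ~ /^[0-9]+$/ && n + 0 >= 1 && n + 0 <= count) {
            print words[n + 0]
        } else {
            print n
        }
    }'
}

for id in 1 2 3 42; do
    printf 'for client id %s.\n' "$(id_to_word "$id")"
done
```

Ids outside the table (42 here) are printed as digits, which mirrors the "leave the numbers as decimal digits" option.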

Answer 2 (score: 1)

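Here are two solutions, each of which needs only a single invocation of jq once primary.txt and secondary.txt have been converted to JSON.

For the conversion to JSON, I'll assume that "=" can naively be changed to ":", an assumption that makes using tr or sed trivial. To complete the conversion I'll use any-json, and to bring everything together I'll use bash, but there are many alternatives, notably hjson as an alternative to any-json.

For the cardinal-number strings, the first solution assumes that a suitable JSON array is available; for the sake of illustration, I'll assume it is available as a file, cardinals.json (described below). The generic solution assumes that "AP style" numerals are acceptable.

Solution using cardinals.json

#!/bin/bash

for f in primary secondary ; do
      any-json --input-format=hjson <(sed 's/=/:/g' $f.txt) > $f.json
      # ALTERNATIVELY: sed 's/=/:/g' $f.txt | hjson -j > $f.json 
done

jq -s --argfile ids cardinals.json '.[0] as $p | .[1] as $s
   | range(0; $ids|length) as $ix
   | ($ix+1|tostring) as $i
   | select($p | has($i) )
   | {"text": ("for client " + $ids[$ix] + "."),
      "pri": $p[$i],
      "sec": $s[$i] }
' primary.json secondary.json

Generic solution

This generic solution assumes that AP-style (Associated Press) numerals are acceptable. It allows the keys in primary.txt to be arbitrary strings, and it reads the same primary.json and secondary.json files produced by the conversion loop above.

jq -s '
    def apnumber:
      (tonumber? // null) as $i
      | if $i and $i >= 0 and $i < 10
        then ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"][$i]
        else .
        end;

   .[0] as $p | .[1] as $s
   | ($p|keys_unsorted[]) as $i
   | {"text": ("for client " + ($i|apnumber) + "."),
      "pri": $p[$i],
      "sec": $s[$i] }
' primary.json secondary.json

cardinals.json

This file is the output of:

ruby -e 'require "humanize"; 1.upto(30){|i| p i.humanize}' | jq -s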
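
To make the conversion step concrete, here is a small sketch contrasting the naive "=" to ":" replacement with a variant that also quotes the keys. The function names are made up for illustration. The first result still has unquoted keys, which is why this answer passes it through an Hjson-aware converter. The second is already strict JSON, under the assumption that keys contain only letters, digits or underscores and that "=" appears only between a key and its list.

```shell
#!/bin/sh
# naive_colon (hypothetical name): the naive step, every "=" becomes ":".
# Keys stay unquoted, so the result is not yet strict JSON.
naive_colon() {
    tr '=' ':'
}

# quoted_keys (hypothetical name): quotes each key while replacing "=".
# Assumes keys consist of letters, digits or "_" only.
quoted_keys() {
    sed -E 's/([[:alnum:]_]+)=/"\1":/g'
}

line='{1=[0, 273, 546, 819], 2=[274, 1, 820]}'
printf '%s\n' "$line" | naive_colon    # {1:[0, 273, 546, 819], 2:[274, 1, 820]}
printf '%s\n' "$line" | quoted_keys    # {"1":[0, 273, 546, 819], "2":[274, 1, 820]}
```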

Answer 3 (score: 0)

You can use the following jq-based script, with a little help from bash:

#!/usr/bin/env bash

# convert both files to valid JSON data
sed -E 's/([[:alnum:]_]+)=/"\1":/g' primary.txt > _pri.txt
sed -E 's/([[:alnum:]_]+)=/"\1":/g' secondary.txt > _sec.txt

# loop through keys from primary and extract values from both files
while read k; do
   printf '{"text":"for client id %s.","pri":%s,"sec":%s}\n' "${k//\"/}" \
      "$(jq -c ".$k" _pri.txt | sed 's/,/& /g')" \
      "$(jq -c ".$k" _sec.txt | sed 's/,/& /g')"
done < <(jq 'keys[]' _pri.txt)

# cleanup
rm -f _pri.txt _sec.txt

Output:

{"text":"for client id 1.","pri":[0, 273, 546, 819],"sec":[342, 1102, 608, 684]}
{"text":"for client id 2.","pri":[274, 1, 820],"sec":[115, 191, 837, 1559]}
{"text":"for client id 3.","pri":[1016, 275, 821],"sec":[1256, 116]}
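
If jq is not available, roughly the same output can be produced with grep, sed and awk alone. The sketch below is only that: the helper names to_lines and combine are invented for the example, and it assumes numeric client IDs and lists that contain no nested brackets. Each file is flattened to one "id list" line per client, the lines are tagged by origin, and awk joins them on the id.

```shell
#!/bin/sh
# to_lines (hypothetical helper): turn "{1=[a, b], 2=[c]}" into lines like "1 [a, b]".
to_lines() {
    grep -oE '[0-9]+=\[[^]]*\]' "$1" | sed 's/=/ /'
}

# combine (hypothetical helper): print one JSON object per client id
# that appears in both the primary and the secondary file.
combine() {
    {
        to_lines "$1" | sed 's/^/P /'    # tag primary lines
        to_lines "$2" | sed 's/^/S /'    # tag secondary lines
    } | awk '
        {
            tag = $1; id = $2
            sub(/^[PS] [0-9]+ /, "")     # keep only the bracketed list in $0
        }
        tag == "P" { pri[id] = $0; next }
        (id in pri) {
            printf "{\"text\":\"for client id %s.\",\"pri\":%s,\"sec\":%s}\n", id, pri[id], $0
        }
    '
}
```

Called as combine primary.txt secondary.txt, it prints the objects in the order the ids appear in secondary.txt and passes the lists through unchanged, so the original ", " spacing is preserved without a second sed pass.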

Answer 4 (score: -3)

#!/usr/bin/perl
use strict;
use warnings;
use JSON;

my %hash;

foreach my $file ('primary.txt', 'secondary.txt') {
    open my $fh, '<', $file or die "Could not read file $file because $!\n";
    local $/ = undef;    # slurp the whole file
    my $text = <$fh>;
    close $fh;
    # The input ({1=[...]}) is not valid JSON: quote the keys and turn "=" into ":"
    $text =~ s/(\w+)=/"$1":/g;
    $hash{$file} = decode_json $text;
}

foreach my $id (sort keys %{ $hash{'secondary.txt'} }) {
    next unless
        (exists $hash{'primary.txt'}->{$id}) &&
        ('ARRAY' eq ref $hash{'primary.txt'}->{$id});

    # "sec" must come from secondary.txt, not from primary.txt
    print "{\"text\":\"client id $id.\",\"pri\":[" . join(',', @{ $hash{'primary.txt'}->{$id} })
        . "],\"sec\":[" . join(',', @{ $hash{'secondary.txt'}->{$id} }) . "]}\n";
}