I want to be able to pipe the output of any command-line program into a command that converts it to JSON.
For example, my as-yet-unknown program could accept target columns, a delimiter, and output field names:
# select columns 1 and 3 from the output and convert it to simple json
netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
and it would produce something like this:
[
{"name": "tcp4", "other": "31"},
{"name": "tcp4", "other": "0"}
...
]
Edit: I'm of course willing to pass along whatever is needed to make this work, such as a delimiter or multiple delimiters; I just want to avoid doing explicit parsing on the command line and let the tool handle it.
Answer 0 (score: 5)
Because the terminal is a very special interface, using a monospaced font and tools built for humans monitoring that terminal, a lot of output is difficult to parse.
netstat output is a good example:
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
unix 2 [ ACC ] STREAM LISTENING 13947569 @/tmp/.X11-unix/X1
unix 2 [ ] DGRAM 8760 /run/systemd/notify
unix 2 [ ACC ] SEQPACKET LISTENING 8790 /run/udev/control
If some lines contain blank fields, no simple split on whitespace can work.
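You can see the problem by asking for the same field number on two of the lines above: a whitespace split (awk's default, field 6 here) lands on the Type column in one line but on the I-Node column in the other, because the bracketed Flags field splits unevenly and the blank State field shifts everything:
$ echo 'unix  2      [ ACC ]     SEQPACKET  LISTENING     8790     /run/udev/control' | awk '{print $6}'
SEQPACKET
$ echo 'unix  2      [ ]         DGRAM                    8760     /run/systemd/notify' | awk '{print $6}'
8760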
For that case, the requested convert_to_json script will be posted at the bottom of this answer.
awk
With awk, you can use a nice syntax:
netstat -an |
awk '/CLOSE_WAIT/{
printf " { \42%s\42:\42%s\42,\42%s\42:\42%s\42},\n","name",$1,"other",$3
}' |
sed '1s/^/[\n/;$s/,$/\n]/'
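The trailing sed is what wraps the records into a JSON array: it prepends the opening bracket on the first line and strips the final comma before appending the closing bracket. You can watch that fix-up in isolation (this relies on GNU sed accepting \n in the replacement text):
$ printf '{"a":1},\n{"b":2},\n' | sed '1s/^/[\n/;$s/,$/\n]/'
[
{"a":1},
{"b":2}
]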
perl
Still doing a simple whitespace-based split, but using a JSON library; this perl way is more flexible:
netstat -an | perl -MJSON::XS -ne 'push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;END{print encode_json(\@out)."\n";}'
Or the same thing, but split over several lines:
netstat -an |
perl -MJSON::XS -ne '
    push @out,{"name"=>$1,"other"=>$2} if
/^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
END{print encode_json(\@out)."\n";
}'
Or pretty-printed:
netstat -an | perl -MJSON::XS -ne '
    push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
END{$coder = JSON::XS->new->ascii->pretty->allow_nonref;
print $coder->encode(\@out);}'
Finally, I like this version, which is not based on a regex for field extraction:
netstat -an | perl -MJSON::XS -ne '
do {
my @line=split(/\s+/);
        push @out,{"name"=>$line[0],"other"=>$line[2]}
} if /CLOSE_WAIT/;
END{
$coder = JSON::XS->new->ascii->pretty->allow_nonref;
print $coder->encode(\@out);
}'
But you could also run the command from within the perl script:
perl -MJSON::XS -e '
open STDIN,"netstat -an|";
my @out;
while (<>){
    push @out,{"name"=>$1,"other"=>$2} if /^(\S+)\s+\d+\s+(\d+)\s.*CLOSE_WAIT/;
};
print encode_json \@out;'
This could become a basic prototype:
#!/usr/bin/perl -w
use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
$ENV{'LANG'}='C';
open STDIN,"netstat -naut|";
my @out;
my @fields;
my $searchre=":";
$searchre = shift @ARGV if @ARGV;
while (<>){
map { s/_/ /g;push @fields,$_; } split(/\s+/) if
/^Proto.*State/ && s/\sAddr/_Addr/g;
do {
my @line=split(/\s+/);
my %entry;
for my $i (0..$#fields) {
$entry{$fields[$i]}=$line[$i];
};
push @out,\%entry;
} if /$searchre/;
}
print $coder->encode(\@out);
(With no argument, this dumps the whole of netstat -uta, but you can pass any search string as an argument, such as CLOSE or an IP address.)
netstat2json.pl
This method could be used with many other tools than netstat!
Here is a corrected version:
#!/usr/bin/perl -w
use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
$ENV{'LANG'}='C';
open STDIN,"netstat -nap|";
my ( $searchre ,@out,%fields)=( "[/:]" );
$searchre = shift @ARGV if @ARGV;
while (<>){
next if /^Active\s.*\)$/;
/^Proto.*State/ && do {
s/\s(name|Addr)/_$1/g;
my @head;
map { s/_/ /g;push @head,$_; } split(/\s+/);
s/_/ /g;
%fields=();
for my $i (0..$#head) {
my $crt=index($_,$head[$i]);
my $next=-1;
$next=index($_,$head[$i+1])-$crt-1 if $i < $#head;
$fields{$head[$i]}=[$crt,$next];
}
next;
};
do {
my $line=$_;
my %entry;
for my $i (keys %fields) {
my $crt=substr($line,$fields{$i}[0],$fields{$i}[1]);
$crt=~s/^\s*(\S(|.*\S))\s*$/$1/;
$entry{$i}=$crt;
};
push @out,\%entry;
} if /$searchre/;
}
print $coder->encode(\@out);
(The header-line match Proto.*State is specific to netstat.) This can be run with an argument, as shown before:
./netstat2json.pl CLOS
[
{
"Local Address" : "127.0.0.1:31001",
"State" : "CLOSE_WAIT",
"Recv-Q" : "18",
"Proto" : "tcp",
"Send-Q" : "0",
"Foreign Address" : "127.0.0.1:55938",
"PID/Program name" : "-"
},
{
"Recv-Q" : "1",
"Local Address" : "::1:53816",
"State" : "CLOSE_WAIT",
"Send-Q" : "0",
"PID/Program name" : "-",
"Foreign Address" : "::1:631",
"Proto" : "tcp6"
}
]
Empty fields do not break the field assignment:
./netstat2json.pl 1000.*systemd/notify
[
{
"Proto" : "unix",
"I-Node" : "33378",
"RefCnt" : "2",
"Path" : "/run/user/1000/systemd/notify",
"PID/Program name" : "-",
"Type" : "DGRAM",
"Flags" : "[ ]",
"State" : ""
}
]
Nota: this modified version runs netstat with the -nap flag in order to get the PID/Program name field. If it is not run as the superuser root, you may see this output on STDERR:
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
You can avoid those messages by running netstat2json.pl 2>/dev/null, by running it as root or under sudo, or by changing "netstat -nap|" to "netstat -na|" at line 6.
convert_to_json.pl
This perl script converts STDIN to JSON, strictly as requested. Run it like this:
netstat -an | grep CLOSE | ./convert_to_json.pl 1,3 name,other
#!/usr/bin/perl -w
use strict;
use JSON::XS;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
my (@fields,@pos,@out);
map {
push @pos,1*$_-1
} split ",",shift @ARGV;
map {
push @fields,$_
} split ",",shift @ARGV;
die "Number of fields don't match number of positions" if $#fields ne $#pos;
while (<>) {
my @line=split(/\s+/);
my %entry;
for my $i (0..$#fields) {
$entry{$fields[$i]}=$line[$pos[$i]];
};
push @out,\%entry;
}
print $coder->encode(\@out);
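Because it only splits on whitespace and picks columns by number, the same script should work for other commands too; for instance (a sketch, with column numbers assuming GNU df's default layout, where column 1 is Filesystem and column 5 is Use%):
df | tail -n +2 | ./convert_to_json.pl 1,5 filesystem,use_percent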
Answer 1 (score: 4)
Here is my Ruby version:
#! /usr/bin/env ruby
#
# Converts stdin columns to a JSON array of hashes
#
# Installation : Save as convert_to_json, make it executable and put it somewhere in PATH. Ruby must be installed
#
# Examples :
#
# netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
# ls -l | convert_to_json
# ls -l | convert_to_json 6,7,8,9
# ls -l | convert_to_json 6,7,8,9 month,day,time,name
# convert_to_json 1,2 time,value ";" < some_file.csv
#
#
# http://stackoverflow.com/questions/40246134/convert-arbitrary-output-to-json-by-column-in-the-terminal
require 'json'
script_name = File.basename(__FILE__)
syntax = "Syntax : command_which_outputs_columns | #{script_name} column1_id,column2_id,...,columnN_id column1_name,column2_name,...,columnN_name delimiter"
if $stdin.tty? or $stdin.closed? then
$stderr.puts syntax
else
if ARGV[2]
delimiter = ARGV[2]
$stderr.puts "#{script_name} : Using #{delimiter} as delimiter"
else
delimiter = /\s+/
end
column_ids = (ARGV[0] || "").split(',').map{|column_id| column_id.to_i-1}
column_names = (ARGV[1] || "").split(',')
results = []
$stdin.each do |stdin_line|
if column_ids.empty?
values = stdin_line.strip.split(delimiter)
else
values = stdin_line.strip.split(delimiter).values_at(*column_ids)
end
line_hash=Hash.new
values.each_with_index.each{|value,i|
      column_name = column_names[i] || "column#{(column_ids[i] || i)+1}"
      line_hash[column_name]=value
}
results<<line_hash
end
puts JSON.pretty_generate(results)
end
It works as shown below:
netstat -a | grep CLOSE_WAIT | convert_to_json 1,3 name,other
[
{
"name": "tcp",
"other": "0"
},
{
"name": "tcp6",
"other": "0"
}
]
As a bonus, you can leave out the column names, the column ids, or both.
Other examples:
netstat -a | grep CLOSE_WAIT | ./convert_to_json
# [
# {
# "column1": "tcp",
# "column2": "1",
# "column3": "0",
# "column4": "10.0.2.15:51074",
# "column5": "123.45.101.207:https",
# "column6": "CLOSE_WAIT"
# },
# {
# "column1": "tcp6",
# "column2": "1",
# "column3": "0",
# "column4": "ip6-localhost:50293",
# "column5": "ip6-localhost:ipp",
# "column6": "CLOSE_WAIT"
# }
# ]
netstat -a | grep CLOSE_WAIT | ./convert_to_json 1,3
# [
# {
# "column1": "tcp",
# "column3": "0"
# },
# {
# "column1": "tcp6",
# "column3": "0"
# }
# ]
ls -l | tail -n3 | convert_to_json 6,7,8,9 month,day,time,name
# [
# {
# "month": "Oct",
# "day": "27",
# "time": "10:35",
# "name": "test.dot"
# },
# {
# "month": "Nov",
# "day": "2",
# "time": "14:27",
# "name": "uniq.rb"
# },
# {
# "month": "Nov",
# "day": "2",
# "time": "14:27",
# "name": "utf8_nokogiri.rb"
# }
# ]
# NOTE: ls -l uses the 8th column for year, not time, for older files :
ls --full-time -t /usr/share/doc | tail -n3 | ./convert_to_json 6,7,9 yyyymmdd,time,name
[
{
"yyyymmdd": "2013-10-21",
"time": "15:15:20.000000000",
"name": "libbz2-dev"
},
{
"yyyymmdd": "2013-10-10",
"time": "16:27:32.000000000",
"name": "zsh"
},
{
"yyyymmdd": "2013-10-03",
"time": "18:52:45.000000000",
"name": "manpages-dev"
}
]
ls -l | tail -n3 | convert_to_json 9,12
# [
# {
# "column9": "test.dot",
# "column12": null
# },
# {
# "column9": "uniq.rb",
# "column12": null
# },
# {
# "column9": "utf8_nokogiri.rb",
# "column12": null
# }
# ]
convert_to_json 1,2 time,value ";" < some_file.csv
# convert_to_json : Using ; as delimiter
# [
# {
# "time": "1",
# "value": "3"
# },
# {
# "time": "2",
# "value": "5"
# }
# ]
Answer 2 (score: 3)
I found a great list of tools for working with command-line output, and one of the listed tools is sqawk, which turns arbitrary columnar data into JSON and lets you filter it with SQL-like queries!
Converting ps output to JSON:
ps | sqawk -output json,indent=1 'select PID,TTY,TIME,CMD from a' trim=left header=1
[{
"PID" : "3947",
"TTY" : "pts/2",
"TIME" : "00:00:07",
"CMD" : "zsh"
},{
"PID" : "15951",
"TTY" : "pts/2",
"TIME" : "00:00:00",
"CMD"
}]
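Applied to the original netstat question, something like this should work (a sketch; it assumes sqawk's default naming, where the table is a and the anonymous columns are a1, a2, ...):
netstat -an | grep CLOSE_WAIT | sqawk -output json 'select a1 as name, a3 as other from a'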
Answer 3 (score: 2)
Here is a very basic concept of a method using awk. It is not completely rigorous or functional, but it may give you an idea of how to do most of it, since you don't seem to have anything yet:
netstat -a | grep CLOSE_WAIT | awk 'BEGIN{print "["} {print " {\"name\": \"",$1,"\", \"other\": \"",$2,"\"}"} END{print "]"}' OFS=""
[
{"name": "tcp4", "other": "31"}
{"name": "tcp4", "other": "31"}
{"name": "tcp4", "other": "31"}
{"name": "tcp4", "other": "31"}
{"name": "tcp4", "other": "0"}
{"name": "tcp4", "other": "31"}
]
I know it doesn't put the commas at the ends of the lines, and I know it doesn't take parameters, but both of those are solvable.
One idea for the parameters is to pass them into awk with -v, like this:
awk -v fields="1:4:7" -v headings="name:other:fred" '{...}'
and then split() those in the BEGIN section and iterate over them in the main loop. That would look something like this:
echo hi | awk -v fields="1:3:5" -v headings="HeadA:HeadB:HeadC" 'BEGIN{split(headings,h,":"); split(fields,f,":")} {for(i in h)print h[i],f[i];}'
HeadA 1
HeadB 3
HeadC 5
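Putting those pieces together, a fuller (still rough) sketch of mine might look like this; the comma problem is handled by printing the separator before every record except the first, and note that the values are not JSON-escaped:
netstat -an | grep CLOSE_WAIT | awk -v fields="1:3" -v headings="name:other" '
BEGIN { n = split(fields, f, ":"); split(headings, h, ":"); printf "[" }
{
    printf "%s\n  {", (NR > 1 ? "," : "")
    for (i = 1; i <= n; i++)
        printf "\"%s\": \"%s\"%s", h[i], $(f[i]), (i < n ? ", " : "")
    printf "}"
}
END { print "\n]" }'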
Answer 4 (score: 2)
Here is my Python version:
#!/usr/bin/env python3
import json
import re
def all_columns_to_json (column_dict, columns_line):
json_object = {}
for column_index, column_value in enumerate(columns_line):
if column_index in column_dict:
column_name = column_dict[column_index]
else:
column_name = str(column_index)
json_object[column_name] = column_value
return json_object
def filter_columns_in_dict_to_json(column_dict, columns_line):
'''Parse columns_line, make sure every element in column_dict
exists there, filter elements that are not in column_dict from
columns_line, and convert it to a dict.
'''
json_object = {}
for column_index, column_name in column_dict.items():
try:
json_object[column_name] = columns_line[column_index]
except IndexError as err:
            # columns_line doesn't have column_index.
raise ValueError('Invalid table line ({}) : no {} element.'.format(columns_line,
column_index)) from err
return json_object
def columns_line_to_json (column_dict, columns_line, should_filter_colunms):
'''Parse a list of values to a json object with special names.
'''
if should_filter_colunms:
return filter_columns_in_dict_to_json(column_dict, columns_line)
else:
return all_columns_to_json(column_dict, columns_line)
def regex_from_delims_list(delims_list):
'''Get a regex compiled pattern from a delims list'''
    final_pattern = ''
for delim in delims_list:
delim_and_maybe_min_max = delim.split(':')
escaped_delim = re.escape(delim_and_maybe_min_max[0])
        # Check whether this delim was given without a min count.
        if len(delim_and_maybe_min_max) == 1:
            final_pattern += "%s{1,}|" % (escaped_delim)
        elif len(delim_and_maybe_min_max) == 2:
            min_and_maybe_max = delim_and_maybe_min_max[1].split('-')
            current_pattern = escaped_delim
            # Add the count to the regex (min only, or min and max).
            if len(min_and_maybe_max) == 2:
                current_pattern += '{%d,%d}' % (int(min_and_maybe_max[0]),
                                                int(min_and_maybe_max[1]))
            else:
                current_pattern += '{%d,}' % (int(min_and_maybe_max[0]))
            final_pattern += current_pattern + '|'
        else:
            raise ValueError("Invalid ':' count in the delimiter argument")
    # Remove the trailing OR ('|').
    final_pattern = final_pattern[:-1]
return re.compile (final_pattern)
def main(args):
column_dict = {}
    # Split the user's argument by commas, and parse each column
    # separately.
for column_and_name in args.columns_and_names.split(','):
# Split the name from the columns.
column_and_name = column_and_name.split('=')
if len(column_and_name) > 2:
raise ValueError("Invalid column: {}".format(str(column_and_name())))
        # If there is no name, set it to the column index.
if len(column_and_name) == 1:
column_and_name.append (str(column_and_name[0]))
        # Try to convert the column index if it isn't '*'.
if column_and_name[0] != '*':
try:
column_and_name[0] = int(column_and_name[0])
except ValueError as err:
raise ValueError('Invalid column index: {} (not an integer)'.format(column_and_name[0])) from err
# Add this column definition.
column_dict[column_and_name[0]] = column_and_name[1]
# Check if column_dict has the '*' member.
# If it does, we will print all of the columns (even ones that
# are not in column_dict)
should_filter_colunms = ('*' not in column_dict)
# We have checked it, no need for it now.
if not should_filter_colunms:
del column_dict['*']
# Parse the delim list into a regex pattern.
strip_regex_pattern = regex_from_delims_list(args.delimiters)
json_objects_list = []
for fd in args.infiles:
for line in fd:
# Convert bytes object to string.
if isinstance(line, bytes):
line = line.decode('utf-8')
# Strip the \n in the end of the line.
line = line.rstrip('\n')
# Split the line by the delims.
splitted_line = re.split(strip_regex_pattern, line)
json_objects_list.append (columns_line_to_json (column_dict, splitted_line, should_filter_colunms))
print(json.dumps (json_objects_list))
def comma_list(string):
    '''Convert a comma-separated list '1,2,3,4' into the list
    ['1','2','3','4'], allowing ',' to be escaped as '\\,' and '\\' as '\\\\'.'''
    # Split the string on commas that are not preceded by a backslash.
    splitted_string = re.split(r'(?<!\\),', string)
    replaced_string = []
    # Replace '\,' with ',' and '\\' with '\'.
    for item in splitted_string:
        item = item.replace('\\,', ',')
        item = item.replace('\\\\', '\\')
        replaced_string.append(item)
    return replaced_string
if __name__ == '__main__':
import argparse
from sys import stdin
parser = argparse.ArgumentParser()
    parser.add_argument('columns_and_names', nargs='?', help='The columns and their names to print out (format: n=name)', default='*')
parser.add_argument('--delim', '-d', type=comma_list,
help='A list of input columns delimiters. Format: delim[:min[-max]]. Where `min` and `max` are the numbers of times `delim` should repeat. As default min=1 and max is not set. Enter "\," for the delimiter "," and "\\\\"" for "\\"',
default=(' ', '\t'),
metavar='delim[:min-max]')
parser.add_argument('infiles', type=argparse.FileType('rb'), default=(stdin,), metavar='file', nargs='*')
main(parser.parse_args())
(For more usage examples, see https://github.com/Reflexe/convert_table_to_json)
I looked for a similar program but couldn't find anything, so I had to write it myself (and I think it's a very useful tool).
For example, to use it with netstat:
$ netstat -a | grep ESTABLISHED | ./convert_to_json.py '2=name,3=other'
Answer 5 (score: 0)
BASH may not be the best platform for this, but here is a half-baked solution; you will need a few more tricks.
#!/bin/bash
function procline {
  IFS=' ' list=($1)
  echo -n "{ \"first_column\": \"${list[0]}\","
  echo " \"second_column\": \"${list[1]}\" },"
}
tr -s " " | eval \
'while IFS= read -r line; do procline "$line"; done'
Some explanation:
tr -s " " squeezes the repeated spaces
the while IFS= read loop feeds the result, line by line, to the procline function
the procline function first splits the line, then builds something JSON-like
In my opinion, the output of tr should be handed over to a script written in another language, e.g. Python or PHP-CLI, if not the whole job. The data looks easy enough to process:
tcp4 0 0 192.168.99.1.56358 192.168.99.100.32111 CLOSE_WAIT
tcp4 31 0 192.168.100.179.56129 server-54-192-20.https CLOSE_WAIT
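For example, a minimal Python pass over exactly that output might look like this (a sketch: the field names are made up for illustration, and it assumes the six whitespace-separated columns shown above):
netstat -an | grep CLOSE_WAIT | python3 -c '
import json, sys
out = []
for line in sys.stdin:
    c = line.split()          # simple whitespace split, as above
    out.append({"proto": c[0], "recvq": c[1], "state": c[5]})
print(json.dumps(out))'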