Question

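I have two files: one is the output of a telnet check, the other maps URLs (IP and port) to host names. I want to merge the two files on the IP they have in common.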

File 1:

25-08-2019_22.00.03 : Port port1 of URL http://ip1:port1/ is [ NOT OPEN ]
25-08-2019_22.00.03 : Port port2 of URL http://ip2:port2/ is [ NOT OPEN ]

File 2:

http://ip1:port1/cs/personal, CS
http://ip2:port2/cs/Satellite/out/, CS_SAT

The desired output file looks like this:

25-08-2019_22.00.03 : Port port1 of URL http://ip1:port1/ is [ NOT OPEN ] : CS
25-08-2019_22.00.03 : Port port2 of URL http://ip2:port2/ is [ NOT OPEN ] : CS_SAT

I am not a Linux expert, so any help is much appreciated.

I tried join options such as join -o file file2, but that did not give the desired output.

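I also tried an awk command that builds key-value pairs from the first file and then runs over the second file, but it gave no output. Is that because of the delimiters or special characters in the files?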

awk 'FNR==NR{a[$2]=$1;next}{if(a[$1]==""){a[$1]=0};
    print $1,$2,a[$1]}' file1 file2

Answer 1

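Well, you have to preprocess the input files somehow. First, extract the common field with sed and a regular expression, then use join. After that, transform the output to match what you expect.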

Code with comments:

# recreate input
cat <<EOF >file1
25-08-2019_22.00.03 : Port port1 of URL http://ip1:port1/ is [ NOT OPEN ]
25-08-2019_22.00.03 : Port port2 of URL http://ip2:port2/ is [ NOT OPEN ]
EOF
cat <<EOF >file2
http://ip1:port1/cs/personal, CS
http://ip2:port2/cs/Satellite/out/, CS_SAT
EOF

# join on the first field
join -t' ' -1 1 -2 1 <(
  # extract the part inside `http://<here>/` and put it in front of the line
  # ip1:port1 25-08-2019_22.00.03 : Port port1 of URL http://ip1:port1/ is [ NOT OPEN ]
  <file1 sed -r 's@^(.*http://([^/]*).*)$@\2 \1@' | sort
) <(
  # extract the part inside `http://<here>/` and remove everything we are not interested in
  # ip1:port1 CS
  <file2 sed -r 's@http://([^/]*)/.*, (.*)@\1 \2@' | sort
) |
# ip1:port1 25-08-2019_22.00.03 : Port port1 of URL http://ip1:port1/ is [ NOT OPEN ] CS
# remove the leading ip1:port1
cut -d' ' -f2- |
# replace the trailing ` CS` with ` : CS`
sed 's/[^ ]*$/: &/'
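
Since the question mentions an awk attempt, here is a minimal sketch of the same idea (extract the ip:port key from both files, then match on it) in a single awk call. It is only an illustration under assumptions: the scratch directory and the variable names (tmp, result, label, key) are made up for this example, file2 is assumed to use ", " between URL and label, and every URL is assumed to start with http://. As with join, lines of file1 that have no match in file2 are dropped.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmp=$(mktemp -d)
cat >"$tmp/file1" <<'EOF'
25-08-2019_22.00.03 : Port port1 of URL http://ip1:port1/ is [ NOT OPEN ]
25-08-2019_22.00.03 : Port port2 of URL http://ip2:port2/ is [ NOT OPEN ]
EOF
cat >"$tmp/file2" <<'EOF'
http://ip1:port1/cs/personal, CS
http://ip2:port2/cs/Satellite/out/, CS_SAT
EOF

# file2 is read first: remember the label for each "ip:port".
# file1 is read second: find the same "ip:port" in its URL and append the label.
result=$(awk '
  FNR == NR {
    split($0, part, ", ")       # part[1] = URL, part[2] = label
    split(part[1], seg, "/")    # seg[3] = "ip:port" of http://ip:port/...
    label[seg[3]] = part[2]
    next
  }
  match($0, "http://[^/]*") {
    key = substr($0, RSTART + 7, RLENGTH - 7)   # strip the leading "http://"
    if (key in label) print $0 " : " label[key]
  }
' "$tmp/file2" "$tmp/file1")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Unlike the join pipeline, this does not need sorted input and keeps the original line order of file1.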

Answer 2

Using cut and paste:

paste -d ' ' file1 <(cut -s -d ',' -f2 file2 | sed 's/^ */: /')

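This joins the two files line by line, with " : " in between. Note that paste pairs lines by position, not by IP, so it only gives the right result when both files list their entries in the same order.

The second file is reduced to the second part of each line, i.e. the text after the comma.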
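
One detail worth checking when using paste: the argument to -d is a list of single-character delimiters that are used in turn, not a multi-character separator string. A small sketch to see this, using throwaway files whose names (a, b, c) and contents are made up for the illustration:

```shell
# Scratch files with two lines each.
tmp=$(mktemp -d)
printf 'a1\na2\n' >"$tmp/a"
printf 'b1\nb2\n' >"$tmp/b"
printf 'c1\nc2\n' >"$tmp/c"

# Two inputs: only the first character of the list (a space) is ever used.
two=$(paste -d ' : ' "$tmp/a" "$tmp/b")

# Three inputs: the list is consumed in order, a space first, then a colon.
three=$(paste -d ' : ' "$tmp/a" "$tmp/b" "$tmp/c")

printf '%s\n' "$two" "$three"
rm -rf "$tmp"
```

With two inputs the colon never appears, so a separator such as " : " has to be added another way, for example by prefixing the second column with sed.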

Merge two files based on a common column