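Why does awk "not in" array work just like awk "in" array?

Time: 2012-06-06 23:37:51

Tags: awk gawk

Here is an awk script that tries to compute the difference between two files based on their first column: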

BEGIN{
    OFS=FS="\t"
    file = ARGV[1]
    while (getline < file)
        Contained[$1] = $1
    delete ARGV[1]
    }
$1 not in Contained{
    print $0
}

Here is TestFileA:

cat
dog
frog

Here is TestFileB:

ee
cat
dog
frog

However, when I run the following command:

gawk -f Diff.awk TestFileA TestFileB

I get output as if the script had contained "in":

cat
dog
frog

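Although I'm not sure "not in" is the correct syntax for what I intend, I'm curious why it behaves exactly as it would if I had written "in".

5 answers:

Answer 0 (score: 27)

I couldn't find any documentation for element not in array.

Try !(element in array) instead.


My guess: awk treats not as an uninitialized variable, so not evaluates to the empty string and is concatenated onto $1.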

$1 not == $1 "" == $1
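
The guess above can be checked with a small experiment. The sketch below is only an illustration and not part of the original answer: the array name seen and the sample input are made up, and not is deliberately never assigned. Because concatenation binds more tightly than in, the pattern is read as ($1 not) in seen, which reduces to $1 in seen.

```shell
# Hypothetical demo: seen[] holds only "cat"; the variable "not" is never set.
result=$(printf 'cat\ndog\n' | awk '
    BEGIN { seen["cat"] }
    # Parsed as ($1 not) in seen, i.e. $1 concatenated with an empty string
    $1 not in seen { print "matched:", $1 }
')
echo "$result"
```

With this input only the cat line is reported, which is the same result $1 in seen would give.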

Answer 1 (score: 16)

I figured this out. (x in array) returns a value, so to test "not in array" you have to do this:

if ( x in array == 0 )
   print "x is not in the array"

Or, in your example:

($1 in Contained == 0){
   print $0
}
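
To see the value that in yields, a minimal sketch (the array arr and its single key "x" are assumptions made up for this demo) can print the result of each test. It also suggests that comparing with 0 and negating with ! come to the same thing.

```shell
# Hypothetical demo: arr has one key, "x"; "y" is never added.
result=$(awk 'BEGIN {
    arr["x"]                          # referencing arr["x"] creates the element
    in_x = ("x" in arr)               # 1: key exists
    in_y = ("y" in arr)               # 0: key is absent (the test does not create it)
    not_y_bang = !("y" in arr)        # negation form
    not_y_eq   = (("y" in arr) == 0)  # comparison form from this answer
    print in_x, in_y, not_y_bang, not_y_eq
}')
echo "$result"
```

The extra parentheses around the in test are not required by the answer's form, but they make the intended grouping explicit.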

Answer 2 (score: 1)

I'm not sure whether this is the same as what you want to do.

#!/bin/awk -f
# Reads the first file argument (ARGV[1]) in BEGIN and builds a hash of the
# tokens found in column one. The main rule then reads the input files and
# prints column one of any line whose token is not already in the hash.
BEGIN{
    OFS=FS="\t"
    file = ARGV[1]
    while (getline  < file)
        Contained[$1] = $1
#    delete ARGV[1]  # I don't know what you were thinking here
#    for(i in Contained) {print Contained[i]} # debugging, not just for sadists
    close (ARGV[1])
}
{
   if ($1 in  Contained){} else { print $1 }
}

Answer 3 (score: 1)

In my solution to this problem, I use an if-else statement.

Answer 4 (score: 0)

On the awk command line, I use:

 ! ($1 in a)

where $1 is the pattern and a is the array.

Example:

awk 'NR==FNR{a[$1];next}! ($1 in a) {print $1}' file1 file2
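
As a worked check, the one-liner can be run against the two test files from the question. The sketch below recreates them in a temporary directory (the dir variable and the use of mktemp are assumptions for this demo). NR==FNR is true only while the first file is being read, so a[$1] records its keys, and the second rule then prints column one of lines whose key was never recorded.

```shell
# Recreate the question's sample files in a scratch directory.
dir=$(mktemp -d)
printf 'cat\ndog\nfrog\n' > "$dir/TestFileA"
printf 'ee\ncat\ndog\nfrog\n' > "$dir/TestFileB"

# First file fills a[]; second file is filtered against it.
result=$(awk 'NR==FNR { a[$1]; next } !($1 in a) { print $1 }' \
    "$dir/TestFileA" "$dir/TestFileB")
echo "$result"

rm -rf "$dir"
```

For these files the only line printed is ee, the one key in TestFileB that is missing from TestFileA.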