Question

I tried this:

// "postcss" loader applies autoprefixer to our CSS.
// "css" loader resolves paths in CSS and adds assets as dependencies.
// "style" loader turns CSS into JS modules that inject <style> tags.
// In production, we use a plugin to extract that CSS to a file, but
// in development "style" loader enables hot editing of CSS.
{
  test: /\.css$/,
  loader: 'style!css?importLoaders=1!postcss'
},

Answer 1

Like this?

$ cat > foo
this
nope
$ cat > bar
neither
this

$ sort *|uniq -c
  1 neither
  1 nope
  2 this

Then drop the lines whose count is 1:

... | awk '$1>1'
      2 this
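
The two steps above can be folded into one small function. This is only a sketch: the name `shared_lines` is made up for illustration, and it assumes the files to compare are passed as arguments rather than picked up with `*`.

```shell
# shared_lines is a hypothetical helper name, not part of the answer above.
# Prints every line that occurs more than once across the given files,
# each prefixed with its number of occurrences.
shared_lines() {
    # sort brings identical lines together, uniq -c counts each group,
    # awk keeps only the groups whose count is greater than 1.
    sort "$@" | uniq -c | awk '$1 > 1'
}
```

Calling `shared_lines foo bar` with the two files above should report only the line `this`, with a count of 2. If the counts are not needed, `sort "$@" | uniq -d` prints one copy of each repeated line instead.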

Answer 2

Use uniq and sort to find the duplicated lines.

#!/bin/bash
dirs=("$@")
for dir in "${dirs[@]}" ; do
    cat "$dir"/*
done | sort | uniq -c | sort -n | tail -n1

uniq -c prefixes each line with the number of times it occurs.
sort -n sorts the lines numerically by that count.

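tail -n1 outputs only the last line, i.e. the one with the highest count. If you want to see every line that shares that highest count, use the following in place of tail: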

perl -ane 'if ($F[0] == $n) { push @buff, $_ }
           else { @buff = $_ }
           $n = $F[0];
           END { print for @buff }'

Answer 3

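You can use awk. If all you want is to "count the duplicate lines", we can take that to mean every line that has already appeared earlier in the same file. The following produces those counts: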

#!/bin/sh

for file in "$@"; do
  if [ -s "$file" ]; then
    awk '$0 in a {c++} {a[$0]} END {printf "%s: %d\n", FILENAME, c}' "$file"
  fi
done

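The awk script first checks whether the current line is already stored in the array a and, if so, increments the counter. It then adds the line to the array. At the end of the file, it prints the total.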

Note that this can become a problem on very large files, because every distinct line of the input has to be held in memory in the array.
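
If memory is a concern, the same number can be obtained without an awk array, since the count of duplicates is simply the total number of lines minus the number of distinct lines. The sketch below is one possible way to do that; the function name `count_dups` is hypothetical, and it relies on sort doing the heavy lifting (sort implementations typically spill to temporary files rather than keeping everything in memory).

```shell
# count_dups is a hypothetical helper name, not part of the answer above.
# Prints "<file>: <number of duplicate lines>" for a single file.
count_dups() {
    # Total number of lines; awk's NR also counts a final line
    # that has no trailing newline.
    total=$(awk 'END { print NR }' "$1")
    # Number of distinct lines; LC_ALL=C makes sort compare raw bytes,
    # so two different lines are never treated as equal.
    distinct=$(LC_ALL=C sort -u "$1" | wc -l)
    echo "$1: $((total - distinct))"
}
```

For a file containing the lines foo, bar, this, bar, that, bar there are 6 lines and 4 distinct ones, so `count_dups` reports 2, the same result as the awk version.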

Example:

$ printf 'foo\nbar\nthis\nbar\nthat\nbar\n' > inp.txt
$ awk '$0 in a {c++} {a[$0]} END {printf "%s: %d\n", FILENAME, c}' inp.txt
inp.txt: 2

The word "bar" occurs three times in the file, so there are two duplicates.

To aggregate across several files, simply pass all of them to awk:

$ printf 'foo\nbar\nthis\nbar\n' > inp1.txt
$ printf 'red\nblue\ngreen\nbar\n' > inp2.txt
$ awk '$0 in a {c++} {a[$0]} END {print c}' inp1.txt inp2.txt
2

In this case, "bar" appears twice in the first file and once in the second, three times in total, so we still have two duplicates.
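
When several files are passed at once but a separate count per file is still wanted, the array and counter can be reset whenever awk starts a new file. The following is a sketch under that assumption; `per_file_dups` is a made-up name, and like the loop with `[ -s "$file" ]` above it reports nothing for empty files.

```shell
# per_file_dups is a hypothetical helper name, not part of the answer above.
# Prints "<file>: <number of duplicate lines>" for each non-empty file,
# using a single awk invocation.
per_file_dups() {
    awk '
        # First line of every file except the first: report the file
        # that just ended, then reset the counter and empty the array.
        FNR == 1 && NR > 1 { printf "%s: %d\n", prev, c; c = 0; split("", a) }
        # Line already seen in the current file: count it as a duplicate.
        $0 in a { c++ }
        # Remember the line and the name of the file it came from.
        { a[$0]; prev = FILENAME }
        # Report the last file, provided any input was read at all.
        END { if (NR > 0) printf "%s: %d\n", prev, c }
    ' "$@"
}
```

With the two files from the example, `per_file_dups inp1.txt inp2.txt` reports 1 for inp1.txt ("bar" twice) and 0 for inp2.txt, because a line seen in an earlier file no longer counts against a later one.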

How do I count the duplicate lines in each file?

3 answers: