Question

I have .txt files that contain 4-digit numbers.

Sometimes a file contains a single 4-digit number, sometimes several, and sometimes it is empty.

example1.txt:

6304
6204

example2.txt:

example3.txt:

example4.txt:

6300
6204
6301

example5.txt:

6302
6234
6345

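What I need to do is check whether the numbers in the example files appear in a list of numbers that I keep in another text file.

The list looks like this (but with many more numbers):

For 'example1.txt':

* The number '6204' should be removed from the file (because it is not in the list).
* The number '6304' must stay in the example file (it is in the list).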

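For 'example2.txt':

* The number should be removed, and the file should end up empty.

For 'example3.txt':

* The number stays in the example file.

For 'example4.txt':

* There are multiple matches in the example file, so everything should be removed.

For 'example5.txt':

* Only 6302 should remain in the file. The other two should be removed because they are not in the list.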

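So basically I want to keep only the files with exactly one match. Those files should contain only the number that matches the list. If there is more than one match, the file should be emptied. If there is no match, the file should be emptied as well.

In addition, I would like to do this with an sh script.

Now my question is:

Is this even possible? How? Or do I need to use a database and another programming language?

Thanks in advance.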

Answer 1

This is certainly not the fastest solution, but it works:

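# Read the list line by line; -r keeps backslashes literal
while IFS= read -r line
do
    # Delete every occurrence of this list entry from the checked file
    sed -i "s/$line//" example1.txt
done < list_textfile.txt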

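It removes every occurrence of each string listed in list_textfile.txt from the file being checked.

Update: this is not what was asked. The loop above filters out the strings in list_textfile.txt instead of keeping them.

This should do the right thing: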

grep -o -f list_textfile.txt example1.txt

-o makes sure that only the matching part of each line is printed
-f lets you specify a file containing the patterns to grep for
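
The grep command above only prints the matches; it does not change example1.txt. A minimal sketch of writing the result back could look like the following. It assumes one number per line in both files. The function name keep_listed is hypothetical, and the -F and -x options are an addition here (not part of the answer above) so that each list entry is compared as a fixed string against the whole line. mktemp is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: keep only the lines of FILE that also appear in LIST.
# Usage: keep_listed FILE LIST
keep_listed() {
    file=$1
    list=$2
    status=0
    tmp=$(mktemp) || return 1
    # -F: fixed strings, -x: whole-line match, -f: read patterns from LIST.
    # grep exits 1 when nothing matches; that just leaves the temp file empty.
    grep -F -x -f "$list" "$file" > "$tmp" || status=$?
    if [ "$status" -gt 1 ]; then
        # A real grep error (for example a missing file): leave FILE untouched.
        rm -f "$tmp"
        return "$status"
    fi
    # Copy back with cat so FILE keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Calling keep_listed example1.txt list_textfile.txt would then leave only the listed numbers in example1.txt. The temporary file is there because redirecting grep's output straight into the file it is reading would truncate it first.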

Answer 2

I think I have understood your logic now. I am assuming your list is stored in the file list.txt; save the following as marksscript:

#!/bin/bash
#
# First count total number of matches and store in variable MATCHES
#
MATCHES=0
while IFS= read -r WORD
do
   # Count list lines equal to this word (-x: whole line, -F: fixed string)
   N=$(grep -c -x -F -- "$WORD" list.txt)
   [ "$N" -eq 1 ] && MATCHEDWORD=$WORD
   echo "DEBUG: $WORD $N"
   ((MATCHES+=N))
done < "$1"

#
# Now we know total number of matches, decide what to do
#
echo DEBUG: Total matches $MATCHES

if [ $MATCHES -ne 1 ]; then
    echo DEBUG: Zero out file - not exactly ONE match
    > "$1"
else
    echo "DEBUG: $MATCHEDWORD remains as singleton match"
    echo "$MATCHEDWORD" > "$1"
fi

Run it like this:

chmod +x marksscript
./marksscript example1.txt
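
Since the question involves several example files, the script would probably be run once per file. A small sketch of such a wrapper follows; the name run_on_all is hypothetical, and it assumes the script takes the file to process as its only argument, as marksscript does.

```shell
#!/bin/sh
# Hypothetical wrapper: run a per-file command on every file given.
# Usage: run_on_all COMMAND FILE...
run_on_all() {
    script=$1
    shift
    for f in "$@"; do
        # Skip anything that is not a regular file, e.g. a glob
        # pattern that matched nothing and was passed through as-is.
        [ -f "$f" ] || continue
        "$script" "$f"
    done
}
```

For example, run_on_all ./marksscript example*.txt would process every example file in the current directory.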

Output

./go example1
DEBUG: 6204 0
DEBUG: 6304 1
DEBUG: Total matches 1
DEBUG: 6304 remains as singleton match

./go example2
DEBUG: Total matches 0
DEBUG: Zero out file - not exactly ONE match

./go example3
DEBUG: 6305 1
DEBUG: Total matches 1
DEBUG: 6305 remains as singleton match

./go example4
DEBUG: 6300 1
DEBUG: 6204 0
DEBUG: 6301 1
DEBUG: Total matches 2
DEBUG: Zero out file - not exactly ONE match
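
Because the question asks for an sh script, the same rule (keep the file only when exactly one of its numbers is in the list) can also be sketched in plain POSIX sh without the bash-only ((...)) arithmetic. This is a sketch under assumptions, not a drop-in replacement: the function name keep_single_match is hypothetical, both files are assumed to hold one number per line, and a listed number that appears twice in the same file counts as two matches.

```shell
#!/bin/sh
# Hypothetical helper: keep FILE only if exactly one of its lines is in LIST.
# Usage: keep_single_match FILE LIST
keep_single_match() {
    file=$1
    list=$2
    [ -f "$file" ] || return 1
    # -c counts matching lines; grep exits non-zero when the count is 0,
    # so fall back to 0 explicitly.
    count=$(grep -c -F -x -f "$list" "$file") || count=0
    if [ "$count" -eq 1 ]; then
        # Exactly one match: rewrite the file so it holds only that number.
        match=$(grep -F -x -f "$list" "$file")
        printf '%s\n' "$match" > "$file"
    else
        # Zero or several matches: empty the file.
        : > "$file"
    fi
}
```

A call such as keep_single_match example5.txt list.txt would leave a single listed number in the file, or empty it otherwise. Using -F -x compares whole lines as fixed strings, which avoids a list entry matching inside a longer number.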

Comparing the numbers in a text file against a list of numbers in another text file
