Question

I have two files:

file1.txt
variable1=string1 with spaces characters
variable2=string2 with spaces characters
variable3=string3 with spaces characters

file2.txt
sometext1 textvariable1 sometext2 sometext3
variable2 sometext4 sometextcharactersvariable1 charactersvariable3 sometext5 variable2
....................

I want every variable in file2.txt to be replaced with its value from file1.txt.

I tried several grep and awk commands, as well as Python code that reads each word, compares it, and replaces it:

fgrep -w -o -f "file1.txt" "file2.txt"
awk '/PATTERN/{system("cat file1");next}1' file2

But this only works for one variable. I know I have to loop over every word in file2, compare it against file1, and then replace it, but I'm not sure how.

Expected output:

sometext1 string1 sometext2 sometext3
string2 sometext4 string1 string3 sometext5 string2
................

Answer 1

Here is one way to do what you want using Python. It was tested with both Python 2 and Python 3.

import sys
import re

with open('file1.txt') as fp:
    subs = dict(l.strip().split('=', 1) for l in fp)

with open('file2.txt') as fp:
    text = fp.read()

text = re.sub('|'.join(subs), lambda m: subs[m.group(0)], text)

sys.stdout.write(text)

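Notes:

Note the use of fp.read() to read the whole text file at once. If that would consume too much memory, you can loop with for line in fp: instead.
Note the use of dict() with a generator expression to create the subs dictionary. Essentially, it splits each line at the first = into its variable and string components and builds a dictionary from the pairs.
Note the use of '|'.join() to create the regular expression. Joining the dictionary's keys yields 'variable1|variable2|variable3'. If file1.txt is uncontrolled input, you may run into trouble: what happens if a user puts .*=bar in the input file?
Note the use of a callable as the second argument to re.sub(). This allows arbitrary code to run for each match; in particular, it allows a lookup in the subs dictionary.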
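
The last two concerns (memory use and uncontrolled input) could be addressed along the following lines. This is only a sketch under some assumptions, not part of the tested answer above: the helper names load_subs, build_pattern and substitute are made up for illustration, keys are passed through re.escape() so that an entry such as .*=bar matches only the literal text .*, and longer keys are tried first so that a hypothetical variable10 is not shadowed by variable1.

```python
import io
import re
import sys


def load_subs(lines):
    """Build a {variable: value} dict from 'name=value' lines."""
    subs = {}
    for line in lines:
        line = line.rstrip('\n')
        if '=' not in line:
            continue  # skip blank or malformed lines
        key, value = line.split('=', 1)  # split on the first '=' only
        subs[key] = value
    return subs


def build_pattern(subs):
    """Compile an alternation of the keys, escaped so they match literally."""
    # Longest keys first, so 'variable10' is tried before 'variable1'.
    keys = sorted(subs, key=len, reverse=True)
    return re.compile('|'.join(re.escape(key) for key in keys))


def substitute(lines, subs):
    """Yield each line with every known variable replaced by its value."""
    if not subs:
        # An empty alternation would match the empty string everywhere.
        for line in lines:
            yield line
        return
    pattern = build_pattern(subs)
    for line in lines:
        yield pattern.sub(lambda m: subs[m.group(0)], line)


if __name__ == '__main__':
    # In-memory stand-ins for file1.txt and file2.txt (illustrative data).
    file1 = io.StringIO('variable1=string1\nvariable2=string2\n')
    file2 = io.StringIO('sometext1 variable1 sometext2\nvariable2 sometext4\n')
    sys.stdout.writelines(substitute(file2, load_subs(file1)))
```

With real files, the same functions can be fed open file objects, for example substitute(open('file2.txt'), subs), which processes one line at a time instead of reading the whole file.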

Answer 2

An awk solution:

awk 'NR==FNR{ a[$1]=$2; next }
     { for(i=1;i<=NF;i++) if ($i in a) $i=a[$i] }1' FS='=' file1.txt FS=' ' file2.txt

Output:

sometext1 string1 sometext2 sometext3
string2 sometext4 string1 string3 sometext5 string2
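
For readers more comfortable with Python, roughly the same idea (replace a whitespace-separated field only when it is exactly a variable name) might look like the sketch below. It is not a line-for-line port of the awk program: the function names load_subs and replace_fields are hypothetical, values are split at the first = only, and every line is rejoined with single spaces.

```python
def load_subs(lines):
    """Build a {variable: value} dict from 'name=value' lines."""
    subs = {}
    for line in lines:
        key, sep, value = line.rstrip('\n').partition('=')
        if sep:  # ignore lines that contain no '='
            subs[key] = value
    return subs


def replace_fields(line, subs):
    """Replace fields that exactly equal a variable name; keep the rest."""
    fields = line.split()
    return ' '.join(subs.get(field, field) for field in fields)


if __name__ == '__main__':
    # Illustrative in-memory data instead of file1.txt and file2.txt.
    subs = load_subs(['variable1=string1\n', 'variable2=string2\n'])
    print(replace_fields('variable2 sometext4 variable1', subs))
```

Because whole fields are compared, a name embedded in a longer word (such as textvariable1) is left alone, unlike with the substring-based approaches.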

Answer 3

If the first file is small enough to fit comfortably in memory, you can read it into a dictionary and use the fileinput module: https://docs.python.org/3.6/library/fileinput.html?highlight=fileinput#fileinput

The following was tested in Python 3:

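import fileinput

# Read the name=value pairs from file1.txt into a dictionary.
with open('file1.txt', 'r') as file:
    my_map = {}
    for line in file:
        # Split on the first '=' only, so values may themselves contain '='.
        key, value = line.strip().split('=', 1)
        my_map[key] = value

# With inplace=True, everything printed is written back into file2.txt.
for line in fileinput.input('file2.txt', inplace=True):
    new_line = line
    for key, value in my_map.items():
        new_line = new_line.replace(key, value)
    # line already ends with a newline, so suppress the one print() adds.
    print(new_line, end='')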

Replace variables in a file with values from another file
