Question

The files in the directory look like this:

A_1_email.txt
A_1_phone.txt
A_2_email.txt
A_2_phone.txt
B_1_email.txt
B_1_phone.txt
B_2_email.txt
B_2_phone.txt

What I want:
Merge A_1_email.txt with A_1_phone.txt, merge B_1_email.txt with B_1_phone.txt, and so on.
That is: if the leading fields of two file names match (A with A, 1 with 1), then merge those files.

How I tried to do it:

ls * | cut -d "_" -f 1-2 | sort | uniq -c | awk '{print $2}' > names &&
for name in $(cat names); do

This is where I got lost; I really don't know how to continue.

Answer 1

The following is based on @MichaelJ.Barber's answer (which had the great idea of using join), but with the specific intent of avoiding the dangerous practice of parsing the output of ls:

# Simple loop: avoids subshells, pipelines
for file in *_email.txt; do
    if [[ -r "$file" && -r "${file%_*}_phone.txt" ]]; then
        join "$file" "${file%_*}_phone.txt"
    fi
done
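
As a minimal sketch of how the loop above could save its results instead of printing them, the function below writes each joined pair to its own file. The function name merge_pairs and the output name <prefix>_merged.txt are assumptions made for illustration, not part of the original answer. Note that join expects both inputs to be sorted on the join field (the first field by default).

```shell
# Hypothetical helper: join each <prefix>_email.txt with <prefix>_phone.txt
# and write the result to <prefix>_merged.txt (an assumed naming scheme).
# The body runs in a subshell so its variables do not leak to the caller.
merge_pairs() (
    dir=${1:-.}
    for email in "$dir"/*_email.txt; do
        prefix=${email%_email.txt}
        phone=${prefix}_phone.txt
        # Skip an unmatched glob and any email file without a readable partner.
        [ -r "$email" ] && [ -r "$phone" ] || continue
        join "$email" "$phone" > "${prefix}_merged.txt"
    done
)
```

Calling merge_pairs with no argument works on the current directory; email files that have no matching phone file are silently skipped.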

Or:

##
# Use IFS and a function to avoid contaminating the global environment.
joinEmailPhone() {
    local IFS=$'\n'
    local -x LC_COLLATE=C # Ensure glob expansion sorting makes sense.
    # According to `man (1) bash`, globs expand sorted "alphabetically".
    # If we use LC_COLLATE=C, we don't need to sort again.
    # Use an awk test (!seen[$0]++) to ensure uniqueness and a parameter expansion instead of cut
    awk '!seen[$0]++{ printf("join %s_email.txt %s_phone.txt\n", $1, $1) }' <<< "${*%_*}" | sh
}
joinEmailPhone *

But chances are (again assuming LC_COLLATE=C) you can get away with:

printf 'join %s %s\n' * | sh
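
The one-liner works because printf reuses its format string until all arguments are consumed, so a sorted glob yields one join command per two file names. A dry-run sketch like the one below can help check the pairing before anything is executed. The function name preview_joins is hypothetical, and the approach only holds if the directory contains nothing but complete email/phone pairs.

```shell
# Dry run: print the generated commands instead of piping them to sh.
# Assumes the directory holds only complete <prefix>_email.txt /
# <prefix>_phone.txt pairs; any stray file would shift the pairing.
preview_joins() (
    cd "${1:-.}" || exit 1
    LC_ALL=C    # assumption: force byte-order sorting of the glob
    printf 'join %s %s\n' *
)
```

If the printed commands look right, the same output can be piped to sh as in the answer.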

Answer 2

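I am assuming the files all contain tab-separated name-value pairs, where the value is either an email address or a phone number. If that is not the case, pre-sort the files or make other modifications as needed.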

ls *_{email,phone}.txt |
  cut -d "_" -f1-2 | # could also do this with variable expansion
    sort -u | 
      awk '{ printf("join %s_email.txt %s_phone.txt\n", $1, $1) }' |
        sh

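This identifies the unique prefixes of the files and uses awk to generate the shell commands that join each pair; those commands are then piped to sh, which actually runs them.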

Answer 3

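In the given scenario, i.e. as long as the file path names contain no newline characters, you can use printf '%s\n' *_{email,phone}.txt | ... instead of ls *_... At least one fewer external command!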
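
A minimal sketch of that substitution, listing the unique prefixes with the printf builtin feeding the pipeline. The function name list_prefixes is hypothetical, and the two globs are spelled out instead of using brace expansion so that the sketch also runs in a plain POSIX sh. As noted, it assumes no file name contains a newline.

```shell
# Hypothetical helper: print each unique <letter>_<number> prefix once.
# printf is a shell builtin, so no ls process is needed.
list_prefixes() (
    cd "${1:-.}" || exit 1
    printf '%s\n' *_email.txt *_phone.txt |
        cut -d _ -f 1-2 |   # keep the first two underscore-separated fields
        sort -u             # drop the duplicate from each email/phone pair
)
```

The output can be fed to the same awk step that generates the join commands.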

Answer 4

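Use a for loop to iterate over the email files, and use the read command with an appropriate IFS value to split each file name into the necessary parts. Note that this does use one non-POSIX feature provided by bash, namely the here-string (<<<) that passes the value of $email to the read command.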

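for email in *_email.txt; do
    # The third variable absorbs the rest ("email.txt"); without it,
    # snd would receive everything after the first underscore.
    IFS=_ read -r fst snd _ <<< "$email"
    phone=${fst}_${snd}_phone.txt
    # merge $email and $phone
done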
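
For shells without here-strings, a here-document can play the same role. The sketch below is one possible way to fill in the merge step, not the answerer's own code: the function name merge_by_fields, the use of cat for merging, and the output name <first>_<second>_merged.txt are all assumptions. If the files share a sorted key column, join could be swapped in for cat.

```shell
# Hypothetical helper: split each email file name on "_" with read,
# then concatenate it with the matching phone file.
merge_by_fields() (
    dir=${1:-.}
    for email in "$dir"/*_email.txt; do
        [ -e "$email" ] || continue      # glob matched nothing
        base=${email##*/}                # strip the directory part
        # rest absorbs the trailing "email.txt" so snd holds only field 2.
        IFS=_ read -r fst snd rest <<EOF
$base
EOF
        phone=$dir/${fst}_${snd}_phone.txt
        [ -r "$phone" ] || continue      # no partner file: skip
        cat "$email" "$phone" > "$dir/${fst}_${snd}_merged.txt"
    done
)
```

The IFS=_ assignment applies only to the read command, so the shell's normal word splitting is unaffected afterwards.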

Loop through a list of files to merge based on name
