Question

I am using Mac OS X Lion.

I have a folder, LITERATURE, with the following structure:

LITERATURE > Y > YATES, DORNFORD > THE BROTHER OF DAPHNE:
  Chapters 01-05.txt
  Chapters 06-10.txt
  Chapters 11-end.txt

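I want to recursively concatenate the chapters that are split across multiple files (not every book is split). I then want to write the concatenated file to its parent's parent directory. The concatenated file should have the same name as its parent directory.

For example, after running the script on the folder structure shown above, I should get the following.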

LITERATURE > Y > YATES, DORNFORD:
  THE BROTHER OF DAPHNE.txt
  THE BROTHER OF DAPHNE:
    Chapters 01-05.txt
    Chapters 06-10.txt
    Chapters 11-end.txt

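In this example, the parent directory is THE BROTHER OF DAPHNE, and the parent's parent directory is YATES, DORNFORD.

[Update, March 6: rephrased the question/answer so that it is easier to find and understand.]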

Answer 1

It's not clear what you mean by "recursively", but this should be enough to get you started.

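#!/bin/bash

# Requires Bash 4 or later for the ${var,,} and ${var^} case conversions.
titlecase () {  # adapted from http://stackoverflow.com/a/6969886/874188
    local arr
    arr=("${@,,}")      # lower-case every word
    echo "${arr[@]^}"   # capitalize the first letter of each word
}

for book in LITERATURE/?/*/*; do
    # Deliberately unquoted: the title is split into words for titlecase.
    title=$(titlecase ${book##*/})
    # Match only *.txt so the output file is not read as one of its own inputs.
    for file in "$book"/*.txt; do
        cat "$file"
        echo
    done >"$book/$title"
    echo '# not doing this:' rm "$book"/*.txt
done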

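This loops over LITERATURE/initial/author/BOOK TITLE and creates a file Book Title (should a space be added anywhere?) from the concatenated files in each book directory. (I would generate it in the parent directory and then remove the book directory completely, assuming it no longer contains anything of value.) There is no recursion, just a loop over this directory structure.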
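A minimal sketch of the parent-directory variant mentioned in the parenthesis, assuming each book directory holds only .txt chapter files whose names sort in reading order. The function name concat_books is made up for illustration, and it keeps the directory's original name instead of title-casing it.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original answer): for every
# ROOT/initial/author/BOOK directory, write ROOT/initial/author/BOOK.txt.
concat_books () {
    local root=$1 book file
    for book in "$root"/?/*/*/; do
        book=${book%/}                  # strip the trailing slash left by the glob
        [ -d "$book" ] || continue      # the glob matched nothing
        set -- "$book"/*.txt
        [ -e "$1" ] || continue         # no chapter files in this directory
        for file in "$@"; do
            cat "$file"
            echo                        # blank line between chapters
        done >"$book.txt"               # lands next to the book directory
    done
}
```

It would be called as concat_books LITERATURE. Because the output sits outside the book directory, it can never be picked up as one of its own inputs.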

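Removing the chapter files is a bit risky, so I'm not doing it here. You can remove the echo prefix from the line after the first done to enable it.

If your book titles contain an asterisk or some other shell metacharacter, this becomes rather more complex: the title assignment assumes the book title can be used unquoted.

Only the parameter expansion with case conversion goes beyond the very basics of Bash. The array operations may also be a bit scary if you are a complete beginner. Understanding quoting properly is often a challenge for newcomers, too.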
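The ${var,,} and ${var^} expansions need Bash 4 or later, whereas Mac OS X Lion ships Bash 3.2. One possible substitute, sketched here with awk, takes the title as a single quoted argument, so asterisks and other metacharacters are not expanded. The name titlecase_portable is an assumption, not part of the script above.

```shell
#!/bin/bash
# Hypothetical replacement for titlecase: lower-cases the argument and
# capitalizes the first letter of each whitespace-separated word.
titlecase_portable () {
    printf '%s\n' "$1" | awk '{
        n = split(tolower($0), words, " ")
        out = ""
        for (i = 1; i <= n; i++) {
            w = toupper(substr(words[i], 1, 1)) substr(words[i], 2)
            out = out (i > 1 ? " " : "") w
        }
        print out
    }'
}
```

In the loop it would be used as title=$(titlecase_portable "${book##*/}"). Like the unquoted original, it collapses runs of spaces into one.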

Answer 2

cat Chapters*.txt > FinaleFile.txt.raw
Chapters="$( ls -1 Chapters*.txt | sed -n 'H;${x;s/\
//g;s/ *Chapters //g;s/\.txt/ /g;s/ *$//p;}' )"
mv FinaleFile.txt.raw "FinaleFile ${Chapters}.txt"

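cat all the txt files (assuming the names sort in the right order)
take the chapter numbers/refs from an ls of the folder and use sed to adapt the format
rename the concatenated file so that it includes the chapters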
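Parsing ls output can break on unusual file names. As an alternative sketch, the same chapter label can be built with globbing and parameter expansion only. The function name chapter_label is hypothetical, and it assumes the files are named "Chapters <range>.txt" as in the question.

```shell
#!/bin/bash
# Hypothetical helper: prints e.g. "01-05 06-10 11-end" for the Chapters
# files in the current directory.
chapter_label () {
    local f label=
    for f in Chapters\ *.txt; do
        [ -e "$f" ] || continue     # the pattern matched nothing
        f=${f#Chapters }            # drop the leading "Chapters "
        f=${f%.txt}                 # drop the extension
        label="${label:+$label }$f" # join the ranges with single spaces
    done
    printf '%s\n' "$label"
}
```

With it, the three commands shrink to one: cat Chapters*.txt > "FinaleFile $(chapter_label).txt".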

Answer 3

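Thank you for all your input. It got me thinking, and I managed to concatenate the files using the following steps:

This script replaces the spaces in file names with underscores.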

#!/bin/bash

# We are going to iterate through the directory tree, up to a maximum depth of 20.
for i in `seq 1 20`
  do

# In UNIX based systems, files and directories are the same (Everything is a File!).
# The 'find' command lists all files which contain spaces in its name. The | (pipe) …
# … forwards the list to a 'while' loop that iterates through each file in the list.
    find . -maxdepth "$i" -name '* *' | while IFS= read -r file
    do

# Here, we use 'sed' to replace spaces in the filename with underscores.
# The 'echo' prints a message to the console before renaming the file using 'mv'.
      item=`echo "$file" | sed 's/ /_/g'`
      echo "Renaming '$file' to '$item'"
      mv "$file" "$item"
    done
done
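The repeated passes are needed because renaming a directory invalidates the paths below it. A single-pass sketch is possible with find -depth, which visits a directory's contents before the directory itself. The function name underscore_names is made up, and -mindepth is assumed to be available (it is in both BSD and GNU find).

```shell
#!/bin/bash
# Hypothetical helper: renames everything under ROOT whose name contains
# a space, deepest entries first, so parent paths stay valid while renaming.
underscore_names () {
    find "$1" -mindepth 1 -depth -name '* *' -exec sh -c '
        for path do
            dir=${path%/*}      # everything before the last slash
            base=${path##*/}    # the name itself
            new=$(printf "%s" "$base" | tr " " "_")
            mv "$path" "$dir/$new"
        done
    ' sh {} +
}
```

Calling underscore_names . has the same effect as the loop above, with no depth limit. Like the original, it overwrites an existing entry that already has the underscored name.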

This script concatenates the text files whose names start with Part, Chapter, Section, or Book.

#!/bin/bash

# Here, we go through all the directories (up to a depth of 20).
for D in `find . -maxdepth 20 -type d`
do

# Check if the parent directory contains any files of interest.
    if ls $D/Part*.txt &>/dev/null ||
       ls $D/Chapter*.txt &>/dev/null ||
       ls $D/Section*.txt &>/dev/null ||
       ls $D/Book*.txt &>/dev/null
      then

# If we get here, then there are split files in the directory; we will concatenate them.
# First, we trim the full directory path ($D) so that we are left with the path to the …
# … files' parent's parent directory—We will write the concatenated file here. (✝)
        ppdir="$(dirname "$D")"

# Here, we concatenate the files using 'cat'. The 'basename' command extracts the name of …
# … the parent directory from the full directory path ($D) and gives us the filename.
# Finally, we write the concatenated file to its parent's parent directory. (✝)
        cat "$D"/*.txt > "$ppdir/$(basename "$D").txt"
    fi
done
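If renaming is not wanted, the same idea can be sketched with NUL-separated find output and quoted paths, so names with spaces work as they are. This is an untested-on-Lion sketch; the function name concat_split_files is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: for each directory under ROOT that holds split files,
# concatenate its *.txt files into <parent's parent>/<parent name>.txt.
concat_split_files () {
    local dir f found
    find "$1" -mindepth 1 -type d -print0 |
    while IFS= read -r -d '' dir; do
        found=
        # Unmatched patterns stay literal, so test that each result is a real file.
        for f in "$dir"/Part*.txt "$dir"/Chapter*.txt "$dir"/Section*.txt "$dir"/Book*.txt; do
            if [ -f "$f" ]; then found=yes; break; fi
        done
        [ -n "$found" ] || continue
        cat "$dir"/*.txt > "$(dirname "$dir")/$(basename "$dir").txt"
    done
}
```

It would be run as concat_split_files . from the top of the tree, in place of both scripts above.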

Now we delete all the files that were concatenated, so that their parent directories are left empty.
- find . -name 'Part*' -delete
- find . -name 'Chapter*' -delete
- find . -name 'Section*' -delete
- find . -name 'Book*' -delete

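The following command deletes the empty directories. (✝) We wrote the concatenated file to its parent's parent directory so that the parent directory is left empty once all the split files have been deleted.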
- find . -type d -empty -delete
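Since -delete cannot be undone, it may be worth listing the matches first. A small sketch, with a made-up function name, that prints what the four name-based commands would remove. It adds -type f, which is an assumption: the commands above have no type test and would also try to delete matching directories.

```shell
#!/bin/bash
# Hypothetical helper: lists the split files under ROOT that the cleanup
# commands would remove, without deleting anything.
preview_cleanup () {
    local pattern
    for pattern in 'Part*' 'Chapter*' 'Section*' 'Book*'; do
        find "$1" -type f -name "$pattern" -print
    done
}
```

Run preview_cleanup . and check the list before running the -delete commands.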

[Update, March 6: rephrased the question/answer so that it is easier to find and understand.]

Answer 4

The shell doesn't like spaces in names. However, over the years, Unix has come up with some tricks that help:

$ find . -name "Chapters*.txt" -type f -print0 | xargs -0 cat >> final_file.txt

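This may do what you want.

find recursively finds all directory entries in the file tree that match the query (in this case, the type must be a file and the name must match the pattern Chapters*.txt).

Normally, find separates the directory entry names with NL, but -print0 tells it to separate them with the NUL character instead. NL is a valid character in a file name, but NUL is not.

The xargs command takes the output of find and processes it. xargs gathers all the names and passes them in bulk to the command you give it, in this case cat.

Normally, xargs separates file names by whitespace, which means Chapters would be one file and 01-05.txt would be another. However, -0 tells xargs to use NUL as the separator, which is exactly what -print0 produces.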
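The difference is easy to see by counting the arguments that xargs hands on. The two helper names below are invented for this demonstration; both run the same find and differ only in the separator.

```shell
#!/bin/bash
# Hypothetical demo helpers: print how many arguments xargs passes to the
# command for the same set of files.
count_plain () {
    # Whitespace-separated: a name containing a space is split in two.
    find "$1" -name 'Chapters*.txt' -type f -print | xargs sh -c 'echo "$#"' sh
}
count_nul () {
    # NUL-separated: each file name arrives as exactly one argument.
    find "$1" -name 'Chapters*.txt' -type f -print0 | xargs -0 sh -c 'echo "$#"' sh
}
```

In a directory that contains only "Chapters 01-05.txt", count_plain . reports two arguments and count_nul . reports one.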

Recursively concatenate (join) and rename text files in a directory tree