Question

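I have a folder named foo. foo contains other folders, which may in turn contain subfolders and text files. I want to find every file whose name begins with year, read its Nth line, and print that line to a new file. For example, foo has a file named year1, and the subfolders have files named year2, year3, and so on. The program would print the 1st line of year1 to a file called writeout, then the 2nd line of year2 to writeout, and so on.

I also don't really understand how to do a for loop over files.

So far, I have: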

#!/bin/bash

for year* in ~/foo
do
  Here I tried writing some code using the sed command but I can't think of something else.
done

I also get a message in the terminal saying that `year*' is not a valid identifier. Any ideas?

Answer 1

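Sed can help you.

Recall that sed normally processes every line in a file and prints each line as it goes.

You can turn that default printing off with -n, and print only the lines of interest by matching a pattern or a line number.

So, to print the 2nd line of file2, you can say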

sed -n '2p' file2 > newFile2

To print the 2nd line and then stop processing, add the q (for quit) command (you also need braces to group the two commands together), i.e.

sed -n '2{p;q;}' file2 > newFile2

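(If you are processing large files, this can save a lot of time.)

To make it more general, you can replace the number with a variable that holds a number, i.e.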
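To see what -n changes, here is a minimal sketch. The sample file and its contents are made up for illustration; they are not from the question.

```shell
#!/bin/bash
# Hypothetical three-line sample file in a temporary directory.
workdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$workdir/sample"

# Without -n, sed prints every line, and the p command prints line 2 again.
sed '2p' "$workdir/sample"        # prints a, b, b, c (one per line)

# With -n, only what p explicitly prints is shown; q then stops reading.
sed -n '2{p;q;}' "$workdir/sample"   # prints b
```
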

  lineNo=3
  sed -n "${lineNo}{p;q;}" file3 > newFile3

If you want all of your sliced lines to go into one file, then use the shell's append redirection (>>), i.e.

 for lineNo in 1 2 3 4 5 ; do
     sed -n  "${lineNo}{p;q;}" file${lineNo} >> aggregateFile
 done
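A variation that may be worth knowing: the redirection can also be attached to the loop as a whole, so the output file is opened once and a plain > is enough. The sketch below assumes made-up sample files file1 to file3 in a temporary directory.

```shell
#!/bin/bash
# Hypothetical sample files file1..file3, three lines each.
workdir=$(mktemp -d)
for n in 1 2 3; do
    printf 'file%s line1\nfile%s line2\nfile%s line3\n' "$n" "$n" "$n" > "$workdir/file$n"
done

# Redirect the output of the entire loop; each sed call writes to the same open file.
for lineNo in 1 2 3; do
    sed -n "${lineNo}{p;q;}" "$workdir/file${lineNo}"
done > "$workdir/aggregateFile"

cat "$workdir/aggregateFile"
```

With >> inside the loop, rerunning the script keeps adding to an existing aggregateFile; with > on done, each run starts from an empty file.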

The other posts, which use the results of find ... to drive your list of files, are an excellent approach.

I hope this helps.

Answer 2

Here is one way:

awk "NR==$YEAR" "$file"
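A slightly more defensive sketch of the same idea passes the line number in with -v instead of expanding it inside the awk program, and exits once the line has been printed. The sample file and the value of YEAR below are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical sample file; YEAR holds the wanted line number.
workdir=$(mktemp -d)
file="$workdir/year2"
printf 'alpha\nbeta\ngamma\n' > "$file"
YEAR=2

# n is an awk variable set from the shell; exit avoids reading the rest of the file.
awk -v n="$YEAR" 'NR == n { print; exit }' "$file"   # prints beta
```
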

Answer 3

Use find to locate the files you want, and then sed to extract what you want:

find foo -type f -name 'year*' |
while read file; do
    line=$(echo $file | sed 's/.*year\([0-9]*\)$/\1/')
    sed -n -e "$line {p; q}" $file
done

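This approach:

Uses find to generate a list of files whose names start with the string "year".
Pipes the list of files to a while loop to avoid long command lines.
Uses sed to extract the desired line number from the file name.
Uses sed to print only the desired line and then quit immediately. (You can leave out the q and just write ${line}p and it will work, but it may be less efficient if $file is large, because sed then reads the rest of the file. Also, q may not be fully portable to all versions of sed.)

It will not work properly for files with spaces in their names, though.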
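For names with spaces, one possible workaround is sketched below. It assumes bash (for read -d '' and [[ ]]) and a find that supports -print0, and it swaps the sed-based number extraction for shell parameter expansion. The directory layout and file contents are invented for the demo.

```shell
#!/bin/bash
# Hypothetical tree: foo/year1 and "foo/sub dir/year2" (note the space).
workdir=$(mktemp -d)
mkdir -p "$workdir/foo/sub dir"
printf 'y1-l1\ny1-l2\n' > "$workdir/foo/year1"
printf 'y2-l1\ny2-l2\ny2-l3\n' > "$workdir/foo/sub dir/year2"

out="$workdir/writeout"
: > "$out"    # start with an empty output file

# NUL-delimited names survive spaces (and even newlines) in paths.
find "$workdir/foo" -type f -name 'year*' -print0 |
while IFS= read -r -d '' file; do
    name=${file##*/}      # strip the directory part
    line=${name#year}     # strip the "year" prefix, leaving the number
    [[ $line =~ ^[1-9][0-9]*$ ]] || continue   # skip names without a usable number
    sed -n "${line}{p;q;}" "$file" >> "$out"
done

# find does not guarantee an order, so sort for a stable display.
sort "$out"
```
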

Answer 4

The best way, which always works as long as you provide 2 arguments:

$ touch myfile
$ touch mycommand
$ chmod +x mycommand
$ touch yearfiles
$ find / -type f -name 'year*' >> yearfiles
$ nano mycommand
$ touch foo

Type in:

#!/bin/bash
# $1 is the line number, $2 is the file to read from
head -n "$1" "$2" >> myfile
tail -n 1 myfile >> foo

Use ^X, y, and then Enter to save. Then run mycommand:

$ ./mycommand 2 yearfiles
$ cat foo
year2

Assuming your year files are:

year1, year2, year3

Also, now that you have it set up, from now on you only need to run $ ./mycommand LINENUMBER FILENAME.

Answer 5

Your task has two subtasks: find the names of all the year files, and then extract the Nth line. Consider the following script:

for file in `find foo -name 'year*'`; do
     YEAR=`echo $file | sed -e 's/.*year\([0-9]*\)$/\1/'`
     head -n $YEAR $file | tail -n 1
done

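The find call locates the matching files in the directory foo. The second line extracts only the number at the end of the file name. The third line then takes the first N lines of the file and keeps only the last of those first N lines (read: only the Nth line).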
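One edge case worth checking with head and tail: if the file has fewer than N lines, the pipeline still prints the last line it has, whereas sed prints nothing. A small sketch, using a made-up three-line file named year5:

```shell
#!/bin/bash
# Hypothetical file with only 3 lines, although its name asks for line 5.
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/year5"

# head returns all 3 lines, so tail reports line 3, not a 5th line.
head -n 5 "$workdir/year5" | tail -n 1   # prints three

# sed never reaches line 5, so nothing is printed.
sed -n '5{p;q;}' "$workdir/year5"
```
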

Answer 6

1. time head -5 emp.lst | tail -1
Execution time:
real 0m0.004s
user 0m0.001s
sys 0m0.001s

or

2. awk 'NR==5' emp.lst
Execution time:
real 0m0.003s
user 0m0.000s
sys 0m0.002s

or

3. sed -n '5p' emp.lst
Execution time:
real 0m0.001s
user 0m0.000s
sys 0m0.001s

or

4. Using a neat trick, we can also get this with the cut command:
cut -d "
" -f 5 emp.lst
# After typing -d", press Enter; this makes the delimiter a newline.
Execution time:
real 0m0.001s

Answer 7

Here you go:

sed "${index}q;d" "${input_file}" > "${output_file}"
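How this one-liner works, as far as I can tell: d deletes every line before the target without printing it, and on the target line q quits, at which point sed prints the current line by default. A minimal sketch, with a made-up input file and index:

```shell
#!/bin/bash
# Hypothetical input file and line number.
workdir=$(mktemp -d)
input_file="$workdir/input.txt"
output_file="$workdir/output.txt"
printf 'red\ngreen\nblue\n' > "$input_file"
index=2

# Line 1: "2q" does not apply, so d discards it. Line 2: q quits and prints it.
sed "${index}q;d" "$input_file" > "$output_file"

cat "$output_file"   # prints green
```
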

How to read the Nth line of a file and print it to a new file?