Question

I have a text file, and I am trying to extract the data from its first line (or row) and save it as a list in a new file, so that each item is on its own line.

Example data.txt:

Name  Col  Samp1  Samp2  Samp3  Samp4  Samp5  Samp6
Car1  Red   49.3   43.2   54.3   52.3   12.5   76.8
Car2  Blu   56.3   12.4   85.4   67.1   24.5   32.5
and so on..

I want a new list that looks like this, saved to a new file named samps.txt:

Samp1
Samp2
Samp3
Samp4
Samp5
Samp6

I am new to shell scripting and would appreciate any help anyone can offer.

Answer 1

Use read -a to read the line into an array, then use for to iterate over the array elements. See help (for example, help read) for details.
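The answer above describes the approach without code. A minimal sketch of it, assuming bash (read -a is a bash feature) and the data.txt layout from the question, might look like this; the variable names are illustrative only.

```shell
# Sample input copied from the question.
cat > data.txt <<'EOF'
Name  Col  Samp1  Samp2  Samp3  Samp4  Samp5  Samp6
Car1  Red   49.3   43.2   54.3   52.3   12.5   76.8
Car2  Blu   56.3   12.4   85.4   67.1   24.5   32.5
EOF

# read consumes only the first line; -a splits it on whitespace into an array.
read -r -a fields < data.txt

# Start at index 2 to skip the Name and Col headers, printing one header per line.
for field in "${fields[@]:2}"; do
  echo "$field"
done > samps.txt
```

Redirecting once after done opens samps.txt a single time, rather than appending on every iteration.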

Answer 2

This should do the trick:

$ head -1 data.txt | grep -o 'Samp[0-9]*'

Samp1
Samp2
Samp3
Samp4
Samp5
Samp6

Explanation

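Display the first line of the file: head -1 data.txt
| takes the output of the previous command and uses it as the input of the next command (this is called a pipe).
Print only the matches of the given regex: grep -o 'Samp[0-9]*'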

The regex 'Samp[0-9]*' matches any string that starts with Samp and is followed by zero or more digits.

To save the output to samps.txt, use the redirection operator >:

$ head -1 data.txt | grep -o 'Samp[0-9]*' > samps.txt

The following variant works for any column headers, not just those matching 'Samp[0-9]*':

$ head -1 data.txt | grep -o '\w*' | tail -n +3 > samps.txt

grep -o '\w*' matches words, and tail -n +3 displays all lines starting from the third (that is, it drops the first two column headers).
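Note that \w is not part of POSIX basic regular expressions, and it would likely split a header such as Samp-1 into two pieces. As a hedged alternative that is not part of the original answer, tr can split the header line on whitespace only; the sketch below assumes the data.txt layout from the question, with no leading whitespace on the first line.

```shell
# Sample input copied from the question.
cat > data.txt <<'EOF'
Name  Col  Samp1  Samp2  Samp3  Samp4  Samp5  Samp6
Car1  Red   49.3   43.2   54.3   52.3   12.5   76.8
EOF

# tr turns every whitespace character into a newline, and -s squeezes
# runs of newlines into one, so each header ends up on its own line.
# tail -n +3 then drops the first two headers (Name and Col).
head -n 1 data.txt | tr -s '[:space:]' '\n' | tail -n +3 > samps.txt
```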

Answer 3

Read the first line into a variable

read -r FIRSTLINE < filename

Split the string into words

WORDS=( $FIRSTLINE )

Loop over the words and write them to a file

for WORD in "${WORDS[@]}"
do
  echo "$WORD" >> outputfilename
done

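In your case, you want to drop the first two column values. You can slice the array in the for statement with "${WORDS[@]:2}". Alternatively, you can test each value inside the for loop before echoing it to the file.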
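The second option, testing each value inside the loop, could be sketched as follows. This assumes bash, the data.txt layout from the question, and that the two unwanted headers are literally Name and Col; the lowercase variable names are illustrative.

```shell
# Sample input copied from the question.
cat > data.txt <<'EOF'
Name  Col  Samp1  Samp2  Samp3  Samp4  Samp5  Samp6
Car1  Red   49.3   43.2   54.3   52.3   12.5   76.8
EOF

read -r first_line < data.txt          # first line only
read -r -a words <<< "$first_line"     # split on whitespace without globbing

: > samps.txt                          # start from an empty output file
for word in "${words[@]}"; do
  case $word in
    Name|Col) continue ;;              # skip the headers we do not want
  esac
  printf '%s\n' "$word" >> samps.txt
done
```

Filtering by name keeps working if the unwanted columns move, whereas slicing with :2 depends on their position.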

Answer 4

When working with text files that contain fields, you may find awk to be a valuable tool:

awk 'NR==1 { for(i=3;i<=NF;i++) print $i }' file

Result:

Samp1
Samp2
Samp3
Samp4
Samp5
Samp6

Explanation:

NR is the number of the current record (by default, the current line number), so NR==1 selects the first line.
NF is the number of fields in the current record.
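To see NR and NF in action and to write the result to samps.txt as the question asks, a small sketch follows. It assumes the data.txt layout from the question; the added exit is an optional tweak, not part of the original answer, that stops awk after the first line.

```shell
# Sample input copied from the question.
cat > data.txt <<'EOF'
Name  Col  Samp1  Samp2  Samp3  Samp4  Samp5  Samp6
Car1  Red   49.3   43.2   54.3   52.3   12.5   76.8
Car2  Blu   56.3   12.4   85.4   67.1   24.5   32.5
EOF

# Print the record number and field count for every line.
# Each line of this sample has 8 whitespace-separated fields.
awk '{ print NR, NF }' data.txt

# On the first record only, print fields 3..NF one per line, then stop reading.
awk 'NR == 1 { for (i = 3; i <= NF; i++) print $i; exit }' data.txt > samps.txt
```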

Answer 5

Using only bash:

set -- $(head -1 data.txt)       # save the words in the first line as $1,$2,...
shift 2                          # discard the first two words
printf '%s\n' "$@" > samps.txt   # print each remaining word on its own line
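One thing to keep in mind is that set -- replaces the script's own positional parameters. A possible way around this, sketched here under the assumption that the snippet lives inside a larger script, is to wrap it in a function, since a function has its own $1, $2, and so on. The function name print_sample_headers is hypothetical.

```shell
# Sample input copied from the question.
cat > data.txt <<'EOF'
Name  Col  Samp1  Samp2  Samp3  Samp4  Samp5  Samp6
Car1  Red   49.3   43.2   54.3   52.3   12.5   76.8
EOF

# Hypothetical helper: prints every header after the first two, one per line.
print_sample_headers() {
  # Inside a function, set -- replaces only the function's own parameters.
  # The unquoted substitution is split on whitespace (and is subject to
  # globbing, so this assumes headers contain no wildcard characters).
  set -- $(head -n 1 "$1")
  shift 2                      # discard the first two words
  printf '%s\n' "$@"
}

set -- keep these args         # stand-in for the script's real arguments
print_sample_headers data.txt > samps.txt
echo "caller still has $# arguments"
```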

Answer 6

I upvoted Ignacio Vazquez-Abrams's answer because it is the best option, using only pure bash. Since he did not provide a complete example, here is one:

read -a samps < "myfile.txt"
printf "%s\n" "${samps[@]:2}"

Output:

Samp1
Samp2
Samp3
Samp4
Samp5
Samp6

Saving the first line - Linux Shell Scripting
