Question

I am trying to parse a huge text file, say 200 MB.

The text file contains some strings.

So my script looks like this:

while read line ; do
echo "$line"
done <textfile

But with the approach above, my string " 12345" gets truncated to "12345".

I tried using

sed -n "$i"p textfile

but throughput dropped from 27 lines per second to 0.2 lines per second, which is unacceptable ;-)

Any ideas how to solve this?

Answer 1

You want to echo the lines with the field separator (IFS) set to empty, so read does not trim them:

while IFS="" read line; do
    echo "$line"
done <<< " 12345"

If you also want to skip the interpretation of special characters (backslash escapes), use

while IFS="" read -r line; do
    echo "$line"
done <<< " 12345"

You can also write the IFS assignment without the double quotes:

while IFS= read -r line; do
    echo "$line"
done <<< " 12345"
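To make the effect of the empty IFS visible, here is a minimal sketch that reads the same line twice, once with the default IFS and once with IFS cleared. The sample string and the variable names trimmed and intact are made up for the demo.

```shell
#!/bin/bash
# Sketch: compare default word splitting with an empty IFS.
# The sample input and variable names are illustrative only.
input="   12345  "

# Default IFS: read strips leading and trailing spaces/tabs.
read -r trimmed <<< "$input"

# Empty IFS for this one command only: the line is stored unchanged.
IFS= read -r intact <<< "$input"

printf '[%s]\n' "$trimmed"   # prints [12345]
printf '[%s]\n' "$intact"    # prints [   12345  ]
```

Because the assignment is a prefix to read, IFS is only changed for that single command and the rest of the script keeps its normal splitting behavior.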

Answer 2

This seems to be what you are looking for:

while IFS= read line; do
echo "$line"
done < textfile

The safest approach is to use read -r instead of plain read; -r stops read from interpreting backslashes as escape characters (thanks Walter A):

while IFS= read -r line; do
echo "$line"
done < textfile
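A small sketch of what -r changes, assuming a line that contains backslashes. The sample value and the names line_plain and line_raw are invented for illustration.

```shell
#!/bin/bash
# Sketch: effect of -r on a line containing backslashes.
# The sample path is made up for illustration.
input='C:\temp\new'

# Without -r, read treats each backslash as an escape character and drops it.
IFS= read line_plain <<< "$input"

# With -r, backslashes are kept literally.
IFS= read -r line_raw <<< "$input"

printf '%s\n' "$line_plain"   # prints C:tempnew
printf '%s\n' "$line_raw"     # prints C:\temp\new
```

printf '%s\n' is used for the output here so that the result does not depend on how a particular echo handles backslashes.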

Answer 3

Option 1:

#!/bin/bash

# read whole file into array
readarray -t aMyArray < <(cat textfile)

# echo each line of the array
# this will preserve spaces
for i in "${aMyArray[@]}"; do echo "$i"; done

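readarray - reads lines from standard input
-t - omits the trailing newline of each line
aMyArray - the array that receives the lines
< <() - runs the command and redirects its stdout into readarray, which fills the array
cat textfile - the file to be stored in the variable
for i in "${aMyArray[@]}" - for each element in aMyArray
"" - needed to preserve the spaces within each element
${[@]} - references the array
do echo "$i"; - on each iteration, echo "$i"
"" - preserves the spaces in the variable
$i - equals each element of the array aMyArray in turn as the loop runs
done - closes the loop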
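As a rough check of the pieces listed above, the sketch below builds a small temporary file (a stand-in for textfile) and reads it with readarray. It redirects the file directly instead of going through cat, which should behave the same here. readarray needs bash 4 or later.

```shell
#!/bin/bash
# Sketch: readarray (bash 4+) reading straight from a file.
# The temporary file stands in for "textfile"; its contents are invented.
tmpfile=$(mktemp)
printf '%s\n' ' 12345' 'two  words' '' > "$tmpfile"

# -t drops the trailing newline of every line.
readarray -t aMyArray < "$tmpfile"

echo "lines read: ${#aMyArray[@]}"   # prints lines read: 3
for i in "${aMyArray[@]}"; do
    printf '[%s]\n' "$i"             # brackets make the spaces visible
done

rm -f "$tmpfile"
```

Keep in mind that this approach holds every line in memory at once, which may matter for a file of around 200 MB.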

Option 2:

To cope with larger files, you can do the following, which reduces the work done in the loop and speeds up processing.

#!/bin/bash

sSearchFile=textfile
sSearchStrings="1|2|3|space"

while IFS= read -r line; do

    echo "${line}"

done < <(grep -E "${sSearchStrings}" "${sSearchFile}")

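This greps the file before looping over it with the while command, so the loop only sees the matching lines (faster). Let me know how this works for you. Note that you can add multiple search strings to the $sSearchStrings variable, separated by "|".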
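A self-contained sketch of the same idea, using a temporary file with invented contents so it can be run as is. The array aMatches is only there to make the result easy to inspect.

```shell
#!/bin/bash
# Sketch: filter with grep first, then loop over the matching lines only.
# File contents and patterns are invented for the demo.
sSearchFile=$(mktemp)
printf '%s\n' ' 12345' 'no digits here' 'a space' '   3 leading' > "$sSearchFile"
sSearchStrings="1|2|3|space"

aMatches=()
while IFS= read -r line; do
    aMatches+=("$line")          # leading spaces survive thanks to IFS=
done < <(grep -E "${sSearchStrings}" "${sSearchFile}")

printf '[%s]\n' "${aMatches[@]}"
rm -f "$sSearchFile"
```

Only three of the four sample lines reach the loop, and the ones with leading spaces arrive unchanged.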

Option 3:

And a solution that reads the search criteria from a text file and combines it with everything above...

#!/bin/bash

# identify file containing search strings
sSearchStringsFile="searchstrings.file"
# start with an empty pattern so the file name does not end up in it
sSearchStrings=""

while IFS= read -r string; do

    # if $sSearchStrings is empty, start it with the first string;
    # otherwise append "|" and the next string
    if [[ -z $sSearchStrings ]]; then
        sSearchStrings="${string}"
    else
        sSearchStrings="${sSearchStrings}|${string}"
    fi

# read search criteria in from file
done <"${sSearchStringsFile}"

# identify file to be searched
sSearchFile="text.file"

while IFS= read -r line; do

    echo "${line}"

done < <(grep -E "${sSearchStrings}" "${sSearchFile}")
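As an alternative to joining the search strings with "|" by hand, grep can read one pattern per line from a file via its -f option. The sketch below swaps in that technique; the temporary files and their contents are made up for the demo.

```shell
#!/bin/bash
# Sketch: let grep read the patterns itself with -f (one pattern per line).
# Both temporary files and their contents are invented for illustration.
sPatternFile=$(mktemp)
sSearchFile=$(mktemp)
printf '%s\n' '1' 'space' > "$sPatternFile"
printf '%s\n' ' 12345' 'nothing' 'a space' > "$sSearchFile"

aMatches=()
while IFS= read -r line; do
    aMatches+=("$line")
done < <(grep -E -f "$sPatternFile" "$sSearchFile")

printf '[%s]\n' "${aMatches[@]}"
rm -f "$sPatternFile" "$sSearchFile"
```

One caveat worth checking in your own data: an empty line in the pattern file acts as an empty pattern, which matches every line.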

bash while loop "eats" my space characters
