Replacing text between specific strings

Time: 2013-09-05 20:21:00

Tags: regex bash sed awk grep

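I am trying to write a script to help me with a linguistics experiment. In this experiment, subjects are shown a text phrase and have to read it word by word. For example, suppose I have the following phrase: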

The girl was upset with her boyfriend.

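I need to split this phrase into small parts so that only one part at a time is shown to the subject taking the experiment. The software that displays the phrase to the subject takes the following input: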

The ---- --- ----- ---- --- ----------
--- girl --- ----- ---- --- ----------
--- ---- was ----- ---- --- ----------
--- ---- --- upset ---- --- ----------
--- ---- --- ----- with --- ----------
--- ---- --- ----- ---- her ----------
--- ---- --- ----- ---- --- boyfriend.

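Note that the full phrase is never the input. I need to feed the small parts to the software so that it can display the phrase on the computer screen. In addition, every word that is not shown on screen must be replaced with dashes, using as many dashes as the original word has characters.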

I was thinking of using one of the usual bash tools, such as sed, grep or awk, to solve my problem. For example, I could write the original phrase as

The | girl | was | upset | with | her | boyfriend.

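copy it seven times and, in each copy, replace the words I do not need with dashes. Note that the words always sit between two "|" characters, which makes them easy to identify.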

(In fact, sometimes the unit I need to replace is more than a single word. For example, I might replace "the girl" in one go.)

Any ideas on how to do this?

2 answers:

Answer 0: (score: 6)

If it helps, take a look at this awk one-liner (\S is a GNU awk extension and may not work in other awk implementations):

awk '{for(i=1;i<=NF;i++){t=$0;w=$i;gsub(/\S/,"-");$i=w;print;$0=t}}' file
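For readers less familiar with awk, the same logic can be spelled out step by step. The following is only a sketch under a couple of assumptions: the function name reveal_words is made up for illustration, and [^[:space:]] is swapped in for \S so that it does not depend on GNU awk.

```shell
#!/bin/bash
# Expanded form of the one-liner above.
# reveal_words is a hypothetical name chosen for this sketch.
reveal_words() {
    awk '{
        original = $0                      # keep the untouched line
        for (i = 1; i <= NF; i++) {
            word = $i                      # the word that stays visible
            gsub(/[^[:space:]]/, "-")      # mask every non-blank character of $0
            $i = word                      # put the visible word back into field i
            print
            $0 = original                  # restore the line for the next pass
        }
    }' "$@"
}

printf '%s\n' 'The girl was upset' | reveal_words
```

Because gsub is called without a target, it works on $0 and awk re-splits the fields afterwards, which is why assigning to $i then restores exactly one word.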

Tested with your example:

kent$  cat f
The girl was upset with her boyfriend.
Yes @Kent, you are right. – grandeabobora 6 mins ago

kent$  awk '{for(i=1;i<=NF;i++){t=$0;w=$i;gsub(/\S/,"-");$i=w;print;$0=t}}' f
The ---- --- ----- ---- --- ----------
--- girl --- ----- ---- --- ----------
--- ---- was ----- ---- --- ----------
--- ---- --- upset ---- --- ----------
--- ---- --- ----- with --- ----------
--- ---- --- ----- ---- her ----------
--- ---- --- ----- ---- --- boyfriend.
Yes ------ --- --- ------ - ------------- - ---- ---
--- @Kent, --- --- ------ - ------------- - ---- ---
--- ------ you --- ------ - ------------- - ---- ---
--- ------ --- are ------ - ------------- - ---- ---
--- ------ --- --- right. - ------------- - ---- ---
--- ------ --- --- ------ – ------------- - ---- ---
--- ------ --- --- ------ - grandeabobora - ---- ---
--- ------ --- --- ------ - ------------- 6 ---- ---
--- ------ --- --- ------ - ------------- - mins ---
--- ------ --- --- ------ - ------------- - ---- ago
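The question also mentions a "|"-delimited form in which one unit may span several words. A possible variant for that input is sketched below; it is not part of the original answer. It assumes that units are separated by "|" with optional surrounding spaces, and the function name mask_chunks is hypothetical.

```shell
#!/bin/bash
# Sketch: reveal one "|"-delimited chunk per output line and mask the rest.
# mask_chunks is a hypothetical helper name, not taken from the answer above.
mask_chunks() {
    awk -F' *[|] *' '{
        for (i = 1; i <= NF; i++) {
            line = ""
            for (j = 1; j <= NF; j++) {
                chunk = $j
                # Mask the non-space characters of every chunk that stays hidden.
                if (j != i) gsub(/[^ ]/, "-", chunk)
                line = line (j > 1 ? " " : "") chunk
            }
            print line
        }
    }' "$@"
}

printf '%s\n' 'The girl | was | upset | with | her | boyfriend.' | mask_chunks
```

With this input the first output line keeps "The girl" visible as a single unit, and spaces inside hidden chunks are preserved so the dashes still line up with the words.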

Answer 1: (score: 1)

A pure bash solution:

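#!/bin/bash

data='The girl was upset with her boyfriend.'
# Replace every non-space character with a dash, keeping the word lengths.
dashed="${data//[^ ]/-}"

# Split both strings into parallel arrays of words.
IFS=' ' read -ra dataArray <<< "$data"
IFS=' ' read -ra dashedArray <<< "$dashed"
for ((i=0; i < ${#dataArray[@]}; i++)); do
    if ((i == 0)); then
        # First word: nothing precedes it, so skip the leading separator.
        echo "${dataArray[i]} ${dashedArray[@]:i+1}"
    else
        # Dashes before word i, the word itself, then dashes after it.
        echo "${dashedArray[@]:0:i} ${dataArray[i]} ${dashedArray[@]:i+1}"
    fi
done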
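The script above hard-codes a single phrase. One way to apply the same idea to several phrases is to wrap it in a function and feed it lines from a file or here-document. This is a sketch only: mask_line is a hypothetical name, and instead of slicing the arrays it copies the dashed array and overwrites one element, which also avoids the special case for the first word.

```shell
#!/bin/bash
# mask_line is a hypothetical helper: prints one line per word of its argument,
# with that word visible and all other words replaced by dashes.
mask_line() {
    local data=$1 i
    local -a words dashes row
    read -ra words <<< "$data"
    read -ra dashes <<< "${data//[^ ]/-}"
    for ((i = 0; i < ${#words[@]}; i++)); do
        row=("${dashes[@]}")       # start from the fully masked phrase
        row[i]=${words[i]}         # reveal only word i
        printf '%s\n' "${row[*]}"  # join with single spaces
    done
}

# Process every phrase of the input, one phrase per line.
while IFS= read -r line; do
    mask_line "$line"
done <<'EOF'
The girl was upset
EOF
```

Replacing the here-document with a redirection such as < phrases.txt (a made-up file name) would process a whole file of phrases.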

A more convoluted solution without the if statement (if you can understand it, you're the man!):

#!/bin/bash

data='The girl was upset with her boyfriend.'
dashed="${data//[^ ]/-}"

IFS=' ' read -ra dashedArray <<< "$dashed"
IFS=' ' read -ra dataArray <<< "$data"
size=${#dataArray[@]}
for ((i=0; i < size; i++)); do
    echo "${dashedArray[@]:0:i}${dashedArray[size-i]+ }${dataArray[i]} ${dashedArray[@]:i+1}"
done
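The trick in the last script is the expansion ${dashedArray[size-i]+ }. The form ${parameter+word} expands to word only when the parameter is set, and index size-i falls outside the array only when i is 0, so the separating space is dropped for the first word and emitted for every other word. The small demonstration below isolates that behaviour; the array contents and the variable name sep are chosen just for illustration.

```shell
#!/bin/bash
# Demonstrates ${array[index]+word}: expands to word only if that element is set.
words=(The girl was)
size=${#words[@]}
for ((i = 0; i < size; i++)); do
    # words[size-i] is unset only for i == 0 (index 3 of a 3-element array).
    sep="${words[size-i]+ }"
    printf 'i=%d separator=[%s]\n' "$i" "$sep"
done
```

This prints separator=[] for i=0 and separator=[ ] for the remaining iterations.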