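I'm looking for a quick Bash script to convert British/New Zealand spellings to American ones in a TeX document (for collaborating with US academics and for journal submission). It is a formal mathematical biology paper with very little regional terminology or grammar: prior work is given as formulae rather than quotations.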
For example:
Generalise -> Generalize
Colour -> Color
Centre -> Center
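Surely there is a sed- or awk-based script out there that substitutes most of the common spelling differences.
See the related TeX forum question for more details:
https://tex.stackexchange.com/questions/312138/converting-uk-to-us-spellings
N.B. I currently compile with PDFLaTeX using kile on Ubuntu 16.04 or Elementary OS 0.3 Freya, but I can use another TeX compiler or package if there is a built-in fix elsewhere.
Thanks for your help.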
Answer 0 (score: 0)
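I think you need a list of replacements that acts as a translation dictionary. You will have to enrich the dictionary file yourself for the script to translate your text files effectively.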
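#!/bin/bash
# Each dictionary line is "from to": every whole-word "from" becomes "to".
sourceFile=$1
dict=$2
while read -r word updatedWord _
do
    # skip blank or incomplete lines and pairs that change nothing
    if [ -z "$updatedWord" ] || [ "$word" = "$updatedWord" ]; then continue; fi
    # \b (GNU sed) restricts the match to whole words, so "aging" does not hit "imaging"
    sed -i "s/\b$word\b/$updatedWord/g" "$sourceFile"
done < "$dict"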
Run the script above like this:
./scriptName source.txt dictionary.txt
Here is a sample dictionary I used:
>cat dict
characterize characterise
prioritize prioritise
specialize specialise
analyze analyse
catalyze catalyse
size size
exercise exercise
behavior behaviour
color colour
favor favour
contour contour
center centre
fiber fibre
liter litre
parameter parameter
ameba amoeba
anesthesia anaesthesia
diarrhea diarrhoea
esophagus oesophagus
leukemia leukaemia
cesium caesium
defense defence
practice practice
license licence
defensive defensive
advice advice
aging ageing
acknowledgment acknowledgement
judgment judgement
analog analogue
dialog dialogue
fulfill fulfil
enroll enrol
skill skill
skillful skilful
labeled labelled
signaling signalling
propelled propelled
revealing revealing
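Because the script takes the first two whitespace-separated fields of every dictionary line, a line with more or fewer than two fields will be misread. The following is a minimal sketch of a sanity check, not part of the original answer; the function name `check_dict` is hypothetical, and it assumes one word pair per line.

```shell
#!/bin/bash
# Hypothetical helper: report dictionary lines that are not exactly "word word".
# $1 = dictionary file. Returns 1 if any malformed line was found, 0 otherwise.
check_dict() {
    awk '
        # ignore empty lines; flag everything that does not have exactly two fields
        NF && NF != 2 { printf "%s:%d: %s\n", FILENAME, FNR, $0; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}
```

Calling `check_dict dict` prints each offending line with its file name and line number, so it can be fixed before the dictionary is used.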
Execution result:
> cat source.txt
color of this fiber is great and we should analyze it.
> ./scriptName source.txt dict
> cat source.txt
colour of this fibre is great and we should analyse it.
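Note that the sample dictionary lists the US spelling first, so the run above converts US to UK, whereas the question asks for the opposite direction. The sketch below swaps the columns and applies all rules in a single sed pass. It is an illustration under stated assumptions, not the answerer's method: it assumes bash (for process substitution), GNU sed (for `-i` and `\b`), and a two-column dictionary; the function name `uk_to_us` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: convert UK spellings to US ones in place.
# $1 = TeX file, $2 = dictionary with lines "us_word uk_word" (as in the sample above).
uk_to_us() {
    local src=$1 dict=$2
    # Generate one sed rule per changed pair, with the columns swapped,
    # plus a second rule for the capitalised form (Colour -> Color).
    # \b limits each match to whole words (GNU sed extension).
    sed -i -f <(awk '
        function cap(s) { return toupper(substr(s, 1, 1)) substr(s, 2) }
        NF == 2 && $1 != $2 {
            printf "s/\\b%s\\b/%s/g\n", $2, $1
            printf "s/\\b%s\\b/%s/g\n", cap($2), cap($1)
        }' "$dict") "$src"
}
```

For example, `uk_to_us paper.tex dict` rewrites `paper.tex` in place. Derived forms such as "colourful" are left alone unless they have their own dictionary entry, and LaTeX command names are not protected from replacement.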
Answer 1 (score: 0)
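I think my awk solution is more flexible than the sed one.
This program leaves LaTeX commands untouched (words that start with "\"), and it preserves an initial capital letter of a word.
Arguments of LaTeX commands (and normal text) are substituted using the dictionary file.
When the optional third parameter [rev] is given, the program performs the reverse substitution with the same dictionary file.
Any non-alphabetic character is treated as a word separator (this is necessary in LaTeX source files).
The program writes its output to the screen (stdout), so you need to redirect it to a file (> output_f).
(I assume that the input encoding of your LaTeX source is 1 byte per character.)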
> cat dic.sh
#!/bin/bash
(($#<2)) && { echo "Usage $0 dictionary_file latex_file [rev]"; exit 1; }
((d = $#==3 ? 0 : 1))   # d=1: forward (column 1 -> column 2); d=0: reverse
awk -v d="$d" '
BEGIN {cm=fx=0; fn="";}
fn!=FILENAME {fx++; fn=FILENAME;}   # count the input files as they change
# first file: read the dictionary (or the reversed dictionary) into an associative array
fx==1 {if(!NF)next; if(d)a[$1]=$2; else a[$2]=$1; next;}
# second file: scan each line of the LaTeX source character by character
fx==2 { for(i=1; i<=length($0); i++)
   {c=substr($0,i,1);
    if(cm){printf("%s",c); if(c!~/[A-Za-z0-9\\]/)cm=0;}   # inside a LaTeX command name: copy it verbatim
    else if(c~/[A-Za-z]/)w=w c;                            # collect letters into the current word
    else{pr(); printf("%s",c); if(c=="\\")cm=1;}           # separator: flush the word; "\" starts a command
   }
   pr(); printf("\n"); cm=0;   # flush the last word; a command name also ends at the end of the line
 }
function pr(  s){   # print the collected word or its dictionary substitute, restoring an initial capital
  if(!length(w))return;
  s=tolower(w);
  if(!(s in a))printf("%s",w);
  else printf("%s", s==w ? a[s] : toupper(substr(a[s],1,1)) substr(a[s],2));
  w="";}
' "$1" "$2"
Dictionary file:
> cat dictionary
apple lemon
raspberry cherry
pear banana
Input LaTeX source:
> cat src.txt
Apple123pear,apple "pear".
\Apple123pear{raspberry}{pear}[apple].
Raspberry12Apple,pear.
Execution result:
> ./dic.sh
Usage ./dic.sh dictionary_file latex_file [rev]
> ./dic.sh dictionary src.txt >out1.txt; cat out1.txt
Lemon123banana,lemon "banana".
\Apple123pear{cherry}{banana}[lemon].
Cherry12Lemon,banana.
> ./dic.sh dictionary out1.txt >out2.txt rev; cat out2.txt
Apple123pear,apple "pear".
\Apple123pear{raspberry}{pear}[apple].
Raspberry12Apple,pear.
> diff src.txt out2.txt # they are identical