Question

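I have a log file in which the column delimiter is the single character \x01 (a control character that does not correspond to any printable text, so it is safe to split on). When I run the following on a CentOS box against a file containing the phrase "This is \x01", I get: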

cat ~/temp1 | sed s/\x01/meh/
this is meh

On a Mac, I get:

cat ~/temp1 | sed s/\x01/meh/
this is

which is exactly what I get when I simply cat the original file, so nothing was replaced.

Alternatively, running a Perl one-liner on the Mac:

cat ~/temp1 | perl -e 'while ( my $line = <>) {$line =~ s/\x01/meh/g; print $line;}'

gives me:

this is meh

So my conclusion so far is that, for some reason, sed on the Mac does not handle the \x01 escape. Does anyone have any idea why, or how to work around it?

Answer 1

Use GNU sed, which is available from the MacPorts package gsed.

Edit: The documentation for GNU sed is here.
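
If installing gsed is not an option, one possible workaround is to avoid the \x01 escape altogether. The \xHH escape is a GNU sed extension, and the sed shipped with the Mac does not appear to interpret it. A minimal sketch, assuming a POSIX shell (the variable names sample, soh, and result are made up for illustration), is to let printf produce the raw byte and splice it into the sed script:

```shell
# Hypothetical sample line: "this is " followed by a raw 0x01 byte.
# printf's octal escape \001 is specified by POSIX, so this does not
# depend on GNU extensions.
sample=$(printf 'this is \001')

# Store the literal 0x01 byte in a variable and splice it into the sed
# script, so sed only ever sees an ordinary character to match and never
# has to interpret a \x01 escape itself.
soh=$(printf '\001')
result=$(printf '%s\n' "$sample" | sed "s/${soh}/meh/")

printf '%s\n' "$result"
```

With a real log file, the same substitution would be `sed "s/${soh}/meh/" ~/temp1`. The point of the design is that the shell, not sed, is responsible for producing the control character, so the command should behave the same under GNU sed and BSD sed.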