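I wrote a D implementation of the nul2pfb utility from http://www.dwheeler.com/essays/filenames-in-shell.html, because the link to the source code was broken and I wanted to try learning D. I noticed that it was quite slow: it could barely keep up with the find -print0 feeding it data, when it should be much faster, since it does not need to make anywhere near as many system calls.
The first implementation works correctly (tested with the zsh and bash printf builtins as well as /usr/bin/printf). The second, although much faster (probably because it makes far fewer calls to write()), repeats the first part of its output many times and fails to output the rest. What causes this difference? I am new to D and do not understand it.
Working code: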
import std.stdio;
import std.conv;
void main()
{
    foreach (ubyte[] mybuff; chunks(stdin, 4096)) {
        encodebuff (mybuff);
    }
}
@safe void encodebuff (ubyte[] mybuff) {
    foreach (ubyte i; mybuff) {
        char b = to!char(i);
        switch (i) {
            case 'a': .. case 'z':
            case 'A': .. case 'Z':
            case '0': .. case '9':
            case '/':
            case '.':
            case '_':
            case ':': writeChar(b); break;
            default: writeOctal(b); break;
            case 0: writeChar ('\n'); break;
            case '\\': writeString(`\\`); break;
            case '\t': writeString(`\t`); break;
            case '\n': writeString(`\n`); break;
            case '\r': writeString(`\r`); break;
            case '\f': writeString(`\f`); break;
            case '\v': writeString(`\v`); break;
            case '\a': writeString(`\a`); break;
            case '\b': writeString(`\b`); break;
        }
    }
}
@trusted void writeString (string a)
{
    write (a);
}
@trusted void writeOctal (int a)
{
    writef ("\\%.4o", a); // leading 0 needed for zsh printf '%b'
}
@trusted void writeChar (char a)
{
    write (a);
}
Broken version:
import std.stdio;
import std.conv;
import std.string;
void main()
{
    foreach (ubyte[] mybuff; chunks(stdin, 4096)) {
        encodebuff (mybuff);
    }
}
@safe void encodebuff (ubyte[] mybuff) {
    char[] outstring;
    foreach (ubyte i; mybuff) {
        switch (i) {
            case 'a': .. case 'z':
            case 'A': .. case 'Z':
            case '0': .. case '9':
            case '/':
            case '.':
            case '_':
            case ':': outstring ~= to!char(i); break;
            case 0: outstring ~= '\n'; break;
            default: char[5] mystring;
                     formatOctal(mystring, i);
                     outstring ~= mystring;
                     break;
            case '\\': outstring ~= `\\`; break;
            case '\t': outstring ~= `\t`; break;
            case '\n': outstring ~= `\n`; break;
            case '\r': outstring ~= `\r`; break;
            case '\f': outstring ~= `\f`; break;
            case '\v': outstring ~= `\v`; break;
            case '\a': outstring ~= `\a`; break;
            case '\b': outstring ~= `\b`; break;
        }
        writeString (outstring);
    }
}
@trusted void writeString (char[] a)
{
    write (a);
}
@trusted void formatOctal (char[] b, ubyte a)
{
    sformat (b, "\\%.4o", a); // leading 0 needed for zsh printf '%b'
}
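Tests: (Note that filelist.txt is a NUL-delimited list of files generated by find -print0 on my home directory, and filelist2.txt is generated from filelist.txt with sed -e 's/\x0/\n/g' > filelist2.txt, so it is the corresponding newline-delimited list of file names.)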
# the sed script escapes the backslashes so xargs does not clobber them
diff filelist2.txt <(<filelist.txt char2code2 | sed -e 's/\\/\\\\/g' | xargs /usr/bin/printf "%b\n")
# from within zsh
bash -c 'diff filelist2.txt <(for i in "$(<filelist.txt char2code)"; do printf "%b\n" "$i"; done)'
# from within zsh and bash
diff filelist.txt <(for i in $(char2code <filelist.txt); do printf '%b\0' "$i"; done)
# from within zsh, bash, and dash
for i in $(char2code <filelist.txt); do printf '%b\0' "$i"; done | diff - filelist.txt
The script I made as an acid test:
#!/bin/bash
# this creates a completely random list of NUL-delimited strings
a=''
trap 'rm -f "$a"' EXIT
a="$(mktemp)";
</dev/urandom sed -e 's/\x0\x0/\x0/g' | dd count=2048 of="$a"
test -s "$a" || exit 1
printf '\0' >> "$a"
for i in $("$@" < "$a")
do
printf '%b\0' "$i"
done | diff - "$a"
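What is the reason for the difference?
EDIT: I have implemented the changes suggested by @yaz and @MichalMinich, but I still see incorrect results. Specifically, running find -print0 | char2code2 (the name of the program, which is in my $PATH) in my home directory exits with status 1 and produces no output. However, it works from a subdirectory containing far fewer entries. My revised source follows:
import std.stdio;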
import std.conv;
import std.format;
import std.array;
void main()
{
    foreach (ubyte[] mybuff; chunks(stdin, 4096)) {
        encodebuff (mybuff);
    }
    writeln();
}
void encodebuff (ubyte[] mybuff) {
    auto buffer = appender!string();
    foreach (ubyte i; mybuff) {
        switch (i) {
            case 'a': .. case 'z':
            case 'A': .. case 'Z':
            case '0': .. case '9':
            case '/':
            case '.':
            case '_':
            case ':': buffer.put(to!char(i)); break;
            case 0: buffer.put('\n'); break;
            default: formatOctal(buffer, i); break;
            case '\\': buffer.put(`\\`); break;
            case '\t': buffer.put(`\t`); break;
            case '\n': buffer.put(`\n`); break;
            case '\r': buffer.put(`\r`); break;
            case '\f': buffer.put(`\f`); break;
            case '\v': buffer.put(`\v`); break;
            case '\a': buffer.put(`\a`); break;
            case '\b': buffer.put(`\b`); break;
        }
    }
    writeString (buffer.data);
    // writef(stderr, "Wrote a line\n");
}
@trusted void writeString (string a)
{
    write (a);
}
@trusted void formatOctal(Writer)(Writer w, ubyte a)
{
    formattedWrite(w, "\\%.4o", a); // leading 0 needed for zsh printf '%b'
}
Answer 0 (score: 3)
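You need to move the writeString call outside of the foreach in encodebuff. Currently you are writing outstring on every loop iteration without clearing it. The problem pointed out by @Michal Minich is also valid.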
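As a rough illustration of the point above (not the asker's actual program), the sketch below contrasts the two loop shapes. The function names repeatedPrefixes and writtenOnce are hypothetical, and the bytes that would go to write are collected in a string so the two results can be compared directly.

```d
import std.stdio : writeln;

// Buggy shape: the growing buffer is emitted on every iteration
// and never cleared, so each earlier prefix is emitted again.
string repeatedPrefixes(const(ubyte)[] input)
{
    char[] emitted;   // stands in for what write() would have printed
    char[] outstring;
    foreach (b; input)
    {
        outstring ~= cast(char) b;
        emitted ~= outstring; // "write" inside the loop
    }
    return emitted.idup;
}

// Fixed shape: build the whole buffer first, then emit it once.
string writtenOnce(const(ubyte)[] input)
{
    char[] outstring;
    foreach (b; input)
        outstring ~= cast(char) b;
    return outstring.idup;    // "write" after the loop
}

void main()
{
    auto data = cast(const(ubyte)[]) "abc";
    writeln(repeatedPrefixes(data)); // aababc
    writeln(writtenOnce(data));      // abc
}
```

With a 4096-byte chunk the buggy shape emits on the order of n squared bytes per chunk, which would match the symptom of the first part of the output being repeated many times.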
Answer 1 (score: 2)
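One reason may be that you always append all 5 characters of char[5] mystring. The sformat function called in formatOctal returns the formatted string, which may be shorter than 5 characters (it is probably a slice of the buffer); you should append that returned string instead.
Performance suggestion: use Appender instead of ~= when building the string, for better performance.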
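A minimal sketch of both suggestions, assuming a reasonably recent Phobos in which sformat is available from std.format. The helper names octalViaSlice and octalViaAppender are made up for illustration, and the 8-character scratch buffer is an arbitrary size chosen to leave some headroom.

```d
import std.array : appender;
import std.format : formattedWrite, sformat;
import std.stdio : writeln;

// Use the slice that sformat returns instead of the whole scratch buffer.
string octalViaSlice(ubyte value)
{
    char[8] scratch;                                   // arbitrary headroom
    char[] used = sformat(scratch[], "\\%.4o", value); // only the bytes written
    return used.idup;
}

// Format straight into an Appender, avoiding the scratch buffer entirely.
string octalViaAppender(ubyte value)
{
    auto buffer = appender!string();
    formattedWrite(buffer, "\\%.4o", value);
    return buffer.data;
}

void main()
{
    writeln(octalViaSlice(7));      // \0007
    writeln(octalViaAppender(255)); // \0377
}
```

With this particular format and a ubyte argument the result appears to be always 5 characters long, so appending the returned slice matters mostly as a habit for formats whose width varies.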
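Answer 2 (score: 1)
The real problem was my LDC installation. It was using shared libraries, which its version of druntime did not support.
Recompiling LDC to use static libraries solved the problem.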