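I wrote a small utility in D that converts the output of find -print0 into a form suitable for printf %b. Such a utility already exists (nul2pfb, from http://www.dwheeler.com/essays/filenames-in-shell.html), but the link is dead and I could not find the program, so I decided to implement it myself in D. I test it with the following zsh command: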
diff <(find) <(for i in "$(find -print0 | dd | char2code)"; do printf '%b\n' "$i"; done)
The expected output is empty, but I found that some filenames containing hyphens are handled incorrectly.
My source code is:
import std.stdio;
import std.conv;
import std.ascii;
import std.c.stdlib;

void main()
{
    foreach (ubyte[] mybuff; chunks(stdin, 4096)) {
        encodeline (mybuff);
    }
}

@safe void encodeline (ubyte[] mybuff) {
    // char[] outstring;
    foreach (ubyte i; mybuff) {
        char b = to!char(i);
        switch (i) {
            case 'a': .. case 'z':
            case 'A': .. case 'Z':
            case '0': .. case '9':
            case '/':
            case '.':
            case '_':
            case ':': writeChar(b); break;
            default: writeOctal(b); break;
            case 0: writeChar ('\n'); break;
            case '\\': writeString(`\\`); break;
            case '\t': writeString(`\t`); break;
            case '\n': writeString(`\n`); break;
            case '\r': writeString(`\r`); break;
            case '\f': writeString(`\f`); break;
            case '\v': writeString(`\v`); break;
            case '\a': writeString(`\a`); break;
            case '\b': writeString(`\b`); break;
        }
    }
    // writeString (outstring);
}

@trusted void writeString (string a)
{
    write (a);
}

@trusted void writeOctal (int a)
{
    try
    {
        writef ("\\%.#o", a); // leading 0 needed for zsh printf '%b'
    }
    catch (std.format.FormatException b)
    {
        write ("Format exception in function writeOctal");
        throw b;
    }
}

@trusted void writeChar (char a)
{
    try
    {
        write (a);
    }
    catch (std.format.FormatException b)
    {
        write ("Format exception in function writeChar");
        throw b;
    }
}

@trusted void writeNewline ()
{
    writeln;
}
Here is part of the diff output:
Correct filenames:
./.ibam/profile-004-battery
./.ibam/profile-034-charge
./.ibam/profile-054-charge
./.ibam/profile-045-battery
(a bunch of lines skipped)
---
Wrong filenames:
./.ibam/profileh04-battery
./.ibam/profileh34-charge
./.ibam/profileh54-charge
./.ibam/profileh45-battery
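It seems that -0 is being replaced by h.
Update: changing the conversion specifier in writef from %.#o to %.4o fixed the problem, but I still see differences in the output when searching through /proc. However, the same thing happens when I run the following (as root):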
diff <(find / -print) <(find / -print) > (myhomedirectory)/log.txt
so I chose to ignore it.
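Answer 0 (score: 1)
It turned out that whenever an octal escape sequence was shorter than four octal digits and was followed by a digit from 0 to 7, that digit was read as part of the escape sequence, so printf %b produced incorrect output. This was fixed by changing the conversion specifier in the D program from %.#o to %.4o.
Here is the corrected source code, with some unneeded functions removed: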
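The effect can be reproduced in the shell alone. The sketch below assumes the original format emitted a three-digit escape such as \055 for the hyphen (byte 45); the variable names short and fixed are illustrative only. It appends the literal digit 0 that follows the hyphen in profile-004-battery and decodes both variants with printf %b:

```shell
# Format byte 45 ('-') in the two ways being compared.
short=$(printf '\\%#o' 45)    # \055   variable width (assumed original output)
fixed=$(printf '\\%.4o' 45)   # \0055  always four octal digits

# Decode each escape followed by the literal digit "0".
printf '%b\n' "profile${short}0"   # profileh   (\0550 is read as one escape)
printf '%b\n' "profile${fixed}0"   # profile-0  (\0055 is complete, 0 stays literal)
```

Octal 0550 is 360, which does not fit in a byte; keeping the low eight bits gives 104, the character code of h. The outputs shown are what bash prints and they match the zsh symptom in the question; other printf implementations may handle the overflow differently.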
import std.stdio;
import std.conv;
import std.ascii;
import std.c.stdlib;

void main()
{
    foreach (ubyte[] mybuff; chunks(stdin, 4096)) {
        encodeline (mybuff);
    }
}

@safe void encodeline (ubyte[] mybuff) {
    // char[] outstring;
    foreach (ubyte i; mybuff) {
        char b = to!char(i);
        switch (i) {
            case 'a': .. case 'z':
            case 'A': .. case 'Z':
            case '0': .. case '9':
            case '/':
            case '.':
            case '_':
            case ':': writeChar(b); break;
            default: writeOctal(b); break;
            case 0: writeChar ('\n'); break;
            case '\\': writeString(`\\`); break;
            case '\t': writeString(`\t`); break;
            case '\n': writeString(`\n`); break;
            case '\r': writeString(`\r`); break;
            case '\f': writeString(`\f`); break;
            case '\v': writeString(`\v`); break;
            case '\a': writeString(`\a`); break;
            case '\b': writeString(`\b`); break;
        }
    }
}

@trusted void writeString (string a)
{
    write (a);
}

@trusted void writeOctal (int a)
{
    // Always emit four octal digits so a following digit cannot join the escape.
    writef ("\\%.4o", a); // leading 0 needed for zsh printf '%b'
}

@trusted void writeChar (char a)
{
    write (a);
}