Question

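I am trying to write and read a large number of tabular data sheets in binary files (the data is of type Integer, Float64 and ASCIIString). Writing them was easy: I lpad each ASCIIString so that every ASCIIString column has a fixed length. Now I am facing the read operation, and I would like to read each data table with a single call to the read function, for example: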

read(myfile,Tuple{[UInt16;[Float64 for i=1:10];UInt8]...}, dim) # => works

Edit -> I am not using the line of code above in my real solution, because I found that sizeof(Tuple{Float64,Int32})!=sizeof(Float64)+sizeof(Int32)
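The size mismatch noted in the edit comes from alignment padding inside the tuple, while write emits the values back to back. The following is a minimal sketch, assuming a current Julia 1.x release; the variable names are only for illustration, and the exact padded size depends on the platform (16 bytes is typical on 64-bit systems).

```julia
T = Tuple{Float64,Int32}

# Bytes that write() actually emits for the two values, back to back.
packed = write(IOBuffer(), 1.0, Int32(1))

# In-memory size of the tuple type, including any alignment padding.
padded = sizeof(T)

# Byte offset of each field inside the tuple's memory layout.
offsets = [Int(fieldoffset(T, i)) for i in 1:fieldcount(T)]

println("packed = ", packed, ", padded = ", padded, ", offsets = ", offsets)
```

Because the file holds the packed layout, reading it as a padded tuple type would consume too many bytes per row and misplace the fields.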

But how can I include ASCIIString fields in the Tuple type? Consider this simplified example:

file=open("./testfile.txt","w");
ts1="5char";
ts2="7 chars";
write(file,ts1,ts2);
close(file);
file=open("./testfile.txt","r");
data=read(file,typeof(ts1)); # => Error
close(file);

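Julia is right, because typeof(ts1)==ASCIIString and an ASCIIString is a variable-length array, so Julia cannot know how many bytes it has to read. Which type should I use instead? Is there a type that represents something like ConstantLengthString<length>, Bytes<length> or Chars<length>? Is there a better solution?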
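One way to sidestep the missing fixed-length string type is to pass the byte count explicitly when reading. The sketch below assumes a current Julia 1.x release, where ASCIIString has been replaced by String and read(io, n) returns up to n bytes; read_fixed_string is a hypothetical helper name, and the column width of 8 is an arbitrary choice for the example.

```julia
# Hypothetical helper: read one fixed-width string field of `width` bytes.
function read_fixed_string(io::IO, width::Integer)
    bytes = read(io, width)                      # Vector{UInt8}, at most `width` bytes
    length(bytes) == width || throw(EOFError())  # the field was truncated
    return String(bytes)
end

io = IOBuffer()
# Pad both strings to the same column width before writing, as in the question.
write(io, lpad("5char", 8), lpad("7 chars", 8))
seekstart(io)

# Read each field back and strip the padding added by lpad.
a = lstrip(read_fixed_string(io, 8))
b = lstrip(read_fixed_string(io, 8))
println(a, " | ", b)
```

Note that lstrip also removes leading spaces that were part of the original value, so this only round-trips strings that do not start with whitespace.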

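Edit

I should add a more complete code sample that shows my latest progress. My current solution reads part of the data into a buffer (one or more rows), allocates memory for one row of data, then reinterprets the bytes and copies the resulting values from the buffer to the out location: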

#convert array of bytes and copy them to out
function reinterpretarray!{ty}(out::Vector{ty}, buffer::Vector{UInt8}, pos::Int)
  count=length(out)
  out[1:count]=reinterpret(ty,buffer[pos:count*sizeof(ty)+pos-1])
  return count*sizeof(ty)+pos
end
file=open("./testfile.binary","w");
#generate test data 
infloat=ones(20);
instr=b"MyData";
inint=Int32[12];
#write tuple 
write(file,([infloat...],instr,inint)...);
close(file);

file=open("./testfile.binary","r");
#read data into a buffer
buffer=readbytes(file,sizeof(infloat)+sizeof(instr)+sizeof(inint));
close(file);
#allocate memory
outfloat=zeros(20)
outstr=b"123456"
outint=Int32[1]
outdata=(outfloat,outstr,outint)
#copy and convert
pos=1
for elm in outdata
  pos=reinterpretarray!(elm, buffer, pos)
end
assert(outdata==(infloat,instr,inint))

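But my experiments in C tell me that there must be a better, more convenient and faster solution. I would like to use C style pointers and {{ 1}}, and I do not want to copy data from one location to another.

Thanks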
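A possible way to avoid the copy step is to reinterpret views of the byte buffer instead of slices. The sketch below assumes a current Julia 1.x release, where reinterpret accepts a view and returns a lazy wrapper that shares memory with the buffer; the byte ranges are hard-coded for the 20 Float64 + 6 bytes + 1 Int32 row used above, and the variable names are illustrative.

```julia
# Build one row in memory with the same layout as the test file above.
io = IOBuffer()
write(io, ones(20), b"MyData", Int32[12])
buffer = take!(io)                      # Vector{UInt8} with 20*8 + 6 + 4 bytes

# Reinterpret sub-ranges of the buffer; no element data is copied here.
floats   = reinterpret(Float64, view(buffer, 1:160))
strbytes = view(buffer, 161:166)
ints     = reinterpret(Int32, view(buffer, 167:170))

println(floats[1], " ", String(strbytes), " ", ints[1])
```

Only String(strbytes) copies, because a String owns its own bytes; the numeric columns are read straight out of buffer on each access.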

Answer 1

You can use Array{UInt8} as a substitute type for ASCIIString; it is the type of the underlying data.

ts1="5chars"
print(ts1.data) #Array{UInt8}
someotherarray=ts1.data[:] #copies as new array
someotherstring=ASCIIString(someotherarray)
assert(someotherstring == ts1)

Note that I am reading UInt8 on an x86_64 system, which may not match your setup. To be safe, you should use Array{eltype(ts1.data)}.
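Applied to a whole row, the byte-array idea could look like the sketch below. It assumes a current Julia 1.x release, where read! fills a preallocated array in place and ASCIIString has been replaced by String; the temporary file from mktemp and the row layout (20 Float64 values, a 6-byte string, one Int32) are assumptions made for the example.

```julia
# Write one test row to a temporary file.
path, io = mktemp()
write(io, ones(20), b"MyData", Int32[12])
close(io)

# Preallocate one array per column; the string column is a plain byte vector.
outfloat = Vector{Float64}(undef, 20)
outstr   = Vector{UInt8}(undef, 6)
outint   = Vector{Int32}(undef, 1)

# read! fills each array directly from the stream, in file order.
open(path, "r") do f
    read!(f, outfloat)
    read!(f, outstr)
    read!(f, outint)
end

# Copy before converting, because String(::Vector{UInt8}) takes over the vector's memory.
text = String(copy(outstr))
println(text, " ", outint[1])
rm(path)
```

The same preallocated arrays can be reused for every row, so the only per-row allocation is the String conversion.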

How can I simply read a binary data table that contains string columns in Julia?
