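After reading the specs for the STL file format, I want to write a few tests to make sure that a file is actually a valid binary or ASCII file.
An ASCII STL file can be identified by finding the text "solid" at byte 0, followed by a space (hex value \x20), then an optional text string, then a newline.
A binary STL file has a reserved 80-byte header, followed by a 4-byte unsigned integer (NumberOfTriangles), and then 50 bytes of data for each of the NumberOfTriangles facets.
Each triangle facet is 50 bytes long: 12 single-precision (4-byte) floats followed by one unsigned short (2-byte) integer.
If a binary file is exactly 84 + NumberOfTriangles * 50 bytes long, it can usually be considered a valid binary file.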
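The size rule above can be written as a small helper. This is a minimal sketch in standard C++ rather than Qt, and the name `expectedBinaryStlSize` is made up for illustration. It does the arithmetic in 64 bits so that the product cannot wrap around for large triangle counts.

```cpp
#include <cstdint>

// Size in bytes that a binary STL file must have for a given triangle count.
// 64-bit arithmetic: 84 + n * 50 cannot overflow for any 32-bit n.
// (Hypothetical helper name, not part of any library.)
constexpr std::uint64_t expectedBinaryStlSize(std::uint32_t numberOfTriangles)
{
    constexpr std::uint64_t headerSize = 80;  // reserved header
    constexpr std::uint64_t countSize  = 4;   // NumberOfTriangles field
    constexpr std::uint64_t facetSize  = 50;  // 12 floats (48 bytes) + 2-byte attribute count
    return headerSize + countSize + facetSize * numberOfTriangles;
}
```

A caller would compare the result against the actual file size, for example `fileSize == expectedBinaryStlSize(nTriangles)`.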
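Unfortunately, a binary file can contain the text "solid" starting at byte 0 of its 80-byte header. So testing for that keyword alone cannot determine whether a file is ASCII or binary.
This is what I have so far: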
STL_STATUS getStlFileFormat(const QString &path)
{
    // Each facet contains:
    //  - Normal: 3 floats (4 bytes each, 12 bytes total)
    //  - Vertices: 3 vertices x 3 floats (4 bytes each, 36 bytes total)
    //  - AttributeCount: 1 short (2 bytes)
    // Total: 50 bytes per facet
    // (sizeof(float) rather than sizeof(float_t): float_t may be wider than 4 bytes)
    const size_t facetSize = 3*sizeof(float) + 3*3*sizeof(float) + sizeof(uint16_t);

    QFile file(path);
    if (!file.open(QIODevice::ReadOnly))
    {
        qDebug("\n\tUnable to open \"%s\"", qPrintable(path));
        return STL_INVALID;
    }

    QFileInfo fileInfo(path);
    size_t fileSize = fileInfo.size();

    if (fileSize < 84)
    {
        // 80-byte header + 4-byte "number of triangles" marker
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        return STL_INVALID;
    }

    // Look for text "solid" in first 5 bytes, indicating the possibility that this is an ASCII STL format.
    QByteArray fiveBytes = file.read(5);

    // Header is from bytes 0-79; numTriangleBytes starts at byte offset 80.
    if (!file.seek(80))
    {
        qDebug("\n\tCannot seek to the 80th byte (after the header)");
        return STL_INVALID;
    }

    // Read the number of triangles, uint32_t (4 bytes), little-endian
    QByteArray nTrianglesBytes = file.read(4);
    file.close();

    uint32_t nTriangles = *((uint32_t*)nTrianglesBytes.data());

    // Verify that file size equals the sum of header + nTriangles value + all triangles
    size_t targetSize = 84 + nTriangles * facetSize;
    if (fileSize == targetSize)
    {
        return STL_BINARY;
    }
    else if (fiveBytes.contains("solid"))
    {
        return STL_ASCII;
    }
    else
    {
        return STL_INVALID;
    }
}
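So far this works for me, but I'm worried that bytes 80 to 83 of a purely ASCII file could contain ASCII characters that, when interpreted as a uint32_t, happen to be consistent with the length of the file (very unlikely, but not impossible).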
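To get a feel for how unlikely that collision is, here is a small sketch (standard C++, with hypothetical helper names) that decodes four bytes as a little-endian count and computes the file size a binary STL would need for it. For four space characters (0x20 each) the count is 538,976,288 triangles, so the file would have to be 26,948,814,484 bytes long for the size check to pass.

```cpp
#include <cstdint>

// Decode four bytes as a little-endian unsigned 32-bit integer,
// independent of the host's byte order.
inline std::uint32_t decodeLittleEndianU32(const unsigned char* b)
{
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}

// File size a binary STL would need if the four bytes at offset 80
// were its triangle count (hypothetical helper for illustration).
inline std::uint64_t impliedBinarySize(const unsigned char* bytesAtOffset80)
{
    return 84 + 50 * static_cast<std::uint64_t>(decodeLittleEndianU32(bytesAtOffset80));
}
```

This only illustrates one byte pattern; it does not prove that a collision cannot happen.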
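Is there an additional step that would let me be "absolutely sure" whether a file is ASCII or binary?
Update:
Following the suggestions from @Powerswitch and @RemyLebeau, I am now testing for more keywords. This is what I have now: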
STL_STATUS getStlFileFormat(const QString &path)
{
    // Each facet contains:
    //  - Normal: 3 floats (4 bytes each, 12 bytes total)
    //  - Vertices: 3 vertices x 3 floats (4 bytes each, 36 bytes total)
    //  - AttributeCount: 1 short (2 bytes)
    // Total: 50 bytes per facet
    // (sizeof(float) rather than sizeof(float_t): float_t may be wider than 4 bytes)
    const size_t facetSize = 3*sizeof(float) + 3*3*sizeof(float) + sizeof(uint16_t);

    QFile file(path);
    bool canFileBeOpened = file.open(QIODevice::ReadOnly);
    if (!canFileBeOpened)
    {
        qDebug("\n\tUnable to open \"%s\"", qPrintable(path));
        return STL_INVALID;
    }

    QFileInfo fileInfo(path);
    size_t fileSize = fileInfo.size();

    // The minimum size of an empty ASCII file is 15 bytes.
    if (fileSize < 15)
    {
        // "solid " and "endsolid " markers for an ASCII file
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        file.close();
        return STL_INVALID;
    }

    // Binary files should never start with "solid ", but just in case, check for ASCII, and if not valid
    // then check for binary...

    // Look for text "solid " in first 6 bytes, indicating the possibility that this is an ASCII STL format.
    QByteArray sixBytes = file.read(6);
    if (sixBytes.startsWith("solid "))
    {
        // Scan line by line so the whole file is never held in memory.
        QString line;
        QTextStream in(&file);
        while (!in.atEnd())
        {
            line = in.readLine();
            if (line.contains("endsolid"))
            {
                file.close();
                return STL_ASCII;
            }
        }
    }

    // Wasn't an ASCII file. Reset and check for binary.
    if (!file.reset())
    {
        qDebug("\n\tCannot seek to the 0th byte (before the header)");
        file.close();
        return STL_INVALID;
    }

    // 80-byte header + 4-byte "number of triangles" for a binary file
    if (fileSize < 84)
    {
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        file.close();
        return STL_INVALID;
    }

    // Header is from bytes 0-79; numTriangleBytes starts at byte offset 80.
    if (!file.seek(80))
    {
        qDebug("\n\tCannot seek to the 80th byte (after the header)");
        file.close();
        return STL_INVALID;
    }

    // Read the number of triangles, uint32_t (4 bytes), little-endian
    QByteArray nTrianglesBytes = file.read(4);
    if (nTrianglesBytes.size() != 4)
    {
        qDebug("\n\tCannot read the number of triangles (after the header)");
        file.close();
        return STL_INVALID;
    }

    uint32_t nTriangles = *((uint32_t*)nTrianglesBytes.data());

    // Verify that file size equals the sum of header + nTriangles value + all triangles
    if (fileSize == (84 + (nTriangles * facetSize)))
    {
        file.close();
        return STL_BINARY;
    }

    return STL_INVALID;
}
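It seems to handle more edge cases, and I tried to write it so that it gracefully handles extremely large (multi-gigabyte) STL files without loading the ENTIRE file into memory at once just to scan for the "endsolid" text.
Feel free to provide any feedback and suggestions (especially for people looking for a solution in the future).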
Answer 0 (score: 8)
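If the file does not start with "solid ", and the file size is exactly 84 + (numTriangles * 50) bytes, where numTriangles is read from offset 80, then the file is binary.
If the file size is at least 15 bytes (the absolute minimum for an ASCII file with no triangles) and it starts with "solid ", read the name that follows it until a line break is reached. Check whether the next line starts with "facet " or is "endsolid [name]" (no other value is allowed). If it is "facet ", seek to the end of the file and make sure it ends with a line that says "endsolid [name]". If all of that is true, the file is ASCII.
Treat any other combination as invalid.
So, something like this: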
STL_STATUS getStlFileFormat(const QString &path)
{
    QFile file(path);
    if (!file.open(QIODevice::ReadOnly))
    {
        qDebug("\n\tUnable to open \"%s\"", qPrintable(path));
        return STL_INVALID;
    }

    QFileInfo fileInfo(path);
    size_t fileSize = fileInfo.size();

    if (fileSize < 15)
    {
        // "solid " and "endsolid " markers for an ASCII file
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        return STL_INVALID;
    }

    // binary files should never start with "solid ", but
    // just in case, check for ASCII, and if not valid then
    // check for binary...

    // Look for text "solid " in first 6 bytes, indicating the possibility that this is an ASCII STL format.
    QByteArray sixBytes = file.read(6);
    if (sixBytes.startsWith("solid "))
    {
        // The rest of the first line is the solid's name; trimmed() drops the line break.
        QByteArray name = file.readLine().trimmed();
        QByteArray endLine = (QByteArray("endsolid ") + name).trimmed();

        // Lines are often indented, so strip surrounding whitespace before comparing.
        QByteArray nextLine = file.readLine().trimmed();
        if (nextLine.startsWith("facet "))
        {
            // TODO: seek to the end of the file, read the last line,
            // and make sure it is "endsolid [name]"...
            /*
            lastLine = ...;
            if (!lastLine.startsWith(endLine))
                return STL_INVALID;
            */
            return STL_ASCII;
        }
        if (nextLine.startsWith(endLine))
            return STL_ASCII;

        // reset and check for binary...
        if (!file.reset())
        {
            qDebug("\n\tCannot seek to the 0th byte (before the header)");
            return STL_INVALID;
        }
    }

    if (fileSize < 84)
    {
        // 80-byte header + 4-byte "number of triangles" for a binary file
        qDebug("\n\tThe STL file is not long enough (%u bytes).", uint(fileSize));
        return STL_INVALID;
    }

    // Header is from bytes 0-79; numTriangleBytes starts at byte offset 80.
    if (!file.seek(80))
    {
        qDebug("\n\tCannot seek to the 80th byte (after the header)");
        return STL_INVALID;
    }

    // Read the number of triangles, uint32_t (4 bytes), little-endian
    QByteArray nTrianglesBytes = file.read(4);
    if (nTrianglesBytes.size() != 4)
    {
        qDebug("\n\tCannot read the number of triangles (after the header)");
        return STL_INVALID;
    }

    // Note: this cast assumes a little-endian host, matching the file's byte order.
    uint32_t nTriangles = *((uint32_t*)nTrianglesBytes.data());

    // Verify that file size equals the sum of header + nTriangles value + all triangles.
    // 64-bit arithmetic so that nTriangles * 50 cannot wrap around.
    if (quint64(fileSize) == 84 + quint64(nTriangles) * 50)
        return STL_BINARY;

    return STL_INVALID;
}
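The TODO in the code above (checking the last line without reading the whole file) could be approached along these lines. This is only a sketch in standard C++ rather than Qt; the function names, the `expected` parameter, and the 1024-byte tail size are assumptions made for illustration, and it assumes the final line fits inside that tail.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>

// Return the last non-blank line of `text` with surrounding whitespace removed.
inline std::string lastNonBlankLine(const std::string& text)
{
    const std::size_t end = text.find_last_not_of(" \t\r\n");
    if (end == std::string::npos)
        return std::string();                          // nothing but whitespace
    const std::size_t lineBreak = text.find_last_of("\r\n", end);
    const std::size_t lineStart = (lineBreak == std::string::npos) ? 0 : lineBreak + 1;
    const std::size_t begin = text.find_first_not_of(" \t", lineStart);
    return text.substr(begin, end - begin + 1);
}

// Read at most `tailSize` bytes from the end of the file and test whether
// the last non-blank line starts with `expected` (e.g. "endsolid name").
inline bool lastLineStartsWith(const std::string& path,
                               const std::string& expected = "endsolid",
                               std::streamoff tailSize = 1024)
{
    std::ifstream in(path, std::ios::binary);
    if (!in)
        return false;

    in.seekg(0, std::ios::end);
    const std::streamoff size = in.tellg();
    if (size <= 0)
        return false;

    // Only the tail is loaded, so large files are not read into memory.
    const std::streamoff n = std::min(size, tailSize);
    in.seekg(size - n, std::ios::beg);
    std::string tail(static_cast<std::size_t>(n), '\0');
    in.read(&tail[0], n);
    if (in.gcount() != n)
        return false;

    return lastNonBlankLine(tail).rfind(expected, 0) == 0;  // prefix match
}
```

If the final line is longer than the tail, the check fails rather than giving a false positive, so a caller may want a larger `tailSize` for files with very long solid names.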
Answer 1 (score: 4)
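Is there an additional step that would let me be "absolutely sure" whether a file is ASCII or binary?
Since the STL specification has no format tag, you cannot be absolutely sure about the file format.
Checking for "solid" at the beginning of the file should be enough in most cases. In addition, you could look for further keywords such as "facet" or "vertex" to make sure it is ASCII. These words should only occur in the ASCII format (or in a useless binary header); the chance that binary floats coincidentally form these words is very small. You could also check whether the keywords appear in the correct order.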
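A keyword-order check might look roughly like the following. It is a sketch in standard C++ under the assumption that each keyword starts its own line, as in the usual ASCII STL layout (solid, then per facet: facet, outer loop, three vertex lines, endloop, endfacet, and finally endsolid). The function names are made up for illustration, and the numeric values on each line are not validated.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// First whitespace-delimited word of a line ("" for a blank line).
inline std::string firstWord(const std::string& line)
{
    std::istringstream words(line);
    std::string word;
    words >> word;
    return word;
}

// Check that the line-leading keywords follow the ASCII STL order:
// solid, then zero or more facets, then endsolid. Blank lines are skipped.
// The stream is read line by line, so the file is never fully in memory.
inline bool hasAsciiStlKeywordOrder(std::istream& in)
{
    static const char* const facetSequence[] = {
        "facet", "outer", "vertex", "vertex", "vertex", "endloop", "endfacet"
    };
    const std::size_t facetLength = sizeof(facetSequence) / sizeof(facetSequence[0]);

    std::string line;
    bool sawSolid = false;
    std::size_t step = 0;                  // position inside the current facet

    while (std::getline(in, line)) {
        const std::string word = firstWord(line);
        if (word.empty())
            continue;
        if (!sawSolid) {
            if (word != "solid")
                return false;
            sawSolid = true;
        } else if (step == 0 && word == "endsolid") {
            return true;                   // anything after endsolid is ignored
        } else if (word == facetSequence[step]) {
            step = (step + 1) % facetLength;
        } else {
            return false;                  // keyword out of order
        }
    }
    return false;                          // ended before a matching endsolid
}
```

It can be called with any `std::istream`, for example a `std::ifstream` opened on the candidate file.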
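And of course, check whether the length given in the binary header matches the file length.
But: your code would run faster if you read the file linearly and simply hope that nobody puts the word "solid" in the binary header. You may prefer to try ASCII parsing first if the file starts with "solid", and use the binary parser as a fallback if ASCII parsing fails.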