I am using the NPOI library to read xlsx and xls files.
I have this code:
IWorkbook workBook = null;
string fileExtension = Path.GetExtension(path);
using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    if (fileExtension == ".xls")
        workBook = new HSSFWorkbook(fs);
    else if (fileExtension == ".xlsx")
        workBook = new XSSFWorkbook(fs);
}
This works perfectly.
But the problem is that the path of the Excel file does not always have an extension (.xls or .xlsx) in its name.
So I need to check whether fs is suitable for HSSFWorkbook() or for XSSFWorkbook().
Any ideas on how to check this without the file extension?
Answer 0 (score: 2)
IWorkbook workBook = null;
using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    // WorkbookFactory inspects the stream contents, so the file extension is not needed.
    workBook = WorkbookFactory.Create(fs);
}
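The WorkbookFactory.Create() method constructs an IWorkbook from the FileStream parameter, whether the stream was opened on an xls or an xlsx file.
Answer 1 (score: 0)
Using the file header information from https://en.wikipedia.org/wiki/List_of_file_signatures, we can use something like the following: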
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

public static class FormatRecognizer
{
    // True when the stream starts with the zip local-file-header signature ("PK" 0x03 0x04).
    public static Boolean IsZipFile(Stream stream)
    {
        if (stream == null)
            throw new ArgumentNullException(paramName: nameof(stream));
        var zipHeader = new Byte[]
        {
            0x50, 0x4B, 0x03, 0x04
        };
        var streamBytes = GetBytesAndRestore(stream, zipHeader.Length);
        return streamBytes.SequenceEqual(zipHeader);
    }

    // True when the stream starts with the compound-file signature used by Office 97-2003 documents.
    public static Boolean IsOffice2003File(Stream stream)
    {
        if (stream == null)
            throw new ArgumentNullException(paramName: nameof(stream));
        var officeHeader = new Byte[]
        {
            0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1,
        };
        var streamBytes = GetBytesAndRestore(stream, officeHeader.Length);
        return streamBytes.SequenceEqual(officeHeader);
    }

    // Reads up to bytesCount bytes, then restores the original position (the stream must be seekable).
    private static IEnumerable<Byte> GetBytesAndRestore(Stream stream, Int32 bytesCount)
    {
        if (stream == null)
            throw new ArgumentNullException(paramName: nameof(stream));
        var position = stream.Position;
        try
        {
            using (var reader = new BinaryReader(stream, Encoding.Default, leaveOpen: true))
            {
                return reader.ReadBytes(bytesCount);
            }
        }
        finally
        {
            stream.Position = position;
        }
    }
}
...
private static void PrintFormatInfo(String path)
{
    Console.WriteLine("File at '{0}'", path);
    // Read-only access is enough here and also works for read-only files.
    using (var stream = File.Open(path, FileMode.Open, FileAccess.Read))
    {
        PrintFormatInfo(stream);
    }
}

private static void PrintFormatInfo(Stream stream)
{
    Console.WriteLine("Is office 2003 = {0}", FormatRecognizer.IsOffice2003File(stream));
    Console.WriteLine("Is zip file (possibly xlsx) = {0}", FormatRecognizer.IsZipFile(stream));
}
...
PrintFormatInfo("1.txt");
PrintFormatInfo("1.xls");
PrintFormatInfo("1.xlsx");
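This is not absolutely reliable, because IsZipFile returns true for any plain zip archive, and IsOffice2003File also succeeds for doc, ppt, and so on.
But it is the simplest solution I could come up with. Anything more correct requires deeper knowledge of the file formats, which may or may not be what you need.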
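One way to narrow the zip case a little, offered only as a rough sketch: open the stream as a zip archive and look for the workbook part. The class name XlsxProbe and the method LooksLikeXlsx are hypothetical helpers, not part of NPOI. The entry name "xl/workbook.xml" is an assumption based on where Excel normally writes the workbook part; the package format does not strictly require that name, so a valid xlsx produced by another tool could still be rejected.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper (not part of NPOI): a slightly stricter check than the zip signature alone.
public static class XlsxProbe
{
    // Returns true when the stream is a readable zip archive that contains an
    // "xl/workbook.xml" entry (assumed location of the workbook part).
    public static bool LooksLikeXlsx(Stream stream)
    {
        if (stream == null)
            throw new ArgumentNullException(nameof(stream));
        if (!stream.CanSeek)
            throw new ArgumentException("A seekable stream is required.", nameof(stream));

        var position = stream.Position;
        try
        {
            // leaveOpen: true keeps the caller's stream usable afterwards.
            using (var archive = new ZipArchive(stream, ZipArchiveMode.Read, leaveOpen: true))
            {
                return archive.Entries.Any(e =>
                    string.Equals(e.FullName, "xl/workbook.xml", StringComparison.OrdinalIgnoreCase));
            }
        }
        catch (InvalidDataException)
        {
            // The stream is not a readable zip archive at all.
            return false;
        }
        finally
        {
            // Put the stream back where it was so it can still be handed to a workbook constructor.
            stream.Position = position;
        }
    }
}
```

A caller could try IsOffice2003File first, then LooksLikeXlsx, and treat anything else as unsupported. The same caveat applies as above: a compound file that passes the first check may still be a doc or ppt rather than an xls.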