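I am trying to extract the files embedded in a Word document (Word, Excel, and Package objects). I cannot extract the Package objects and save them using C# and the Open XML SDK.
The following code extracts only the embedded Word and Excel files, not the packages.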
using (WordprocessingDocument document = WordprocessingDocument.Open(fileName, false))
{
    foreach (EmbeddedPackagePart pkgPart in document.MainDocumentPart.GetPartsOfType<EmbeddedPackagePart>())
    {
        // embeddingPartString is defined elsewhere in my code (not shown here)
        if (pkgPart.Uri.ToString().StartsWith(embeddingPartString))
        {
            string fileName1 = pkgPart.Uri.ToString().Remove(0, embeddingPartString.Length);
            // get the stream from the part
            System.IO.Stream partStream = pkgPart.GetStream();
            string filePath = "D:\\Test\\" + fileName1;
            // write the stream to the file
            System.IO.FileStream writeStream = new System.IO.FileStream(filePath, FileMode.Create, FileAccess.Write);
            // ReadWriteStream is my own helper (not shown here)
            ReadWriteStream(partStream, writeStream);
        }
    }
}
Answer 0 (score: 0)
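The problem you are running into is that when you go through MainDocumentPart.Parts and start searching, what you get back are parts such as ImagePart, ChartPart, and so on. A ChartPart may in turn own its own embedded parts, and those may be the Excel or Word files you are looking for.
In short, you need to extend the search for embedded parts to the actual parts inside the main document, not just the main document part itself.
If I just wanted to extract all the embedded parts from one of the files in my own project, this is how I would do it.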
// Needs: using System.IO; using System.Linq; using DocumentFormat.OpenXml.Packaging;
using (var document = WordprocessingDocument.Open(@"C:\Test\myTestDocument.docx", false))
{
    // Just grab all the parts. It might be worth being a bit more clever about it,
    // depending on the size of the files and how many files you want to search through.
    foreach (var part in document.MainDocumentPart.Parts)
    {
        // For each part, see whether that part contains an EmbeddedPackagePart
        var testForEmbedding = part.OpenXmlPart.GetPartsOfType<EmbeddedPackagePart>();
        foreach (EmbeddedPackagePart embedding in testForEmbedding)
        {
            // You should probably insert some clever naming scheme here
            string fileName = embedding.Uri.OriginalString.Split('/').Last();
            // Stream the EmbeddedPackagePart to a file; the using blocks close both streams
            using (FileStream myFile = File.Create(@"C:\test\" + fileName))
            using (var stream = embedding.GetStream())
            {
                stream.Seek(0, SeekOrigin.Begin);
                stream.CopyTo(myFile);
            }
        }
    }
}
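If the embeddings can sit more than one level down, the same idea can be written as a recursive walk over the part tree. The following is only a sketch under some assumptions: the names EmbeddingExtractor, AllParts, and ExtractAll are made up for illustration, and it also collects EmbeddedObjectPart on the guess that the "Package" objects in your document are stored as OLE objects (typically oleObjectN.bin) rather than as EmbeddedPackagePart. I have not checked that against your file.

```csharp
using System.Collections.Generic;
using System.IO;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper class, not part of the Open XML SDK.
static class EmbeddingExtractor
{
    // Recursively yields every part reachable from the given container.
    // The visited set stops a part that is referenced twice from being returned twice.
    static IEnumerable<OpenXmlPart> AllParts(OpenXmlPartContainer container, HashSet<string> visited)
    {
        foreach (IdPartPair pair in container.Parts)
        {
            OpenXmlPart part = pair.OpenXmlPart;
            if (!visited.Add(part.Uri.OriginalString))
                continue;

            yield return part;
            foreach (OpenXmlPart child in AllParts(part, visited))
                yield return child;
        }
    }

    // Copies every embedded package or embedded OLE object part into outputDir.
    public static void ExtractAll(string docxPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);

        using (var document = WordprocessingDocument.Open(docxPath, false))
        {
            var visited = new HashSet<string>();
            foreach (OpenXmlPart part in AllParts(document.MainDocumentPart, visited))
            {
                // Assumption: "Package" objects show up as EmbeddedObjectPart.
                if (!(part is EmbeddedPackagePart) && !(part is EmbeddedObjectPart))
                    continue;

                // Use the last segment of the part URI as the file name.
                string fileName = Path.GetFileName(part.Uri.OriginalString);

                using (Stream source = part.GetStream(FileMode.Open, FileAccess.Read))
                using (FileStream target = File.Create(Path.Combine(outputDir, fileName)))
                {
                    source.CopyTo(target);
                }
            }
        }
    }
}
```

You would call it as EmbeddingExtractor.ExtractAll(@"C:\Test\myTestDocument.docx", @"C:\Test\out"). Keep in mind that a .bin written this way is the raw OLE container, so the original file inside it may still need to be unwrapped separately.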
I hope this helps!