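I'm trying to parse a lengthy file and remove the parts I don't want. From my research, the OpenXml SDK seems to be the easiest library for manipulating and searching a Word doc. Unfortunately it isn't always consistent, because I keep getting NullReferenceExceptions when I try to assign a node to a Run object.

Essentially, my program should go through the docx file, find the tag (Ver 1), and then delete everything between it and the end tag (/Ver 1). This only seems to work for some sections; for others I get a NullReferenceException. I suspect it has something to do with the messy formatting MS Word uses, but I don't know.

Below is the code for the relevant section. I'd appreciate any help.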
    IEnumerable<OpenXmlElement> elem = main.Document.Body.Descendants().ToList();
    foreach (OpenXmlElement elems in elem)
    {
        if (elems is Text && elems.InnerText == s_Ver1)// s_Ver1 = "(Ver 1)"
        {
            Run run = (Run)elems.Parent;
            Paragraph p = (Paragraph)run.Parent;
            p.RemoveAllChildren();
            p.Remove();
            foreach (OpenXmlElement endelems in elem)
            {
                if (endelems is Text && elems.InnerText == e_Ver1)//e_Ver1 = "(/Ver1)"
                {
                    run = (Run)endelems.Parent;
                    p = (Paragraph)run.Parent;
                    p.Remove();
                    break;
                }
                else
                {
                    /*Run d_Run = (Run)endelems.Parent;
                    Paragraph d_p = (Paragraph)d_Run.Parent;
                    d_p.RemoveAllChildren();
                    d_p.Remove();*/
                    try
                    {
                        endelems.Remove();
                    }
                    catch(Exception err)
                    {
                        MessageBox.Show(err.ToString());
                    }
                }
            }
        }
    }
Edit

With a try/catch in the code (around endelems.Remove()):

    System.InvalidOperationException: The Parent of this element is Null
    //it also says line 141 but I'm not sure how to get line numbering in vs2010

With a try/catch around the whole thing:

    System.NullReferenceException: Object reference not set to an instance of an object
    //line 114 which would be Paragraph p = (Paragraph)run.Parent; line
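Answer 0 (score: 1)

I'm not quite sure what you're trying to do here, but...

You get a static list of the body's descendants.

You then iterate over elements that may already have been removed, and call Remove() on elements that were already detached by RemoveAllChildren().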
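To make that failure mode concrete, here is a minimal in-memory sketch (no docx file needed). The class and method names are made up for illustration; only the OpenXml types and calls come from the SDK. It takes a snapshot the same way the question does, removes a paragraph, and then hits the same paragraph again through the stale snapshot.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class StaleSnapshotDemo
{
    // Hypothetical demo method: returns true if removing an element that is
    // already detached throws InvalidOperationException, the error reported
    // in the question.
    public static bool SecondRemoveThrows()
    {
        var body = new Body(new Paragraph(new Run(new Text("(Ver 1)"))));

        // Snapshot taken before any removal, like the ToList() in the question.
        var snapshot = body.Descendants().ToList();
        Paragraph p = snapshot.OfType<Paragraph>().First();

        p.Remove(); // p.Parent is now null, but p is still in the snapshot

        try
        {
            foreach (OpenXmlElement e in snapshot)
            {
                if (e is Paragraph) e.Remove(); // second removal of the same node
            }
        }
        catch (InvalidOperationException)
        {
            return true;
        }
        return false;
    }
}
```

One way around it, if you keep the snapshot, is to check that an element's Parent is not null before calling Remove() on it.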
Not to mention this faulty logic:

    if (endelems is Text && elems.InnerText == e_Ver1)//e_Ver1 = "(/Ver1)"
    {
        ...
    }
    else
    {
        Run d_Run = (Run)endelems.Parent;
    }
In the else clause, endelems may not have a parent Run, because it may not be a Text element.
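A rough sketch of how the hard casts could be avoided, assuming you only care about Text nodes and the paragraph that contains them. FindParagraph and SafeLookup are hypothetical names, not part of the SDK. Ancestors<T>() and FirstOrDefault() are used so that nothing is cast blindly.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class SafeLookup
{
    // Hypothetical helper: returns the paragraph that contains a Text node,
    // or null when the element is not a Text or has no Paragraph ancestor.
    public static Paragraph FindParagraph(OpenXmlElement element)
    {
        if (!(element is Text))
        {
            return null; // e.g. a Paragraph, whose Parent is the Body, not a Run
        }
        // Ancestors<T>() yields nothing for a detached node, so no cast can fail.
        return element.Ancestors<Paragraph>().FirstOrDefault();
    }
}
```

The caller then has a single null check to make instead of two casts that can each fail in a different way.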
---EDIT--- Pseudocode

    // Needs: using System.Collections.Generic; using System.Linq;
    //        using DocumentFormat.OpenXml.Wordprocessing;
    // Snapshot the Text nodes first so that removing nodes does not disturb the enumeration.
    List<Text> texts = wd.MainDocumentPart.Document.Body.Descendants<Text>().ToList();
    bool deleting = false;
    foreach (Text text in texts)
    {
        if (!deleting)
        {
            // Start marker found: remove every Text after it until the end marker.
            if (text.InnerText.Equals("Ver 1"))
            {
                deleting = true;
            }
        }
        else if (text.InnerText.Equals("Ver 2"))
        {
            break; // end marker reached, stop removing
        }
        else
        {
            text.Remove();
        }
    }
    // Remove runs left with no text and no break. ToList() fixes the set before anything is removed.
    foreach (Run run in wd.MainDocumentPart.Document.Body.Descendants<Run>()
        .Where(r => !r.Descendants<Text>().Any() && !r.Descendants<Break>().Any()).ToList())
    {
        run.Remove();
    }
    // Remove paragraphs left with no runs and no tables.
    foreach (Paragraph par in wd.MainDocumentPart.Document.Body.Descendants<Paragraph>()
        .Where(p => !p.Descendants<Run>().Any() && !p.Descendants<Table>().Any()).ToList())
    {
        par.Remove();
    }
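If whole paragraphs can go, a paragraph-level variant might be simpler. This is only a sketch under a few assumptions: the markers each sit in their own top-level paragraph, the marker paragraphs themselves should be deleted too, and MarkerRemover / RemoveBetweenMarkers are hypothetical names. It compares against InnerText of each top-level element, which concatenates the text of all its runs, so a marker that Word has split across several runs is still found.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

static class MarkerRemover
{
    // Hypothetical helper: removes every top-level body element from the one
    // containing startMarker through the one containing endMarker (inclusive).
    // Returns the number of elements removed.
    public static int RemoveBetweenMarkers(Body body, string startMarker, string endMarker)
    {
        bool deleting = false;
        int removed = 0;

        // Elements() is snapshotted so removal does not disturb the loop.
        foreach (OpenXmlElement child in body.Elements().ToList())
        {
            // Keep the section properties even if the end marker is missing.
            if (child is SectionProperties) continue;

            string text = child.InnerText;
            if (!deleting && text.Contains(startMarker))
            {
                deleting = true;
            }
            if (!deleting) continue;

            bool isEnd = text.Contains(endMarker);
            child.Remove();
            removed++;
            if (isEnd) break;
        }
        return removed;
    }
}

// Possible usage (needs using DocumentFormat.OpenXml.Packaging; path is a placeholder):
// using (WordprocessingDocument doc = WordprocessingDocument.Open(path, true))
// {
//     MarkerRemover.RemoveBetweenMarkers(doc.MainDocumentPart.Document.Body, "(Ver 1)", "(/Ver1)");
//     doc.MainDocumentPart.Document.Save();
// }
```

Note that if the end marker is never found, this removes everything from the start marker to the end of the body (apart from the section properties), so you may want to check for both markers first.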