Replace bookmark text in a Word file using the Open XML SDK

Asked: 2010-07-22 11:27:02

Tags: c# ms-word openxml openxml-sdk

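I assume v2.0 is better... they have some nice "How to: ..." examples, but bookmarks don't seem to be as obvious to work with as, say, tables. A bookmark is defined by two XML elements, BookmarkStart and BookmarkEnd. We have some templates in which text is marked with bookmarks, and we simply want to replace the bookmarked text with other text. No fancy formatting is involved, but how do I select and replace the bookmark text?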

11 answers:

Answer 0 (score: 16)

Answer 1 (score: 6)

Replace a bookmark with a single piece of content (the bookmark may currently span multiple text blocks).

public static void InsertIntoBookmark(BookmarkStart bookmarkStart, string text)
{
    OpenXmlElement elem = bookmarkStart.NextSibling();

    while (elem != null && !(elem is BookmarkEnd))
    {
        OpenXmlElement nextElem = elem.NextSibling();
        elem.Remove();
        elem = nextElem;
    }

    bookmarkStart.Parent.InsertAfter<Run>(new Run(new Text(text)), bookmarkStart);
}

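First, the existing content between the start and the end is removed. Then a new run is added directly after the start (and therefore before the end).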

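However, I'm not sure what happens when a bookmark is opened in one section and closed in another, or when it starts and ends in different table cells, etc.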

For me it is sufficient for now.
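
A minimal usage sketch for the method above. The wrapper class `BookmarkUsage` and the method `Fill` are hypothetical names introduced here for illustration, and the sketch assumes the bookmark's start and end are siblings in the same paragraph, which is the only case `InsertIntoBookmark` handles.

```csharp
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BookmarkUsage
{
    // Copied from the answer above: clears the siblings between the
    // bookmark start and end, then inserts a single new run.
    public static void InsertIntoBookmark(BookmarkStart bookmarkStart, string text)
    {
        OpenXmlElement elem = bookmarkStart.NextSibling();
        while (elem != null && !(elem is BookmarkEnd))
        {
            OpenXmlElement nextElem = elem.NextSibling();
            elem.Remove();
            elem = nextElem;
        }
        bookmarkStart.Parent.InsertAfter<Run>(new Run(new Text(text)), bookmarkStart);
    }

    // Hypothetical helper: opens the file, finds the bookmark by name, fills it.
    // Returns false when no bookmark with that name exists in the main part.
    public static bool Fill(string path, string bookmarkName, string text)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, true))
        {
            BookmarkStart start = doc.MainDocumentPart.Document
                .Descendants<BookmarkStart>()
                .FirstOrDefault(b => b.Name != null && b.Name.Value == bookmarkName);
            if (start == null)
            {
                return false;
            }

            InsertIntoBookmark(start, text);
            doc.MainDocumentPart.Document.Save();
            return true;
        }
    }
}
```

A call would look like `BookmarkUsage.Fill("template.docx", "CustomerName", "Alice")`, where the file name and bookmark name are placeholders.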

Answer 2 (score: 4)

I figured this out just 10 minutes ago, so please forgive the hackish nature of the code.

First I wrote a recursive helper function to find all the bookmarks:

private static Dictionary<string, BookmarkEnd> FindBookmarks(OpenXmlElement documentPart, Dictionary<string, BookmarkEnd> results = null, Dictionary<string, string> unmatched = null )
{
    results = results ?? new Dictionary<string, BookmarkEnd>();
    unmatched = unmatched ?? new Dictionary<string,string>();

    foreach (var child in documentPart.Elements())
    {
        if (child is BookmarkStart)
        {
            var bStart = child as BookmarkStart;
            unmatched.Add(bStart.Id, bStart.Name);
        }

        if (child is BookmarkEnd)
        {
            var bEnd = child as BookmarkEnd;
            foreach (var orphanName in unmatched)
            {
                if (bEnd.Id == orphanName.Key)
                    results.Add(orphanName.Value, bEnd);
            }
        }

        FindBookmarks(child, results, unmatched);
    }

    return results;
}

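This returns a dictionary (bookmark name to BookmarkEnd) that I can use to go through my replacement list and add the text after each bookmark: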

var bookMarks = FindBookmarks(doc.MainDocumentPart.Document);

foreach( var end in bookMarks )
{
    var textElement = new Text("asdfasdf");
    var runElement = new Run(textElement);

    end.Value.InsertAfterSelf(runElement);
}

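From what I can tell, inserting into a bookmark and replacing its content looks harder. When I used InsertAt instead of InsertAfterSelf, I got: "Non-composite elements do not have child elements." YMMV.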

Answer 3 (score: 3)

After a lot of time, I wrote this method:

    public static void ReplaceBookmarkParagraphs(WordprocessingDocument doc, string bookmark, string text)
    {
        //Find all Paragraph with 'BookmarkStart' 
        var t = (from el in doc.MainDocumentPart.RootElement.Descendants<BookmarkStart>()
                 where (el.Name == bookmark) &&
                 (el.NextSibling<Run>() != null)
                 select el).First();
        //Take ID value
        var val = t.Id.Value;
        //Find the next sibling 'text'
        OpenXmlElement next = t.NextSibling<Run>();
        //Set text value
        next.GetFirstChild<Text>().Text = text;

        //Delete all bookmarkEnd node, until the same ID
        deleteElement(next.GetFirstChild<Text>().Parent, next.GetFirstChild<Text>().NextSibling(), val, true);
    }

It calls this helper:

public static bool deleteElement(OpenXmlElement parentElement, OpenXmlElement elem, string id, bool seekParent)
{
    bool found = false;

    //Loop until I find BookmarkEnd or null element
    while (!found && elem != null && (!(elem is BookmarkEnd) || (((BookmarkEnd)elem).Id.Value != id)))
    {
        if (elem.ChildElements != null && elem.ChildElements.Count > 0)
        {
            found = deleteElement(elem, elem.FirstChild, id, false);
        }

        if (!found)
        {
            OpenXmlElement nextElem = elem.NextSibling();
            elem.Remove();
            elem = nextElem;
        }
    }

    if (!found)
    {
        if (elem == null)
        {
            if (!(parentElement is Body) && seekParent)
            {
                //Try to find bookmarkEnd in Sibling nodes
                found = deleteElement(parentElement.Parent, parentElement.NextSibling(), id, true);
            }
        }
        else
        {
            if (elem is BookmarkEnd && ((BookmarkEnd)elem).Id.Value == id)
            {
                found = true;
            }
        }
    }

    return found;
}

This code works well as long as you have no empty bookmarks. I hope it helps someone.

Answer 4 (score: 2)

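Most solutions here assume a regular bookmark pattern in which the bookmark starts before a run and ends after it. That is not always true, for example when the bookmark starts in one paragraph or table and ends somewhere in another paragraph (as others have noted).

How about using document order to cope with bookmarks that are not placed in a regular structure? Document order will still find all the relevant text nodes in between, which can then be replaced. Just do root.DescendantNodes().Where(XText or bookmarkStart or bookmarkEnd), which traverses in document order; you can then replace the text nodes that appear after the bookmark start node has been seen but before the end node is seen.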
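
The document-order idea can be sketched with LINQ to XML as follows. This is an illustration under stated assumptions, not the answerer's code: the class and method names are made up, the sketch filters on `w:t` elements rather than raw XText nodes, it puts the new text into the first `w:t` inside the bookmark and blanks the rest, and it works on an in-memory XML fragment instead of a real package part.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DocumentOrderBookmarks
{
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Walks the tree in document order and rewrites every w:t element that
    // lies between the named bookmarkStart and its matching bookmarkEnd.
    public static void ReplaceBookmarkText(XElement root, string bookmarkName, string newText)
    {
        string openId = null;   // id of the bookmark we are inside, if any
        bool written = false;   // true once newText has been placed

        // Descendants() yields elements in document order.
        foreach (XElement el in root.Descendants().ToList())
        {
            if (el.Name == W + "bookmarkStart"
                && (string)el.Attribute(W + "name") == bookmarkName)
            {
                openId = (string)el.Attribute(W + "id");
            }
            else if (el.Name == W + "bookmarkEnd" && openId != null
                && (string)el.Attribute(W + "id") == openId)
            {
                break;          // matching end reached, stop
            }
            else if (el.Name == W + "t" && openId != null)
            {
                el.Value = written ? string.Empty : newText;
                written = true;
            }
        }
    }

    public static void Main()
    {
        // The bookmark starts in the first paragraph and ends in the second.
        XElement body = XElement.Parse(
            "<w:body xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>" +
            "<w:p><w:r><w:t>Dear </w:t></w:r><w:bookmarkStart w:id='0' w:name='who'/>" +
            "<w:r><w:t>NAME</w:t></w:r></w:p>" +
            "<w:p><w:r><w:t>HERE</w:t></w:r><w:bookmarkEnd w:id='0'/>" +
            "<w:r><w:t>!</w:t></w:r></w:p>" +
            "</w:body>");

        ReplaceBookmarkText(body, "who", "Alice");

        // prints "Dear Alice!"
        Console.WriteLine(string.Concat(body.Descendants(W + "t").Select(t => t.Value)));
    }
}
```

Note that an empty bookmark contains no `w:t` element, so this sketch leaves it untouched; inserting a new run would be needed for that case.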

Answer 5 (score: 1)

Here is how I do it in VB to add or replace text between bookmarkStart and bookmarkEnd. The bookmark in the document XML looks like this:

<w:bookmarkStart w:name="forbund_kort" w:id="0" />
<w:r>
  <w:t>forbund_kort</w:t>
</w:r>
<w:bookmarkEnd w:id="0" />


Imports DocumentFormat.OpenXml.Packaging
Imports DocumentFormat.OpenXml.Wordprocessing

    Public Class PPWordDocx

        Public Sub ChangeBookmarks(ByVal path As String)
            ' Using closes the document even if an exception is thrown.
            Using doc As WordprocessingDocument = WordprocessingDocument.Open(path, True)
                ' Index every BookmarkStart in the main document part by bookmark name.
                Dim bookmarkMap As IDictionary(Of String, BookmarkStart) = New Dictionary(Of String, BookmarkStart)()
                For Each bs As BookmarkStart In doc.MainDocumentPart.RootElement.Descendants(Of BookmarkStart)()
                    bookmarkMap(bs.Name) = bs
                Next
                For Each bs As BookmarkStart In bookmarkMap.Values
                    Dim bsText As DocumentFormat.OpenXml.OpenXmlElement = bs.NextSibling()
                    If bsText IsNot Nothing Then
                        If TypeOf bsText Is BookmarkEnd Then
                            ' Empty bookmark: add a run with text right after the start
                            bs.Parent.InsertAfter(New Run(New Text(bs.Name)), bs)
                        ElseIf TypeOf bsText Is Run Then
                            ' Change the bookmark text (here simply to the bookmark's name)
                            If bsText.GetFirstChild(Of Text)() Is Nothing Then
                                bsText.InsertAt(New Text(bs.Name), 0)
                            End If
                            bsText.GetFirstChild(Of Text)().Text = bs.Name
                        End If
                    End If
                Next
                doc.MainDocumentPart.RootElement.Save()
            End Using
        End Sub

    End Class

Answer 6 (score: 1)

I took the code from the answer and ran into several problems with exceptional cases:

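  1. You might want to ignore hidden bookmarks. A bookmark is hidden if its name starts with _ (underscore).
  2. If the bookmark covers one or more TableCells, you will find it in the BookmarkStart in the first Cell, with the property ColumnFirst referring to the 0-based column index of the cell where the bookmark starts. ColumnLast refers to the cell where the bookmark ends; in my special case it was always ColumnFirst == ColumnLast (the bookmarks marked only one column). In this case you also won't find a BookmarkEnd.
  3. Bookmarks can be empty, so that a BookmarkEnd directly follows the BookmarkStart. In that case you can just call bookmarkStart.Parent.InsertAfter(new Run(new Text("Hello World")), bookmarkStart)
  4. A bookmark can also contain many Text elements, so you might want to remove all the other elements. Otherwise part of the bookmark may be replaced while the following parts stay.
  5. I'm not sure whether my last hack is necessary, since I don't know all the limitations of OpenXML, but after discovering the previous 4 I no longer trusted that there would be a sibling Run with a child Text. So instead I just look at all the siblings (until the BookmarkEnd that has the same ID as the BookmarkStart) and check all their children until I find any Text. Maybe somebody with more OpenXML experience can answer whether this is necessary?

You can view my specific implementation here.

Hope this helps some of you who ran into the same issues.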
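
The points above can be sketched roughly as follows. This is not the answerer's linked implementation; the class name `BookmarkEdgeCases` and the method `ReplaceText` are hypothetical, and the sketch deliberately skips hidden and table-column bookmarks instead of trying to handle them. It also assumes the BookmarkEnd is a sibling of the BookmarkStart.

```csharp
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class BookmarkEdgeCases
{
    // Returns false when the bookmark is skipped: hidden, a table-column
    // bookmark, or one whose end is not a sibling of its start.
    public static bool ReplaceText(BookmarkStart start, string text)
    {
        string name = start.Name != null ? start.Name.Value : null;
        if (name == null || name.StartsWith("_"))
        {
            return false;                       // point 1: hidden bookmark
        }
        if (start.ColumnFirst != null)
        {
            return false;                       // point 2: column bookmark, not handled
        }

        // Collect the siblings up to the BookmarkEnd with the same id.
        var between = new List<OpenXmlElement>();
        OpenXmlElement el = start.NextSibling();
        while (el != null)
        {
            var end = el as BookmarkEnd;
            if (end != null && end.Id != null && start.Id != null
                && end.Id.Value == start.Id.Value)
            {
                break;
            }
            between.Add(el);
            el = el.NextSibling();
        }
        if (el == null)
        {
            return false;                       // matching end is not a sibling
        }

        // Point 5: look through all children of the siblings for Text elements.
        List<Text> texts = between.SelectMany(e => e.Descendants<Text>()).ToList();
        if (texts.Count == 0)
        {
            // Point 3: empty bookmark, insert a new run after the start.
            start.Parent.InsertAfter(new Run(new Text(text)), start);
            return true;
        }

        // Point 4: put the new text into the first Text and blank the others.
        texts[0].Text = text;
        foreach (Text other in texts.Skip(1))
        {
            other.Text = string.Empty;
        }
        return true;
    }
}
```

Blanking the remaining Text elements, rather than removing their runs, keeps the surrounding structure intact; whether that is preferable to deleting the runs depends on the template.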

Answer 7 (score: 1)

I needed to replace the text of a bookmark (the bookmark is named "Table") with a table. This is my approach:

public void ReplaceBookmark( DataSet ds )
{
    // myDoc is assumed to be an already opened WordprocessingDocument field
    MainDocumentPart mainPart = myDoc.MainDocumentPart;
    Body body = mainPart.Document.GetFirstChild<Body>();
    var bookmark = body.Descendants<BookmarkStart>()
                        .Where( o => o.Name == "Table" )
                        .FirstOrDefault();
    if (bookmark == null || ds == null)
    {
        return; // nothing to replace
    }
    var parent = bookmark.Parent; //bookmark's parent element
    parent.InsertAfterSelf( DatasetToTable( ds ) );
    parent.Remove();
    mainPart.Document.Save();
}


public Table DatasetToTable( DataSet ds )
{
    Table table = new Table();
    //creating table;
    return table;
}

Hope this helps.

Answer 8 (score: 0)

Here is how I did it in VB.NET:

For Each curBookMark In contractBookMarkStarts

      ' Get the "Run" immediately following the bookmark and then
      ' get the Run's "Text" field
      runAfterBookmark = curBookMark.NextSibling(Of Wordprocessing.Run)()
      textInRun = runAfterBookmark.LastChild

      ' Decode the bookmark to a contract attribute
      lines = DecodeContractDataToContractDocFields(curBookMark.Name, curContract).Split(vbCrLf)

      ' If there are multiple lines returned then some work needs to be done to create
      ' the necessary Run/Text fields to hold lines 2 thru n.  If just one line then set the
      ' Text field to the attribute from the contract
      For ptr = 0 To lines.Count - 1
          line = lines(ptr)
          If ptr = 0 Then
              textInRun.Text = line.Trim()
          Else
              ' Add a <br> run/text component then add next line
              newRunForLf = New Run(runAfterBookmark.OuterXml)
              newRunForLf.LastChild.Remove()
              newBreak = New Break()
              newRunForLf.Append(newBreak)

              newRunForText = New Run(runAfterBookmark.OuterXml)
              DirectCast(newRunForText.LastChild, Text).Text = line.Trim

              curBookMark.Parent.Append(newRunForLf)
              curBookMark.Parent.Append(newRunForText)
          End If
      Next
Next

Answer 9 (score: 0)

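The accepted answer and some of the others make assumptions about where the bookmarks sit in the document structure. Here is my C# code, which can deal with bookmarks that stretch across multiple paragraphs and correctly replaces bookmarks that do not start and end at paragraph boundaries. Still not perfect, but closer... hope it's useful. Please edit if you find more ways to improve it!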

    private static void ReplaceBookmarkParagraphs(MainDocumentPart doc, string bookmark, IEnumerable<OpenXmlElement> paras) {
        var start = doc.Document.Descendants<BookmarkStart>().Where(x => x.Name == bookmark).First();
        var end = doc.Document.Descendants<BookmarkEnd>().Where(x => x.Id.Value == start.Id.Value).First();
        OpenXmlElement current = start;
        var done = false;

        while ( !done && current != null ) {
            OpenXmlElement next;
            next = current.NextSibling();

            if ( next == null ) {
                var parentNext = current.Parent.NextSibling();
                while ( !parentNext.HasChildren ) {
                    var toRemove = parentNext;
                    parentNext = parentNext.NextSibling();
                    toRemove.Remove();
                }
                next = current.Parent.NextSibling().FirstChild;

                current.Parent.Remove();
            }

            if ( next is BookmarkEnd ) {
                BookmarkEnd maybeEnd = (BookmarkEnd)next;
                if ( maybeEnd.Id.Value == start.Id.Value ) {
                    done = true;
                }
            }
            if ( current != start ) {
                current.Remove();
            }

            current = next;
        }

        foreach ( var p in paras ) {
            end.Parent.InsertBeforeSelf(p);
        }
    }

Answer 10 (score: 0)

This is what I ended up with. It is not 100% perfect, but it works for simple bookmarks and simple text insertion:

private void FillBookmarksUsingOpenXml(string sourceDoc, string destDoc, Dictionary<string, string> bookmarkData)
    {
        // Make a copy of the template file.
        File.Copy(sourceDoc, destDoc, true);

        // Open the copy for editing and get the main document part.
        using (WordprocessingDocument wordPackage = WordprocessingDocument.Open(destDoc, true))
        {
            MainDocumentPart part = wordPackage.MainDocumentPart;

            // Snapshot the bookmarks first, because the loop below modifies the tree.
            var bookmarks = part.Document.Body.Descendants<BookmarkStart>().ToList();

            // Iterate through the replacement values.
            foreach (KeyValuePair<string, string> bookmarkDataVal in bookmarkData)
            {
                foreach (var bookmark in bookmarks)
                {
                    if (bookmark.Name == bookmarkDataVal.Key)
                    {
                        Run bookmarkText = bookmark.NextSibling<Run>();
                        Text existing = bookmarkText == null ? null : bookmarkText.GetFirstChild<Text>();
                        if (existing != null)  // if the bookmark has text replace it
                        {
                            existing.Text = bookmarkDataVal.Value;
                        }
                        else  // otherwise append a new run to the bookmark's parent
                        {
                            var parent = bookmark.Parent;   // bookmark's parent element

                            Text text = new Text(bookmarkDataVal.Value);
                            Run run = new Run(new RunProperties());
                            run.Append(text);
                            // added as the last child of the bookmark's parent
                            parent.Append(run);
                        }

                        //bookmark.Remove();    // we don't want the bookmark anymore
                    }
                }
            }

            // Write the changes back to the document part. The original version
            // saved an unmodified XmlDocument copy here, which did not contain the edits.
            part.Document.Save();
        }
    }