Question

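I have created a docx file from a Word template. I am now opening the copied docx file and want to replace certain text in it with other data.

I can't work out how to access the text from the document's main part.

Any help would be appreciated.

Below is my code.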

private void CreateSampleWordDocument()
    {
        //string sourceFile = Path.Combine("D:\\GeneralLetter.dot");
        //string destinationFile = Path.Combine("D:\\New.doc");
        string sourceFile = Path.Combine("D:\\GeneralWelcomeLetter.docx");
        string destinationFile = Path.Combine("D:\\New.docx");
        try
        {
            // Create a copy of the template file and open the copy
            File.Copy(sourceFile, destinationFile, true);
            using (WordprocessingDocument document = WordprocessingDocument.Open(destinationFile, true))
            {
                // Change the document type to Document
                document.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
                //Get the Main Part of the document
                MainDocumentPart mainPart = document.MainDocumentPart;
                mainPart.Document.Save();
            }
        }
        catch
        {
        }
    }

Now, how do I find certain text and replace it? I couldn't work it out from the link, so some code hints would be appreciated.

Answer 1

To give you an idea of how to do it, try:

    using (WordprocessingDocument doc =
               WordprocessingDocument.Open(@"yourpath\testdocument.docx", true))
    {
        var body = doc.MainDocumentPart.Document.Body;
        var paras = body.Elements<Paragraph>();

        // Walk paragraph -> run -> text and replace inside each text node
        foreach (var para in paras)
        {
            foreach (var run in para.Elements<Run>())
            {
                foreach (var text in run.Elements<Text>())
                {
                    if (text.Text.Contains("text-to-replace"))
                    {
                        text.Text = text.Text.Replace("text-to-replace", "replaced-text");
                    }
                }
            }
        }
    }

Note that the text match is case-sensitive. The replacement does not change the text formatting. Hope this helps.
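If a case-insensitive match is needed, one possible variation is to swap string.Replace for a regex. This is only a minimal sketch and not part of the original answer: the class and method names (TextReplace, ReplaceIgnoreCase) are made up for illustration. Regex.Escape keeps the search text literal, and the MatchEvaluator keeps "$" in the replacement from being read as a substitution pattern.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, named here for illustration only.
public static class TextReplace
{
    // Replaces every occurrence of 'find' in 'input', ignoring case.
    public static string ReplaceIgnoreCase(string input, string find, string replacement)
    {
        // Escape the search text so characters like '[' or '.' are matched literally.
        string pattern = Regex.Escape(find);

        // The evaluator returns the replacement verbatim for each match.
        return Regex.Replace(input, pattern, _ => replacement, RegexOptions.IgnoreCase);
    }
}
```

Inside the innermost loop this could be used as text.Text = TextReplace.ReplaceIgnoreCase(text.Text, "text-to-replace", "replaced-text");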

Answer 2

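In addition to Flowerking's answer:

When your Word file contains text boxes, his solution will not work. Because a text box has a TextBoxContent element, its text never shows up in the foreach loop over the Runs.

But if you write

    using (WordprocessingDocument doc =
               WordprocessingDocument.Open(@"yourpath\testdocument.docx", true))
    {
        var document = doc.MainDocumentPart.Document;

        foreach (var text in document.Descendants<Text>()) // <<< Here
        {
            if (text.Text.Contains("text-to-replace"))
            {
                text.Text = text.Text.Replace("text-to-replace", "replaced-text");
            }
        }
    }

it will loop over all the text in the document (whether it is in a text box or not), so it will replace the text.

Note that this will still not work if the text is split across runs or text boxes. For those cases you need a better solution.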

Answer 3

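Maybe this solution is easier:
1. A StreamReader reads all the text,
2. a Regex replaces the old text with the new text, case-insensitively,
3. a StreamWriter writes the modified text back to the document.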

    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
    {
        string docText = null;
        using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
            docText = sr.ReadToEnd();

        // findesReplaces is assumed to map each search pattern (Key) to its
        // replacement (Value), e.g. a Dictionary<string, string>.
        foreach (var t in findesReplaces)
            docText = new Regex(t.Key, RegexOptions.IgnoreCase).Replace(docText, t.Value);

        using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
            sw.Write(docText);
    }
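One thing to keep in mind with this approach is that the regex runs over the raw XML of the main part, so replacement text containing characters such as & or < would produce invalid XML. A hedged sketch of a helper that escapes the replacement first is shown below; RawXmlReplace and ReplaceInXml are hypothetical names, and the search text is treated as a literal string rather than a regex pattern, which is an assumption on my part.

```csharp
using System.Security;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class RawXmlReplace
{
    // Replaces 'findText' (literal, case-insensitive) in an XML string,
    // XML-escaping 'replaceText' so the result stays well-formed.
    public static string ReplaceInXml(string xml, string findText, string replaceText)
    {
        // SecurityElement.Escape turns <, >, &, " and ' into XML entities.
        string safeReplacement = SecurityElement.Escape(replaceText);

        return Regex.Replace(
            xml,
            Regex.Escape(findText),
            _ => safeReplacement,
            RegexOptions.IgnoreCase);
    }
}
```

In the loop above, the call would become docText = RawXmlReplace.ReplaceInXml(docText, t.Key, t.Value);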

Answer 4

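I was testing this for document generation, but my placeholders were split across runs and text nodes. I didn't want to load the whole document as a single string for a regex find/replace, so I used the OpenXml API. My idea was to:

1. Clean up the placeholder nodes as a one-off operation on the document.
2. Find/replace by node value on each generation, now that the source is clean.

Testing showed that placeholders were split across runs and text nodes, but not across paragraphs. I also found that consecutive placeholders did not share text nodes, so I did not handle that case. Placeholders follow the pattern {{placeholder_name}}.

First, I needed to get all the text nodes in a paragraph (as per @sertsedat):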

    var nodes = paragraph.Descendants<Text>();

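Testing showed that this function preserves order, which was perfect for my use case, because I could iterate over the collection looking for start/stop indicators and group the nodes that belonged to my placeholder.

The grouping function looks for {{ and }} in the text node values to identify the nodes that are part of a placeholder and should be removed, and the other nodes that should be ignored.

Once the start of a placeholder is found, all subsequent nodes up to and including the terminating one need to be removed (they are marked by adding them to a TextNodes list). The values of those nodes are accumulated in a placeholder StringBuilder, and any text in the first/last node that is not part of the placeholder needs to be saved as well (hence the string properties). Any incomplete group, whether detected when a new placeholder is found or at the end of the sequence, should raise an error.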
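The grouping function itself is not shown in this answer. The following is only a sketch of the logic described above, under a few assumptions: it works on the node values as plain strings (the indexes map back to the Text nodes), it assumes "{{" and "}}" are never themselves split across two nodes, and, like the answer, it does not handle two placeholders sharing a node. PlaceholderGroup and PlaceholderGrouper are hypothetical names, and all properties here are strings, which may differ from the answer's own types.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical result type: one entry per placeholder found.
public sealed class PlaceholderGroup
{
    public int FirstIndex { get; set; }          // node containing "{{"
    public int LastIndex { get; set; }           // node containing "}}"
    public string PrecedingText { get; set; }    // text before "{{", or null
    public string PlaceholderText { get; set; }  // the joined "{{name}}"
    public string SubsequentText { get; set; }   // text after "}}", or null
}

public static class PlaceholderGrouper
{
    public static List<PlaceholderGroup> Group(IReadOnlyList<string> nodeTexts)
    {
        var groups = new List<PlaceholderGroup>();
        var buffer = new StringBuilder();
        PlaceholderGroup current = null;

        for (int i = 0; i < nodeTexts.Count; i++)
        {
            string value = nodeTexts[i] ?? "";
            int start = value.IndexOf("{{", StringComparison.Ordinal);

            if (current == null)
            {
                if (start < 0) continue; // ordinary node, ignore it
                current = new PlaceholderGroup
                {
                    FirstIndex = i,
                    PrecedingText = start > 0 ? value.Substring(0, start) : null
                };
                buffer.Clear();
                value = value.Substring(start); // keep only the placeholder part
            }
            else if (start >= 0)
            {
                throw new FormatException("New placeholder started before the previous one ended.");
            }

            int end = value.IndexOf("}}", StringComparison.Ordinal);
            if (end < 0)
            {
                buffer.Append(value); // still inside the placeholder
                continue;
            }

            buffer.Append(value, 0, end + 2);
            current.LastIndex = i;
            current.PlaceholderText = buffer.ToString();
            current.SubsequentText = end + 2 < value.Length ? value.Substring(end + 2) : null;
            groups.Add(current);
            current = null;
        }

        if (current != null)
            throw new FormatException("Placeholder was not terminated.");
        return groups;
    }
}
```

With the node values taken from paragraph.Descendants<Text>() in order, the nodes from FirstIndex to LastIndex are the ones to remove, and the three strings are what gets written back in their place.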

Finally, I used the groups to update the original document:

foreach (var placeholder in GroupPlaceholders(paragraph.Descendants<Text>()))
{
    var firstTextNode = placeholder.TextNodes[0];
    if (placeholder.PrecedingText != null)
    {
        firstTextNode.Parent.InsertBefore(new Text(placeholder.PrecedingText), firstTextNode);
    }
    firstTextNode.Parent.InsertBefore(placeholder.PlaceholderText, firstTextNode);
    if (placeholder.SubsequentText != null)
    {
        firstTextNode.Parent.InsertBefore(new Text(placeholder.SubsequentText), firstTextNode);
    }
    foreach (var textNode in placeholder.TextNodes) {
        textNode.Remove();                      
    }
}

Answer 5

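If the text you are looking for is enclosed in brackets, Word splits it across multiple runs...

SearchIn is the sequence of Text elements to search (IEnumerable(Of Text)):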

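// Assumes SearchIn is an indexable list of Text elements,
// e.g. document.Descendants<Text>().ToList().
for (int i = 0; i + 2 < SearchIn.Count; i++)
{
    Text TXT = SearchIn[i];
    Text TXT1 = SearchIn[i + 1];
    Text TXT2 = SearchIn[i + 2];

    // "[", the tag name and "]" ended up in three consecutive nodes:
    // merge them into the middle node and empty the other two.
    if (TXT.Text.Trim() == "[" && TXT2.Text.Trim() == "]")
    {
        TXT1.Text = TXT.Text + TXT1.Text + TXT2.Text;

        TXT.Text = "";
        TXT2.Text = "";
    }
}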

Answer 6

Here is a solution that finds and replaces tags in an Open XML (Word) document across text runs (including text boxes):

namespace Demo
{
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using System.Text.RegularExpressions;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    public class WordDocumentHelper
    {
        class DocumentTag
        {
            public DocumentTag()
            {
                ReplacementText = "";
            }

            public string Tag { get; set; }
            public string Table { get; set; }
            public string Column { get; set; }
            public string ReplacementText { get; set; }

            public override string ToString()
            {
                return ReplacementText ?? (Tag ?? "");
            }
        }

        private const string TAG_PATTERN = @"\[(.*?)[\.|\:](.*?)\]";
        private const string TAG_START = @"[";
        private const string TAG_END = @"]";

        /// <summary>
        /// Clones a document template into the temp folder and returns the newly created clone temp filename and path.
        /// </summary>
        /// <param name="templatePath"></param>
        /// <returns></returns>
        public string CloneTemplateForEditing(string templatePath)
        {
            var tempFile = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName()) + Path.GetExtension(templatePath);
            File.Copy(templatePath, tempFile);
            return tempFile;
        }

        /// <summary>
        /// Opens a given filename, replaces tags, and saves. 
        /// </summary>
        /// <param name="filename"></param>
        /// <returns>Number of tags found</returns>
        public int FindAndReplaceTags(string filename)
        {
            var allTags = new List<DocumentTag>();

            using (WordprocessingDocument doc = WordprocessingDocument.Open(path: filename, isEditable: true))
            {
                var document = doc.MainDocumentPart.Document;

                // text may be split across multiple text runs so keep a collection of text objects
                List<Text> tagParts = new List<Text>();

                foreach (var text in document.Descendants<Text>())
                {
                    // search for any fully formed tags in this text run
                    var fullTags = GetTags(text.Text);

                    // replace values for fully formed tags
                    fullTags.ForEach(t => {
                        t = GetTagReplacementValue(t);
                        text.Text = text.Text.Replace(t.Tag, t.ReplacementText);
                        allTags.Add(t);
                    });

                    // continue working on current partial tag
                    if (tagParts.Count > 0)
                    {
                        // working on a tag
                        var joinText = string.Join("", tagParts.Select(x => x.Text)) + text.Text;

                        // see if tag ends with this block
                        if (joinText.Contains(TAG_END))
                        {
                            var joinTag = GetTags(joinText).FirstOrDefault(); // should be just one tag (or none)
                            if (joinTag == null)
                            {
                                throw new Exception($"Misformed document tag in block '{string.Join("", tagParts.Select(x => x.Text)) + text.Text}' ");
                            }

                            joinTag = GetTagReplacementValue(joinTag);
                            allTags.Add(joinTag);

                            // replace first text run in the tagParts set with the replacement value. 
                            // (This means the formatting used on the first character of the tag will be used)
                            var firstRun = tagParts.First();
                            firstRun.Text = firstRun.Text.Substring(0, firstRun.Text.LastIndexOf(TAG_START));
                            firstRun.Text += joinTag.ReplacementText;

                            // replace trailing text runs with empty strings
                            tagParts.Skip(1).ToList().ForEach(x => x.Text = "");

                            // replace all text up to and including the first index of TAG_END
                            text.Text = text.Text.Substring(text.Text.IndexOf(TAG_END) + 1);

                            // empty the tagParts list so we can start on a new tag
                            tagParts.Clear();
                        }
                        else
                        {
                            // no tag end so keep getting text runs
                            tagParts.Add(text);
                        }
                    }

                    // search for new partial tags
                    if (text.Text.Contains("["))
                    {
                        if (tagParts.Any())
                        {
                            throw new Exception($"Misformed document tag in block '{string.Join("", tagParts.Select(x => x.Text)) + text.Text}' ");
                        }
                        tagParts.Add(text);
                        continue;
                    }

                }

                // save the temp doc before closing
                doc.Save();
            }

            return allTags.Count;
        }

        /// <summary>
        /// Gets a unique set of document tags found in the passed fileText using Regex
        /// </summary>
        /// <param name="fileText"></param>
        /// <returns></returns>
        private List<DocumentTag> GetTags(string fileText)
        {
            List<DocumentTag> tags = new List<DocumentTag>();

            if (string.IsNullOrWhiteSpace(fileText))
            {
                return tags;
            }

            // TODO: custom regex for tag matching 
            // this example looks for tags in the formation "[table.column]" or "[table:column]" and captures the full tag, "table", and "column" into match Groups
            MatchCollection matches = Regex.Matches(fileText, TAG_PATTERN);
            foreach (Match match in matches)
            {
                try
                {

                    if (match.Groups.Count < 3
                        || string.IsNullOrWhiteSpace(match.Groups[0].Value)
                        || string.IsNullOrWhiteSpace(match.Groups[1].Value)
                        || string.IsNullOrWhiteSpace(match.Groups[2].Value))
                    {
                        continue;
                    }

                    tags.Add(new DocumentTag
                    {
                        Tag = match.Groups[0].Value,
                        Table = match.Groups[1].Value,
                        Column = match.Groups[2].Value
                    });
                }
                catch
                {

                }
            }

            return tags;
        }

        /// <summary>
        /// Sets the replacement value of the passed tag
        /// </summary>
        /// <returns></returns>
        private DocumentTag GetTagReplacementValue(DocumentTag tag)
        {
            // TODO: custom routine to update tag Replacement Value

            tag.ReplacementText = "foobar";

            return tag;
        }
    }
}

Answer 7

Dim doc As WordprocessingDocument = WordprocessingDocument.Open("Chemin", True, New OpenSettings With {.AutoSave = True})

Dim d As Document = doc.MainDocumentPart.Document

Dim txt As Text = d.Descendants(Of Text).Where(Function(t) t.Text = "txtNom").FirstOrDefault

If txt IsNot Nothing Then
 txt.Text = txt.Text.Replace("txtNom", "YASSINE OULARBI")
End If

doc.Close()

Answer 8

My class replaces long phrases in a Word document, where Word has split the phrase into different text blocks:

The class itself:

using System.Collections.Generic;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

namespace WebBackLibrary.Service
{
    public class WordDocumentService
    {
        private class WordMatchedPhrase
        {
            public int charStartInFirstPar { get; set; }
            public int charEndInLastPar { get; set; }

            public int firstCharParOccurance { get; set; }
            public int lastCharParOccurance { get; set; }
        }

        public WordprocessingDocument ReplaceStringInWordDocumennt(WordprocessingDocument wordprocessingDocument, string replaceWhat, string replaceFor)
        {
            List<WordMatchedPhrase> matchedPhrases = FindWordMatchedPhrases(wordprocessingDocument, replaceWhat);

            Document document = wordprocessingDocument.MainDocumentPart.Document;
            int i = 0;
            bool isInPhrase = false;
            bool isInEndOfPhrase = false;
            foreach (Text text in document.Descendants<Text>()) // <<< Here
            {
                char[] textChars = text.Text.ToCharArray();
                List<WordMatchedPhrase> curParPhrases = matchedPhrases.FindAll(a => (a.firstCharParOccurance.Equals(i) || a.lastCharParOccurance.Equals(i)));
                StringBuilder outStringBuilder = new StringBuilder();
                
                for (int c = 0; c < textChars.Length; c++)
                {
                    if (isInEndOfPhrase)
                    {
                        isInPhrase = false;
                        isInEndOfPhrase = false;
                    }

                    foreach (var parPhrase in curParPhrases)
                    {
                        if (c == parPhrase.charStartInFirstPar && i == parPhrase.firstCharParOccurance)
                        {
                            outStringBuilder.Append(replaceFor);
                            isInPhrase = true;
                        }
                        if (c == parPhrase.charEndInLastPar && i == parPhrase.lastCharParOccurance)
                        {
                            isInEndOfPhrase = true;
                        }

                    }
                    if (isInPhrase == false && isInEndOfPhrase == false)
                    {
                        outStringBuilder.Append(textChars[c]);
                    }
                }
                text.Text = outStringBuilder.ToString();
                i = i + 1;
            }

            return wordprocessingDocument;
        }

        private List<WordMatchedPhrase> FindWordMatchedPhrases(WordprocessingDocument wordprocessingDocument, string replaceWhat)
        {
            char[] replaceWhatChars = replaceWhat.ToCharArray();
            int overlapsRequired = replaceWhatChars.Length;
            int overlapsFound = 0;
            int currentChar = 0;
            int firstCharParOccurance = 0;
            int lastCharParOccurance = 0;
            int startChar = 0;
            int endChar = 0;
            List<WordMatchedPhrase> wordMatchedPhrases = new List<WordMatchedPhrase>();
            //
            Document document = wordprocessingDocument.MainDocumentPart.Document;
            int i = 0;
            foreach (Text text in document.Descendants<Text>()) // <<< Here
            {
                char[] textChars = text.Text.ToCharArray();
                for (int c = 0; c < textChars.Length; c++)
                {
                    char compareToChar = replaceWhatChars[currentChar];
                    if (textChars[c] == compareToChar)
                    {
                        currentChar = currentChar + 1;
                        if (currentChar == 1)
                        {
                            startChar = c;
                            firstCharParOccurance = i;
                        }
                        if (currentChar == overlapsRequired)
                        {
                            endChar = c;
                            lastCharParOccurance = i;
                            WordMatchedPhrase matchedPhrase = new WordMatchedPhrase
                            {
                                firstCharParOccurance = firstCharParOccurance,
                                lastCharParOccurance = lastCharParOccurance,
                                charEndInLastPar = endChar,
                                charStartInFirstPar = startChar
                            };
                            wordMatchedPhrases.Add(matchedPhrase);
                            currentChar = 0;
                        }
                    }
                    else
                    {
                        currentChar = 0;

                    }
                }
                i = i + 1;
            }

            return wordMatchedPhrases;

        }

    }
}

And an easy usage example:

public void EditWordDocument(UserContents userContents)
        {
            string filePath = Path.Combine(userContents.PathOnDisk, userContents.FileName);
            WordDocumentService wordDocumentService = new WordDocumentService();
            if (userContents.ContentType.Contains("word") && File.Exists(filePath))
            {
                string saveAs = "modifiedTechWord.docx";
                //
                using (WordprocessingDocument doc = WordprocessingDocument.Open(filePath, true)) //open source word file
                {
                    Document document = doc.MainDocumentPart.Document;
                    OpenXmlPackage res = doc.SaveAs(Path.Combine(userContents.PathOnDisk, saveAs)); // copy it
                    res.Close();
                }
                using (WordprocessingDocument doc = WordprocessingDocument.Open(Path.Combine(userContents.PathOnDisk, saveAs), true)) // open copy
                {
                    string replaceWhat = "{transform:CandidateFio}";
                    string replaceFor = "ReplaceToFio";
                    var result = wordDocumentService.ReplaceStringInWordDocumennt(doc, replaceWhat, replaceFor); //replace words in copy
                }
            }
        }

Answer 9

The easiest and most accurate way I have found so far is to use Open-Xml-PowerTools. Personally, I use dotnet core, so I use this NuGet package.

using OpenXmlPowerTools;
// ...

protected byte[] SearchAndReplace(byte[] file, IDictionary<string, string> translations)
{
    WmlDocument doc = new WmlDocument(file.Length.ToString(), file);

    foreach (var translation in translations)
        doc = doc.SearchAndReplace(translation.Key, translation.Value, true);

    return doc.DocumentByteArray;
}

Usage example:

var templateDoc = File.ReadAllBytes("templateDoc.docx");
var generatedDoc = SearchAndReplace(templateDoc, new Dictionary<string, string>(){
    {"text-to-replace-1", "replaced-text-1"},
    {"text-to-replace-2", "replaced-text-2"},
});
File.WriteAllBytes("generatedDoc.docx", generatedDoc);

For more information, see Search and Replace Text in an Open XML WordprocessingML Document

Answer 10

Here is the solution from MSDN.

The example from there:

public static void SearchAndReplace(string document)
{
    using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(document, true))
    {
        string docText = null;
        using (StreamReader sr = new StreamReader(wordDoc.MainDocumentPart.GetStream()))
        {
            docText = sr.ReadToEnd();
        }

        Regex regexText = new Regex("Hello world!");
        docText = regexText.Replace(docText, "Hi Everyone!");

        using (StreamWriter sw = new StreamWriter(wordDoc.MainDocumentPart.GetStream(FileMode.Create)))
        {
            sw.Write(docText);
        }
    }
}

Replace text in a Word document using Open XML
