Finding special text in a Word document using C#

Time: 2020-07-30 17:18:39

Tags: c# ms-word

I am trying to find all text of the form "C" + number + (":" or ".")

For example: "C 1", "C 2", "C 3", ...

in a Word document.

Application wordApp = new Application();
Document wordDoc = wordApp.Documents.Open(inputFileO);

Range rngFind = wordDoc.Range();
string regex = "C [0 - 9]{ 1,3}[:.]";
while (rngFind.Find.Execute(regex))
{
    //and show what i have found here
}

But it does not work... How can I find them using C# Word interop?

1 answer:

Answer 0: (score: 1)

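If the values you want to retrieve have the following format:

For example: "C 1", "C 2", "C 3", ...

then your regex will never match, because of the spaces inside [0 - 9] and { 1,3}. Try the following instead: C [0-9]{1,3}[:.]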
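To see the difference between the two patterns without opening Word, both can be run against a sample string with .NET's Regex class. This is only a minimal sketch: the class name PatternCheck and the sample text are made up for illustration. As far as .NET's regex syntax goes, `[0 - 9]` is a character class holding '0', a space and '9', and `{ 1,3}` is not read as a quantifier but as literal text, which is why the spaced version fails on "C 134:".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answer.
public static class PatternCheck
{
    // Pattern as written in the question (with the extra spaces).
    public const string SpacedPattern = "C [0 - 9]{ 1,3}[:.]";

    // Pattern suggested in the answer.
    public const string FixedPattern = "C [0-9]{1,3}[:.]";

    // True when the pattern matches anywhere in the text.
    public static bool Matches(string pattern, string text)
    {
        return Regex.IsMatch(text, pattern);
    }

    public static void Main()
    {
        const string sample = "C 134:";
        Console.WriteLine(Matches(SpacedPattern, sample)); // False
        Console.WriteLine(Matches(FixedPattern, sample));  // True
    }
}
```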

Note: I created a Word document with the following lines:

C 134:
C 1:
C 155:

This code opens the Word document and applies the regex:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using Microsoft.Office.Interop.Word;
...
    static void Main(string[] args)
    {
        //declare list to store lines of data from the docx file.
        List<string> data = new List<string>();
        Application app = new Application();
        
        Document doc = app.Documents.Open(ref inputFileO);
        //loop through the paragraphs in the docx file and store contents in the list.
        foreach (Paragraph objParagraph in doc.Paragraphs)
            data.Add(objParagraph.Range.Text.Trim());

        ((_Document)doc).Close();
        ((_Application)app).Quit();

        string regex = "C [0-9]{1,3}[:.]";
        //loop through the lines, write matches to console.
        //added counter
        int lineCNT = 1;
        foreach (string line in data)
        {
            Match match = Regex.Match(line, regex);
            if (match.Success)
            {
                string key = match.Groups[0].Value;
                Console.WriteLine(key + " on line " + lineCNT);
            }
            lineCNT++;
        }
        Console.Read();
    }

The console output is:

C 134: on line 1
C 1: on line 2
C 155: on line 3
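The matching step can also be separated from the Word interop part, which makes it easier to try out on plain strings. The sketch below is an illustration only: LabelFinder and FindLabels are hypothetical names, and unlike the code above it uses Regex.Matches so that several labels on the same line are all reported.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class LabelFinder
{
    private static readonly Regex LabelPattern = new Regex(@"C [0-9]{1,3}[:.]");

    // Returns "<match> on line <n>" for every match, using 1-based line numbers.
    public static List<string> FindLabels(IEnumerable<string> lines)
    {
        var results = new List<string>();
        int lineNumber = 1;
        foreach (string line in lines)
        {
            // Matches returns every non-overlapping match in the line.
            foreach (Match m in LabelPattern.Matches(line))
            {
                results.Add(m.Value + " on line " + lineNumber);
            }
            lineNumber++;
        }
        return results;
    }

    public static void Main()
    {
        // Same sample lines as the test document described above.
        var sample = new[] { "C 134:", "C 1:", "C 155:" };
        foreach (string hit in FindLabels(sample))
        {
            Console.WriteLine(hit);
        }
    }
}
```

In the interop code, the list of trimmed paragraph texts could be passed to FindLabels in place of the sample array.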

See its matches here: regexr.com/59cfp
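If the search has to run inside Word itself, as in the question, note that Find.Execute does not take a .NET regex. Word has its own wildcard syntax, enabled through Find.MatchWildcards. The following is a rough sketch under some assumptions: WildcardFind and PrintMatches are hypothetical names, path stands for the document's full path, and the `{1,3}` repeat count assumes Word uses a comma as list separator (some regional settings may need `{1;3}` instead). It has not been checked against the asker's document.

```csharp
using System;
using Microsoft.Office.Interop.Word;

// Hypothetical helper, not part of the original answer.
public static class WildcardFind
{
    // path is an assumed argument: the full path of the document to search.
    public static void PrintMatches(string path)
    {
        Application app = new Application();
        Document doc = null;
        try
        {
            doc = app.Documents.Open(path);
            Range rng = doc.Content;

            Find find = rng.Find;
            find.ClearFormatting();
            find.Text = "C [0-9]{1,3}[:.]";     // Word wildcard pattern, not a .NET regex
            find.MatchWildcards = true;
            find.Forward = true;
            find.Wrap = WdFindWrap.wdFindStop;  // stop at the end of the document

            // On success, rng is redefined to cover the text that was found.
            while (find.Execute())
            {
                Console.WriteLine(rng.Text);
                // Continue searching after the current match.
                rng.Collapse(WdCollapseDirection.wdCollapseEnd);
            }
        }
        finally
        {
            if (doc != null) ((_Document)doc).Close(false); // do not save changes
            ((_Application)app).Quit();
        }
    }
}
```

Compared with reading every paragraph into a list, this keeps each match as a Range, which may be handy if the found text needs to be highlighted or replaced in place.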