C#: Extracting text from an .XPS document

Date: 2017-01-29 19:03:28

Tags: c# c#-4.0

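I have been using another StackOverflow answer to this question as a reference for solving this problem, but I have run into an issue: I get an error on FixedDocumentSequence saying that it cannot be found. I have already added references to PresentationCore, PresentationFramework, WindowsBase, and ReachFramework, and I'm not sure whether I need to add another reference for FixedDocumentSequence.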

Here is my code:

[code block not preserved in the source]

1 answer:

Answer 0 (score: 0)

using System;
using System.IO;
using System.Text;
using System.Windows.Documents;      // FixedDocumentSequence (PresentationFramework.dll)
using System.Windows.Xps.Packaging;  // XpsDocument, IXps*Reader (ReachFramework.dll)
using System.Xml;

class Program
{
    [STAThread]
    static void Main(string[] args)
    {
        try
        {
            XpsDocument _xpsDocument = new XpsDocument(@"C:\Users\admin-\Desktop\testing.xps", FileAccess.Read);
            IXpsFixedDocumentSequenceReader fixedDocSeqReader = _xpsDocument.FixedDocumentSequenceReader;
            IXpsFixedDocumentReader _document = fixedDocSeqReader.FixedDocuments[0];
            FixedDocumentSequence sequence = _xpsDocument.GetFixedDocumentSequence();
            StringBuilder _fullPageText = new StringBuilder();

            // Assumes the file holds a single FixedDocument, so the sequence's
            // page count matches the number of pages in _document.
            for (int pageCount = 0; pageCount < sequence.DocumentPaginator.PageCount; ++pageCount)
            {
                IXpsFixedPageReader _page = _document.FixedPages[pageCount];
                XmlReader _pageContentReader = _page.XmlReader;
                if (_pageContentReader == null)
                    continue;

                while (_pageContentReader.Read())
                {
                    // Text runs are stored in the UnicodeString attribute of Glyphs elements.
                    if (_pageContentReader.Name == "Glyphs" && _pageContentReader.HasAttributes)
                    {
                        string unicodeString = _pageContentReader.GetAttribute("UnicodeString");
                        if (unicodeString != null)
                            _fullPageText.Append(unicodeString);
                    }
                }
            }

            _xpsDocument.Close();
            Console.WriteLine(_fullPageText.ToString());
        }
        catch (Exception e)
        {
            // Report the failure instead of silently swallowing it.
            Console.Error.WriteLine(e);
        }
    }
}
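As a rough sketch, the inner reading loop from the code above can be tried on its own, without an XPS file or the WPF assemblies. The GlyphText class, its Extract method, and the sample markup are hypothetical names and data made up for illustration; only XmlReader and the Glyphs / UnicodeString names come from the answer.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of any library.
static class GlyphText
{
    // Concatenates the UnicodeString attribute of every Glyphs element.
    public static string Extract(string fixedPageXml)
    {
        var text = new StringBuilder();
        using (XmlReader reader = XmlReader.Create(new StringReader(fixedPageXml)))
        {
            while (reader.Read())
            {
                // Only start tags carry attributes; skip end tags and other nodes.
                if (reader.NodeType != XmlNodeType.Element || reader.LocalName != "Glyphs")
                    continue;

                string unicode = reader.GetAttribute("UnicodeString");
                if (unicode != null)
                    text.Append(unicode);
            }
        }
        return text.ToString();
    }
}

static class Program
{
    static void Main()
    {
        // Made-up, heavily trimmed page markup; real pages carry many more attributes.
        const string samplePage =
            "<FixedPage xmlns=\"http://schemas.microsoft.com/xps/2005/06\">" +
            "<Glyphs UnicodeString=\"Hello, \" />" +
            "<Path Data=\"M 0,0 L 10,10\" />" +
            "<Glyphs UnicodeString=\"XPS\" />" +
            "</FixedPage>";

        Console.WriteLine(GlyphText.Extract(samplePage)); // prints "Hello, XPS"
    }
}
```

Checking NodeType as well as the element name makes it explicit that only opening Glyphs tags are inspected; the original loop reaches the same result because GetAttribute returns null on end tags.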

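I don't think the code needs to change much. Try adding [STAThread], which is what let me read the XPS file; I used only the references listed above. I ran into the same error you did and managed to resolve it, so you were already about 90% of the way there.
Also add using System.Windows.Documents; since that is the namespace FixedDocumentSequence lives in.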
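If FixedDocumentSequence is needed only for the page count, one possible variation is to walk the reader collections directly, which also covers files that contain more than one FixedDocument. This is a sketch under assumptions, not the answerer's method: the command-line argument and the program name in the usage message are hypothetical, and it is assumed that the project references ReachFramework, WindowsBase, and PresentationCore as listed in the question.

```csharp
using System;
using System.IO;
using System.Text;
using System.Windows.Xps.Packaging; // XpsDocument and the IXps*Reader interfaces (ReachFramework.dll)
using System.Xml;

static class Program
{
    [STAThread] // kept from the answer; possibly unnecessary here since no WPF document objects are created (assumption)
    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: XpsText <file.xps>"); // hypothetical program name
            return;
        }

        var text = new StringBuilder();
        XpsDocument xps = new XpsDocument(args[0], FileAccess.Read);
        try
        {
            IXpsFixedDocumentSequenceReader sequenceReader = xps.FixedDocumentSequenceReader;
            if (sequenceReader == null)
                return; // nothing to read

            // Visit every document and every page instead of only FixedDocuments[0].
            foreach (IXpsFixedDocumentReader doc in sequenceReader.FixedDocuments)
            {
                foreach (IXpsFixedPageReader page in doc.FixedPages)
                {
                    XmlReader reader = page.XmlReader;
                    if (reader == null)
                        continue;

                    while (reader.Read())
                    {
                        if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "Glyphs")
                        {
                            string unicode = reader.GetAttribute("UnicodeString");
                            if (unicode != null)
                                text.AppendLine(unicode); // one text run per line
                        }
                    }
                }
            }
        }
        finally
        {
            xps.Close(); // release the package even if reading fails
        }

        Console.WriteLine(text);
    }
}
```

Unlike the answer's code, this writes each text run on its own line; switching AppendLine back to Append reproduces the original concatenated output.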