Getting positions in a multi-page MS Word document

Date: 2016-10-17 08:34:52

Tags: ms-word

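I have a multi-page Word document, and my goal is a custom page-number format such as '19-x', where x can be 1, 2, 5, and so on. Word's built-in page field does not give me what I want. So my solution is to draw a shape containing text such as '19' or even '19-1', which I insert into the first paragraph of each page through Open XML: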

<w:pict>
                <v:shapetype id="_x0000_t202" coordsize="21600,21600" o:spt="202" path="m,l,21600r21600,l21600,xe">
                    <v:stroke joinstyle="miter" />
                    <v:path gradientshapeok="t" o:connecttype="rect" />
                </v:shapetype>
                <v:shape id="_x0000_s2050" type="#_x0000_t202" style="position:absolute;left:0;text-align:left;margin-left:-1in;margin-top:708pt;width:39.75pt;height:12.75pt;z-index:251658240;v-text-anchor:middle" filled="f">
                    <v:textbox inset="0,0,0,0">
                        <w:txbxContent>
                            <w:p w:rsidR="00172BF9" w:rsidRDefault="00172BF9" w:rsidP="00172BF9">
                                <w:r>
                                    <w:rPr>
                                        <w:rFonts w:hint="eastAsia" />
                                    </w:rPr>
                                    <w:t>19-1</w:t>
                                </w:r>
                            </w:p>
                        </w:txbxContent>
                    </v:textbox>
                </v:shape>
            </w:pict>

What I want to do is set the margin-top property in the style attribute of the <v:shape/> element. So far I have the position of the footer; call it fieldParaRangeTop:

HeaderFooter footer = currentDocument.Sections[1].Footers[WdHeaderFooterIndex.wdHeaderFooterPrimary];
Range footerRange = footer.Range;
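int left, top, width, height; // screen pixels
currentDocument.ActiveWindow.GetPoint(out left, out top, out width, out height, footerRange);
// Pixels to points, assuming 96 DPI (1 px = 0.75 pt); float math avoids integer truncation.
float fieldParaRangeTop = top * 3f / 4f; // pt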
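
The 3/4 factor in the snippet above is a pixel-to-point conversion: a point is 1/72 inch, so at 96 pixels per inch one pixel is 72/96 = 0.75 pt. A minimal sketch of that conversion follows, assuming 96 DPI as the default. The class and method names (UnitConversion, PixelsToPoints) are hypothetical helpers, not part of the Word API, and the arithmetic is done in floating point so the fractional part is kept.

```csharp
using System;

public static class UnitConversion
{
    // Converts a pixel measurement to points (1 pt = 1/72 inch).
    // The dpi default of 96 is an assumption; it yields the 3/4 factor
    // used in the question. Pass the real DPI if the display differs.
    public static float PixelsToPoints(int pixels, float dpi = 96f)
    {
        return pixels * 72f / dpi;
    }
}
```

For example, UnitConversion.PixelsToPoints(139) gives 104.25, whereas the integer expression 139 * 3 / 4 evaluates to 104.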

Now I need the position of the first paragraph on each page, so that I can subtract it from fieldParaRangeTop:

Paragraph firstParaEachPage = ...; // first paragraph of each page
// left, top, width and height are the int variables declared above
currentDocument.ActiveWindow.GetPoint(out left, out top, out width, out height, firstParaEachPage.Range);
float fpTop = top * 3f / 4f; // pt, assuming 96 DPI

The value of margin-top is then:

value = fieldParaRangeTop - fpTop;

This is where the new page number is displayed, relative to the first paragraph of each page.
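
As an illustration of how this value could end up in the shape's style attribute, here is a small sketch. ShapeStyle, MarginTop and WithMarginTop are hypothetical names, the style string is a shortened version of the one in the XML above, and the input of 812 pt for the footer is an invented number chosen only so that the result matches the 708pt seen in that XML.

```csharp
using System;
using System.Globalization;

public static class ShapeStyle
{
    // margin-top = footer top minus first-paragraph top, both in points.
    public static float MarginTop(float fieldParaRangeTop, float fpTop)
    {
        return fieldParaRangeTop - fpTop;
    }

    // Builds a style string with the given margin-top. InvariantCulture keeps
    // the decimal separator a '.' regardless of the machine's locale.
    public static string WithMarginTop(float marginTopPt)
    {
        string value = marginTopPt.ToString("0.##", CultureInfo.InvariantCulture);
        return "position:absolute;margin-left:-1in;margin-top:" + value
             + "pt;width:39.75pt;height:12.75pt";
    }
}
```

With the hypothetical inputs, ShapeStyle.WithMarginTop(ShapeStyle.MarginTop(812f, 104f)) yields a string containing margin-top:708pt.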

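The result for the first paragraph on the first page is fpTop = 104pt, but on the second page it is fpTop = 178pt.

My question: the paragraph on the first page and the one on the second page are both the first paragraph of their page, so why does currentDocument.ActiveWindow.GetPoint(out left, out top, out width, out height, paragraph.Range) return different top values? How can I get the correct value?

Thanks!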

  

Code example

using System;
using System.Text;
using Microsoft.Office.Interop.Word;

public class Example
{
    public void TestLocation(){
        object missingObj = Type.Missing;
        object boolFalse = false;
        object boolValue = true;
        Application application = null;
        Document currentDocument = null;
        try{
            application = new Application();
            application.Visible = false;
            object path =System.IO.Path.Combine(System.Windows.Forms.Application.StartupPath, "test.docx");
            // open an *.docx with several pages
            currentDocument = application.Documents.Open(ref path,ref boolFalse,ref boolValue,ref boolFalse,
                                                         ref missingObj,ref missingObj,ref boolFalse,
                                                         ref missingObj,ref missingObj,ref missingObj,ref missingObj,ref boolValue,
                                                         ref missingObj, ref missingObj,ref boolFalse,ref missingObj);

            // just consider the first section
            Section currentSection = currentDocument.Sections[1];
            HeadersFooters footers = currentSection.Footers;

            // A page field must already exist in the footer, so that its
            // position can be used to place the new page number: a shape
            // that covers the page field, with the format '19' or '19-1'.
            Field f = GetFieldInFooter(currentSection.Footers);
            if(f == null){
                Console.WriteLine("The page field has not been set yet!");
                return ;
            }
            // the location of the page field in the word
            Location point = GetPositionOfRange(currentDocument,f.Result);
            float fieldParaRangeTop =  point.Y ;// in pt
            float fieldLeft = point.X ;// in pt

            // the first paragraph of the document, which is also the first paragraph on the first page
            Range firstParaRange = currentSection.Range.Paragraphs[1].Range;
            point = GetPositionOfRange(currentDocument,firstParaRange);
            float firstParaTop = point.Y ;// in pt
            float firstParaLeft = point.X ;// in pt

            //
            // The loop below finds the first paragraph of each page
            // and calculates its position.
            //
            object unit = WdUnits.wdParagraph;
            object count = 1;
            StringBuilder text = new StringBuilder();
            string format = "location of the page number in page {0} is ({1},{2})";
            // value is what gets written to the margin-top property of the style attribute of the <v:shape/> element
            float value = fieldParaRangeTop - firstParaTop;
            text.AppendLine(string.Format(format,1,(fieldLeft - firstParaLeft ),value));
            int pageNumber = 1;
            Console.WriteLine(">>>Count of paragraphs in this section: "+currentSection.Range.Paragraphs.Count);
            for(int i = 2;i<= currentSection.Range.Paragraphs.Count;i++){
                Paragraph para = currentSection.Range.Paragraphs[i];
                if(this.IsRangeOfPage(para.Range,pageNumber))
                    continue;
                pageNumber++;
                if(!this.IsRangeInTheSamePage(para.Range) && i < currentSection.Range.Paragraphs.Count){
                    // if para spans two pages, take its next sibling
                    // as the first paragraph of this page
                    // (the bounds check avoids indexing past the last paragraph)
                    para = currentSection.Range.Paragraphs[i+1];
                }
                object start = true;
                currentDocument.ActiveWindow.ScrollIntoView(para.Range,start);
                point = GetPositionOfRange(currentDocument,para.Range);
                value = fieldParaRangeTop - point.Y;
                Console.WriteLine(string.Format(">>>the first paragraph of page {0} is \r\n{1}",pageNumber,para.Range.Text));
                text.AppendLine(string.Format(format,pageNumber,(fieldLeft - point.X),value));
            }
            Console.WriteLine(text.ToString());

            // other codes here for building an <w:pict/> element with the positions using OpenXMl dll

        } catch(Exception ee){
            Console.WriteLine(":::ERROR::::::"+ee.Message);
        }
        finally{
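            // currentDocument stays null if Documents.Open failed, so check it separately
            if(currentDocument != null){
                currentDocument.Close();
                currentDocument = null;
            }
            if(application != null){
                application.Quit();
                application = null;
            }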
        }
    }

    private Field GetFieldInFooter(HeadersFooters footers){
        if(footers == null || footers.Count == 0)
        {
            Console.WriteLine("There is no footer in current section.");
            return null;
        }
        Field f = null;
        foreach(HeaderFooter footer in footers){
            Range footerRange = footer.Range;
            if(footerRange.Fields.Count == 0)
                continue;

            foreach(Field field in footerRange.Fields){
                if(field.Type == WdFieldType.wdFieldPage)
                {
                    f = field;
                    break;
                }
            }
            if(f != null)
                break;
        }
        return f;
    }

    private Location GetPositionOfRange(Document currentDocument,Range range){
        int left = 0;
        int top = 0;
        int width = 0;
        int height = 0;
        currentDocument.ActiveWindow.GetPoint(out left,out top,out width,out height,range);
        // pixels to points, assuming 96 DPI; float literals avoid integer truncation
        return new Location(left * 3f / 4f, top * 3f / 4f, width * 3f / 4f, height * 3f / 4f);
    }

    /// <summary>
    /// Determines if the range is in the page <code>currentPage</code>
    /// </summary>
    /// <param name="range"></param>
    /// <param name="currentPage"></param>
    /// <returns></returns>
    private bool IsRangeOfPage(Range range,int currentPage){
        object s = range.Start,e = range.End;
        try{
            Range sr = range.Document.Range(ref s,ref s);
            Range se = range.Document.Range(ref e,ref e);
            int page1 = (int)sr.Information[WdInformation.wdActiveEndPageNumber];
            int page2 = (int)se.Information[WdInformation.wdActiveEndPageNumber];
            if(page1 == page2 && page1 == currentPage)
                return true;
            return false;
        }catch(Exception){return true;}
    }
    /// <summary>
    /// Determines whether the <code>range</code> lies on a single page, that is, whether both the <code>Start</code> and the <code>End</code> of the range
    /// are on the same page. Otherwise the range spans two pages.
    /// </summary>
    /// <param name="range"></param>
    /// <returns></returns>
    private bool IsRangeInTheSamePage(Range range){
        object s = range.Start,e = range.End;
        Range sr = range.Document.Range(ref s,ref s);
        Range er = range.Document.Range(ref e,ref e);
        int page1 = (int)sr.Information[WdInformation.wdActiveEndPageNumber];
        int page2 = (int)er.Information[WdInformation.wdActiveEndPageNumber];
        if(page1 == page2 )
            return true;
        return false;
    }

}

public class Location{

    public Location(float x,float y,float width, float height){
        this.X = x;
        this.Y = y;
        this.Width = width;
        this.Height = height;
    }

    public float X;
    public float Y;
    public float Width;
    public float Height;

    public override string ToString()
    {
        return string.Format("[Location X={0}, Y={1}, Width={2}, Height={3}]", X, Y, Width, Height);
    }

}

1 answer:

Answer 0 (score: 0)

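In fact, the position of the first paragraph can differ from page to page. After Window.ScrollIntoView is called on an object (for example a paragraph range), the object is scrolled and its position changes. Here is my new solution: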

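  1. Calculate the distance from the first paragraph on the first page to the footer range; call it defaultDistance: defaultDistance = fieldParaRangeTop - fpTop;. Normally this value is constant.
    The first paragraph on the first page looks like this: (figure: The first paragraph in the first page)
  2. On the other pages the first paragraph is not at the top-left corner, so we need to locate it.
    The first paragraph on other pages may look like this: (figure: The first paragraph in other pages)
    As the figure shows, we need to calculate the relative distance (call it relativeDistance) in step 4. This value may be 0, when the first line of the first paragraph really is the first line of the page. At this point we have a new distance (call it paraDistance): paraDistance = defaultDistance - relativeDistance. This is the position of the first paragraph relative to the footer, and therefore of the page-number shape relative to the first paragraph.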
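
The arithmetic in the steps above can be sketched as follows. PageNumberPlacement and ParaDistance are hypothetical names, and the numbers in the usage note are invented for illustration; how relativeDistance is actually measured is not shown here.

```csharp
using System;

public static class PageNumberPlacement
{
    // paraDistance = defaultDistance - relativeDistance, all in points.
    // relativeDistance is 0 when the first paragraph starts on the first
    // line of the page, in which case the default distance is used as is.
    public static float ParaDistance(float defaultDistance, float relativeDistance)
    {
        return defaultDistance - relativeDistance;
    }
}
```

For instance, with a hypothetical defaultDistance of 708 pt, a relativeDistance of 0 leaves 708 pt, while a relativeDistance of 74 pt gives 634 pt.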