Convert Doc to XML

Time: 2015-04-26 05:25:33

Tags: xml vb.net

I wrote a simple program in VB.NET that converts a .doc file to an XML file.

Dim app As Word.Application = New Word.Application
Dim doc As Word.Document = app.Documents.Open(txtFileName.Text)

Dim writer As New XmlTextWriter("product.xml", System.Text.Encoding.UTF8)
writer.WriteStartDocument(True)
writer.WriteStartElement("JUDGEMENT")
writer.Formatting = Formatting.Indented

For Each paragraph As Word.Paragraph In doc.Paragraphs
    paragraph.Next()
    writer.WriteStartElement("p")

    If (paragraph.Range.Font.Bold) Then
        writer.WriteStartElement("b")
        writer.WriteString(paragraph.Range.Text.Trim)
        writer.WriteString(paragraph.Range.Text)
        writer.WriteEndElement()
    Else
        writer.WriteString(paragraph.Range.Text)
    End If

    writer.WriteEndElement()
Next

writer.WriteEndElement()
writer.WriteEndDocument()
writer.Close()
app.Quit()

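The result looks like this. The problem is that the closing bold tag is not placed where the bold words end; it is placed at the end of the sentence.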

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<JUDGEMENT>
  <p>
    <b>Lorem Ipsum is simply dummy text of the printing and typesetting industry.</b>
  </p>
  <p>
    <b>Lorem Ipsum is simply dummy text of the printing and typesetting industry.</b>
  </p>
</JUDGEMENT>

But I need a result like this:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<JUDGEMENT>
  <p>
    <b>Lorem Ipsum </b>is simply dummy text of the printing and typesetting industry.
  </p>
  <p>
    <b>Lorem Ipsum </b>is simply dummy text of the printing and typesetting industry.
  </p>
</JUDGEMENT>

What do I need to add or change?

1 answer:

Answer 0 (score: 0)

It looks like you are writing the end tag at the end of the paragraph. Try it this way:

If (paragraph.Range.Font.Bold) Then
    writer.WriteStartElement("b")
    writer.WriteString(paragraph.Range.Text.Trim)
    writer.WriteEndElement()
    writer.WriteString(paragraph.Range.Text)
Else
    writer.WriteString(paragraph.Range.Text)
End If
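
Note that `paragraph.Range.Text` is always the text of the whole paragraph, so to wrap only the bold words the paragraph probably has to be split into runs of equal formatting first. The following is a minimal sketch of that idea, not a tested drop-in fix. It assumes that `Font.Bold` on a single-character range is -1 for bold and 0 otherwise (for a mixed range Word reports `wdUndefined`, which is nonzero and so may pass a plain `If` test), and that the project references `Microsoft.Office.Interop.Word` as in the question. The names `BoldRuns`, `WriteParagraphs`, `CharactersOf`, `WriteRuns` and `FlushRun` are made up for illustration.

```vbnet
Imports System.Collections.Generic
Imports System.Text
Imports System.Xml
Imports Microsoft.VisualBasic
Imports Word = Microsoft.Office.Interop.Word

Module BoldRuns
    ' Writes one <p> element per paragraph of the document.
    Sub WriteParagraphs(writer As XmlWriter, doc As Word.Document)
        For Each paragraph As Word.Paragraph In doc.Paragraphs
            writer.WriteStartElement("p")
            WriteRuns(writer, CharactersOf(paragraph))
            writer.WriteEndElement()
        Next
    End Sub

    ' Yields (text, isBold) pairs, one per character of the paragraph.
    ' Assumption: a single character reports Font.Bold as -1 (bold) or 0.
    Iterator Function CharactersOf(paragraph As Word.Paragraph) _
            As IEnumerable(Of KeyValuePair(Of String, Boolean))
        For Each ch As Word.Range In paragraph.Range.Characters
            ' Skip the paragraph mark that Word appends to every paragraph.
            If ch.Text = vbCr Then Continue For
            Yield New KeyValuePair(Of String, Boolean)(ch.Text, ch.Font.Bold = -1)
        Next
    End Function

    ' Groups consecutive pieces with the same bold flag into one run and
    ' wraps each bold run in a single <b> element.
    Sub WriteRuns(writer As XmlWriter,
                  pieces As IEnumerable(Of KeyValuePair(Of String, Boolean)))
        Dim run As New StringBuilder()
        Dim runIsBold As Boolean = False
        For Each piece As KeyValuePair(Of String, Boolean) In pieces
            If piece.Value <> runIsBold Then
                ' Formatting changed: emit what has been collected so far.
                FlushRun(writer, run, runIsBold)
                runIsBold = piece.Value
            End If
            run.Append(piece.Key)
        Next
        FlushRun(writer, run, runIsBold)
    End Sub

    ' Writes the collected text, as <b>...</b> if bold, then empties the buffer.
    Private Sub FlushRun(writer As XmlWriter, run As StringBuilder, isBold As Boolean)
        If run.Length = 0 Then Return
        If isBold Then
            writer.WriteElementString("b", run.ToString())
        Else
            writer.WriteString(run.ToString())
        End If
        run.Clear()
    End Sub
End Module
```

With this in place, the `For Each ... Next` loop in the question could be replaced by a single call to `BoldRuns.WriteParagraphs(writer, doc)`. Reading one character at a time through COM is likely to be slow on long documents; iterating `paragraph.Range.Words` instead should be faster, at the cost of mishandling a word that is only partly bold.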