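I want to do some templating with Go, and I want to strip some tags from an XML document such as the one Excel produces for a spreadsheet. The XML source looks like this: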
input := `<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Worksheet ss:Name="sheet1">
<Names>
<NamedRange ss:Name="_FilterDatabase" ss:RefersTo="=sheet!R3C1:R3C13"
ss:Hidden="1"/>
</Names>
<Table ss:ExpandedColumnCount="15" ss:ExpandedRowCount="7" x:FullColumns="1"
x:FullRows="1" ss:DefaultColumnWidth="52.8" ss:DefaultRowHeight="15.45">
<Column ss:AutoFitWidth="0" ss:Width="37.200000000000003"/>
<Column ss:AutoFitWidth="0" ss:Width="67.2"/>
<Column ss:AutoFitWidth="0" ss:Width="75.600000000000009"/>
<Column ss:AutoFitWidth="0" ss:Width="71.400000000000006"/>
<Row ss:AutoFitHeight="0">
<Cell ss:MergeAcross="12" ss:MergeDown="1" ss:StyleID="s63"><Data
ss:Type="String">This is a title of the sheet!</Data></Cell>
</Row>
<Row ss:AutoFitHeight="0">
<Cell ss:StyleID="s69"><Data ss:Type="String">{{range $prj:=.prj}}</Data></Cell>
<Cell ss:StyleID="s70"/>
<Cell ss:StyleID="s70"/>
<Cell ss:StyleID="s70"/>
</Row>
<Row ss:AutoFitHeight="0" ss:Height="45.449999999999996">
<Cell ss:StyleID="s72"/>
<Cell ss:StyleID="s70"><Data ss:Type="String">{{$prj.PrjName}}</Data></Cell>
<Cell ss:StyleID="s70"><Data ss:Type="String">{{$prj.ConstrDept}}</Data></Cell>
<Cell ss:StyleID="s71"><Data ss:Type="String">{{$prj.Assumer}}</Data></Cell>
<Cell ss:StyleID="s71"><Data ss:Type="String">{{$prj.ReplyNo}}</Data></Cell>
<Cell ss:StyleID="s71"><Data ss:Type="String">{{$prj.AnPingNo}}</Data></Cell>
</Row>
<Row ss:AutoFitHeight="0">
<Cell ss:StyleID="s73"><Data ss:Type="String">{{end}}</Data></Cell>
</Row>
</Table>
</Worksheet>
</Workbook>`
What I want is the following:
1
{{range $prj:=.prj}}
In this row, I only want to keep "{{range $prj:=.prj}}" and omit the surrounding "<Row>" markup.
2
<Row ss:AutoFitHeight="0">
<Cell ss:StyleID="s73"><Data ss:Type="String">{{end}}</Data></Cell>
</Row>
In this row, I only want to keep "{{end}}" and omit the surrounding "<Row>" markup.
Answer 0 (score: 1)
If we set aside nested {{}}, you can use the regular expression {{[^}]*?}} together with Regexp.FindAllString().
This example extracts the expected result:
re := regexp.MustCompile("{{[^}]*?}}")
res := re.FindAllString(input, -1)
for _, s := range res {
fmt.Println(s)
}
Output:
{{range $prj:=.prj}}
{{$prj.PrjName}}
{{$prj.ConstrDept}}
{{$prj.Assumer}}
...
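But if the data you need depends on the context of the structure, a simple regex is not suited to the task (obligatory reference to "The Center Cannot Hold").
xml.Unmarshal or an xml.Decoder would be better (see pkg/encoding/xml/), using a technique like the one in "Parse both XML element value and attributes for groups".
See this example: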
type Data struct {
Type string `xml:"Type,attr"`
Value string `xml:",chardata"`
}
type Cell struct {
StyleID string `xml:"StyleID,attr"`
Data Data
}
type Row struct {
Afh string `xml:"AutoFitHeight,attr"`
Height string `xml:"Height,attr"`
Cells []Cell `xml:"Cell"`
}
type Column struct{}
type Table struct {
Rows []Row `xml:"Row"`
}
type Worksheet struct {
Table Table `xml:"Table"`
}
w := &Worksheet{}
// w is already a pointer, so it is passed to Unmarshal directly.
err := xml.Unmarshal([]byte(input), w)
if err != nil {
fmt.Printf("error: %v", err)
return
}
fmt.Printf("%+v\n", w)
This extracts all the cells; you can then filter for the cells that contain the data you want (the ones containing {{}}):
&{Table:
{ Rows:[
{Afh:0 Height:
Cells:[
{StyleID:s63 Data:{Type:String Value:This is a title of the sheet!}}
]}
{Afh:0 Height:
Cells:[
{StyleID:s69 Data:{Type:String Value:{{range $prj:=.prj}}}}
{StyleID:s70 Data:{Type: Value:}}
]}
{Afh:0 Height:45.449999999999996
Cells:[
{StyleID:s72 Data:{Type: Value:}}
{StyleID:s70 Data:{Type:String Value:{{$prj.PrjName}}}}
{StyleID:s70 Data:{Type:String Value:{{$prj.ConstrDept}}}}
{StyleID:s71 Data:{Type:String Value:{{$prj.Assumer}}}}
]}
{Afh:0 Height:
Cells:[
{StyleID:s73 Data:{Type:String Value:{{end}}}}
]}
]}}