I am using a regexp to extract data from an .xlsx file, but I am new to regexp and not very good at it. Can someone help me?
package main

import (
	"fmt"
	"regexp"
)

func main() {
input := `
<sheetData>
<row r="2" spans="1:15">
<c r="A2" s="5" ><v>{{range .txt}}</v></c>
<c r="B2" s="5" t="s"><v>1</v></c>
<c r="C2" s="5" t="s"><v>2</v></c>
<c r="D2" s="5" t="s"><v>3</v></c>
<c r="E2" s="5" />
<c r="K2" s="6" t="s"><v>21</v></c>
</row>
<row r="3" spans="1:15">
<c r="A3" s="5" t="s"><v>0</v></c>
<c r="B3" s="5" t="s"><v>1</v></c>
<c r="C3" s="5" t="s"><v>2</v></c>
<c r="D3" s="5" t="s"><v>3</v></c>
<c r="E3" s="5" />
<c r="K3" s="6" t="s"><v>21</v></c>
</row>
</sheetData>`
	r := regexp.MustCompile(`<row[^>]*?r="(\d+)"[^>].*?>.*?[(<v>(.*?)<\/v>.*?)]<\/row>`)
	r2 := regexp.MustCompile(`<v>(.*?)</v>`)
	row := r.FindAllString(input, -1)
	for _, v := range row {
		fmt.Println(r.ReplaceAllStringFunc(v, func(m string) string {
			match := r2.FindAllString(v, -1)
			for kk, vv := range match {
				fmt.Println(kk, vv)
				fmt.Println(r2.ReplaceAllString(v, ""))
			}
		}))
	}
}
Questions:
How do I get the string {{range .txt}} and discard the surrounding <v>...</v> tags?
How do I get "3" from r="3"?
Thanks in advance!
Answer 0 (score: 3)
I think regexp is the wrong tool for this job. Try encoding/xml instead:
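import (
	"encoding/xml"
	"fmt"
)

// Could probably pick better names for these.

// C maps a <c> element: its r attribute and the text of its <v> child.
type C struct {
	XMLName xml.Name `xml:"c"`
	V       string   `xml:"v"`
	R       string   `xml:"r,attr"`
}

// Row maps a <row> element and the cells it contains.
type Row struct {
	XMLName xml.Name `xml:"row"`
	C       []C      `xml:"c"`
}

// Result maps the <sheetData> root element.
type Result struct {
	XMLName xml.Name `xml:"sheetData"`
	Row     []Row    `xml:"row"`
}

// Inside main, where input is the XML string from the question:
v := Result{}
err := xml.Unmarshal([]byte(input), &v)
if err != nil {
	fmt.Printf("error: %v\n", err)
	return
}
// Print each cell's value followed by its cell reference.
for _, r := range v.Row {
	for _, c := range r.C {
		fmt.Printf("%v %v\n", c.V, c.R)
	}
}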
This prints:
{{range .txt}} A2
1 B2
2 C2
3 D2
...
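
The question also asked how to get "3" out of r="3" on the row itself. The same approach can probably cover that by mapping the row's r attribute as well. Below is a minimal, self-contained sketch of that idea. The names Cell, SheetData, parseSheet and sample are hypothetical, chosen only for this illustration, and the sample is a shortened copy of the XML from the question:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Cell mirrors a <c> element: its r attribute and the text of its <v> child.
type Cell struct {
	Ref   string `xml:"r,attr"`
	Value string `xml:"v"`
}

// Row mirrors a <row> element; Num holds its r attribute, e.g. "3".
type Row struct {
	Num   string `xml:"r,attr"`
	Cells []Cell `xml:"c"`
}

// SheetData mirrors the <sheetData> root element.
type SheetData struct {
	XMLName xml.Name `xml:"sheetData"`
	Rows    []Row    `xml:"row"`
}

// parseSheet is a hypothetical helper that decodes a <sheetData> fragment.
func parseSheet(input string) (SheetData, error) {
	var sd SheetData
	err := xml.Unmarshal([]byte(input), &sd)
	return sd, err
}

// sample is a shortened copy of the XML from the question.
const sample = `<sheetData>
<row r="2" spans="1:15">
<c r="A2" s="5" ><v>{{range .txt}}</v></c>
<c r="B2" s="5" t="s"><v>1</v></c>
<c r="E2" s="5" />
</row>
<row r="3" spans="1:15">
<c r="A3" s="5" t="s"><v>0</v></c>
</row>
</sheetData>`

func main() {
	sd, err := parseSheet(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print the row number, the cell reference and the cell value.
	for _, row := range sd.Rows {
		for _, c := range row.Cells {
			fmt.Printf("row %s, cell %s: %q\n", row.Num, c.Ref, c.Value)
		}
	}
}
```

Cells without a <v> child, such as E2, decode with an empty Value, so the caller can decide whether to skip them.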