I have a Python program that produces the following output:
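from bs4 import BeautifulSoup

# Python has no backtick strings; a double-quoted string keeps the apostrophe unescaped.
html = "<h1>This is heading</h1> <p>this is parah <strong>strong</strong> that's how it works</p>"
parsed_html = BeautifulSoup(html, 'html.parser')
# Collect every text node in document order (same result as findAll(text=True)).
all_lines = parsed_html.find_all(string=True)
print(all_lines)
# ['This is heading', ' ', 'this is parah ', 'strong', " that's how it works"]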
I am trying to implement the same thing in Go, but I can't get the desired output. So far, I have tried:
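package main

import (
	"fmt"
	"strings"

	"github.com/PuerkitoBio/goquery"
)

// parseHTML parses body and returns all of its text as one concatenated string.
func parseHTML(body string) string {
	p := strings.NewReader(body)
	doc, err := goquery.NewDocumentFromReader(p)
	if err != nil {
		return ""
	}
	text := doc.Text()
	fmt.Println(text)
	// output: This is heading this is parah strong thats how it works
	return text
}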
Answer 0 (score: 0)
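If you are able to implement the functionality yourself, it looks fairly simple.
Just strip every tag "..." and keep appending the text "..." that follows each tag.
That will give you exactly the same result as the Python output.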
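For what it's worth, the idea above can be sketched with nothing but the standard library. This is only a minimal sketch, not a real HTML parser: `textNodes` is a hypothetical helper name, and it assumes that every `<` opens a tag and every `>` closes one, so comments, `<script>` bodies, entities such as `&amp;`, and a `>` inside an attribute value are not handled.

```go
package main

import "fmt"

// textNodes is a hypothetical helper: it scans body once, skips everything
// between '<' and '>', and collects each run of text found between tags.
// Assumption: '<' and '>' only ever appear as tag delimiters.
func textNodes(body string) []string {
	var nodes []string
	start := 0     // index where the current text run begins
	inTag := false // true while scanning inside <...>
	for i := 0; i < len(body); i++ {
		switch {
		case !inTag && body[i] == '<':
			// A tag begins: keep the text run before it, if any.
			if i > start {
				nodes = append(nodes, body[start:i])
			}
			inTag = true
		case inTag && body[i] == '>':
			// The tag ends: the next text run starts right after it.
			inTag = false
			start = i + 1
		}
	}
	// Keep any text that trails the last tag.
	if !inTag && start < len(body) {
		nodes = append(nodes, body[start:])
	}
	return nodes
}

func main() {
	body := `<h1>This is heading</h1> <p>this is parah <strong>strong</strong> that's how it works</p>`
	fmt.Printf("%q\n", textNodes(body))
}
```

For the HTML in the question this yields the same five strings as the Python list, including the single space between the `</h1>` and `<p>` tags. For anything beyond simple, well-formed markup, a proper tokenizer or parser is the safer choice.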