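I have a large number of documents in plain-text format that I want to put into JSON. This is my parser. It splits the text on "\n", so every line becomes a new JSON string, and I want to change it so that it splits on each paragraph instead.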
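package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"os"
	"strings"
)

func main() {
	// Pages are keyed "page0", "page1", ...; each page maps a key to its value.
	myBigThing := make(map[string]map[string]string)

	in, err := os.Open("strangecountess.txt")
	if err != nil {
		log.Fatalln("Error opening input:", err)
	}
	defer in.Close()

	r := bufio.NewReader(in)
	var currentPage map[string]string
	pageNum := 0
	for {
		line, err := r.ReadString('\n')
		if err != nil && err != io.EOF {
			log.Println("Error in parsing:", err)
			break
		}
		// ReadString keeps the trailing '\n', so trim before testing for a blank line.
		line = strings.TrimSpace(line)
		if line == "" {
			// A blank line ends the current page.
			currentPage = nil
		} else {
			if currentPage == nil {
				currentPage = make(map[string]string)
				myBigThing[fmt.Sprintf("page%d", pageNum)] = currentPage
				pageNum++
			}
			// Only "key:value" lines with exactly one colon are stored.
			tokens := strings.Split(line, ":")
			if len(tokens) == 2 {
				currentPage[tokens[0]] = tokens[1]
			}
		}
		if err == io.EOF {
			// The last line may lack a trailing newline; it was handled above.
			break
		}
	}

	out, err := os.Create("strangecountess.json")
	if err != nil {
		log.Println("Error:", err)
		return
	}
	defer out.Close()

	bout, err := json.Marshal(myBigThing)
	if err != nil {
		log.Println("Error:", err)
		return
	}
	if _, err := out.Write(bout); err != nil {
		log.Println("Error:", err)
	}
}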
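I am open to switching languages for this particular task, and if there is a great library that does this I would be happy to use it. Staying in Go is preferred, though :).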
Answer 0 (score: 0)
If you are open to other tools, jq may meet your needs.
Suppose the file data contains:
When in the course of human events it becomes necessary for one people to dissolve the political bands which have connected them with another and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.
We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed.
The command
$ jq -MR '.' data
produces a sequence of JSON strings, one per input line:
"When in the course of human events it becomes necessary for one people to dissolve the political bands which have connected them with another and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation."
"We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed."
The command
$ jq -MR -n '[inputs]' data
collects those lines into an array:
[
"When in the course of human events it becomes necessary for one people to dissolve the political bands which have connected them with another and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.",
"We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed."
]
Once the data is in JSON, it is easy to add further processing. For example, this filter
$ jq -MR -n '[inputs] | map("\(.[:30])... \(length) characters")' data
summarizes each line:
[
"When in the course of human ev... 404 characters",
"We hold these truths to be sel... 337 characters"
]
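For comparison, the per-line summary could be written in Go roughly as follows. The helper name summarize and its parameter n are made up for this sketch; it counts Unicode code points (runes), which matches how jq defines the length of a string.

```go
package main

import "fmt"

// summarize (hypothetical helper) returns the first n runes of s,
// followed by the total number of runes in s.
func summarize(s string, n int) string {
	runes := []rune(s)
	head := s
	if len(runes) > n {
		head = string(runes[:n])
	}
	return fmt.Sprintf("%s... %d characters", head, len(runes))
}

func main() {
	fmt.Println(summarize("hello world", 5)) // prints "hello... 11 characters"
}
```

Converting to []rune first avoids cutting a multi-byte character in half, which a plain byte slice such as s[:30] could do.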
and this command
$ jq -MR -n 'reduce inputs as $i ({}; .["\(.|length)"]=$i)' data
collects the lines into an object, keyed by line index:
{
"0": "When in the course of human events it becomes necessary for one people to dissolve the political bands which have connected them with another and to assume among the powers of the earth, the separate and equal station to which the Laws of Nature and of Nature's God entitle them, a decent respect to the opinions of mankind requires that they should declare the causes which impel them to the separation.",
"1": "We hold these truths to be self-evident, that all men are created equal, that they are endowed by their Creator with certain unalienable Rights, that among these are Life, Liberty and the pursuit of Happiness. That to secure these rights, Governments are instituted among Men, deriving their just powers from the consent of the governed."
}
An online version is also available at https://jqplay.org/.