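On Unix, without adding anything to the operating system (i.e. using only grep, awk, sed, cut, and so on), how can the following input be split into several files (for example "_temp1.txt", "_temp2.txt", and so on), with a new file starting at each "codeView" line? Note that the line will most likely begin with several spaces.
What should be done if the input comes from an API rather than from an existing file?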
. . .
"events" : [ {
"id" : "123456",
"important" : true,
"codeView" : {
"lines" : [ {
"fragments" : [ {
"type" : "NORMAL_CODE",
"value" : "str = wrapper.getParameter("
}, {
"type" : "NORMAL_CODE",
"value" : ")"
} ],
"text" : "str = wrapper.getParameter("motif")"
} ],
"nested" : false
},
"probableStartLocationView" : {
"lines" : [ {
"fragments" : [ {
"type" : "STACKTRACE_LINE",
"value" : "<init>() @ JSONInputData.java:12"
} ],
"text" : "<init>() @ JSONInputData.java:92"
} ],
"nested" : false
},
"dataView" : {
"lines" : [ {
"fragments" : [ {
"type" : "TAINT_VALUE",
"value" : "CP"
} ],
"text" : "{{#taint}}CP{{/taint}}"
} ],
"nested" : false
},
"collapsedEvents" : [ ],
"dupes" : 0
}, {
"id" : "28861,28862",
"important" : false,
"type" : "P2O",
"description" : "String Operations Occurred",
"extraDetails" : null,
"codeView" : {
"lines" : [ {
"fragments" : [ {
"type" : "TEXT",
"value" : "Over the following lines of code, blah blah."
} ],
"text" : "Over the following lines of code, blah blah."
} ],
"nested" : false
},
"probableStartLocationView" : {
"lines" : [ {
"fragments" : [ {
"type" : "STACKTRACE_LINE",
"value" : "remplaceString() @ O_UtilCaractere.java:234"
} ],
"text" : "remplaceString() @ O_UtilCaractere.java:234"
}, {
"fragments" : [ {
"type" : "STACKTRACE_LINE",
"value" : "replaceString() @ O_UtilCaractere.java:333"
} ],
"text" : "replaceString() @ O_UtilCaractere.java:333"
}, {
"fragments" : [ {
"type" : "STACKTRACE_LINE",
"value" : "creerIncidentPaie() @ Incidents.java:444"
} ],
"text" : "creerIncidentPaie() @ Incidents.java:219"
}, {
"fragments" : [ {
"type" : "STACKTRACE_LINE",
"value" : "repliquerAbsenceIncident() @ Incidents.java:876"
} ],
"text" : "repliquerAbsenceIncident() @ IncidentsPaieMgr.java:882"
} ],
"nested" : false
},
"dataView" : {
"lines" : [ {
"fragments" : [ {
"type" : "TEXT",
"value" : "insert into TGE_INCIDENT...4&apos;, &apos;YYYYMMDD&apos;), &apos;A&apos;, &apos;"
}, {
"type" : "TAINT_VALUE",
"value" : "CP"
}, {
"type" : "TEXT",
"value" : "&apos;, &apos;&apos;, null, &apos;T&apos;, &apos;ADPTVT&apos;, to_date(&apos;2013012214..."
} ],
"text" : "insert into TGE_INCIDENT...4&apos;, &apos;YYYYMMDD&apos;), &apos;A&apos;, &apos;{{#taint}}CP{{/taint}}&apos;, &apos;&apos;, null, &apos;T&apos;, &apos;ADPTVT&apos;, to_date(&apos;2017062214..."
} ],
"nested" : false
}
. . .
Answer 0 (score: 4)
This will work robustly in any awk:
awk '/"codeView"/{close(out); out="_temp" ++c ".txt"} out!=""{print > out}' file
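For the case where the input comes from an API rather than a file, the same awk program can read standard input when no file operand is given. The following is only a minimal sketch: the printf call stands in for a hypothetical API client that writes the JSON to stdout, and the sample lines are made up for illustration. The parentheses around ++c are added only for readability.

```shell
# printf stands in for a hypothetical API client writing JSON to stdout.
printf '%s\n' \
  '"events" : [ {' \
  '    "codeView" : {' \
  '      "nested" : false' \
  '    "codeView" : {' \
  '      "nested" : true' |
awk '
  # At each "codeView" line, close the previous file and start the next one.
  /"codeView"/ { close(out); out = "_temp" (++c) ".txt" }
  # Lines before the first "codeView" are skipped because out is still empty.
  out != ""    { print > out }
'
```

With this input, _temp1.txt and _temp2.txt each receive one "codeView" line plus the line that follows it, and the leading "events" line is written nowhere.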
Answer 1 (score: 2)
Try:
csplit -f _temp -b %d.tmp file '/codeView/' '{*}'
Or, if the data comes from some other program:
my_api | csplit -f _temp -b %d.tmp - '/codeView/' '{*}'
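-f _temp -b %d.tmp
Together, these two options set the names of the split files: -f gives the prefix and -b gives the format of the numeric suffix.
file
Replace this with the name of your input file. If the input comes from standard input, use - instead.
/codeView/
This is the regular expression at which the input is split.
'{*}'
This tells csplit not to stop at the first match but to keep splitting at every match.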
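To see how the output files are numbered, here is a small sketch run on made-up input (the five sample lines are illustrative, not taken from the real data). It assumes GNU csplit, since -b and '{*}' are, as far as I know, GNU extensions rather than POSIX features. The -s option is added only to suppress the byte counts that csplit otherwise prints.

```shell
# Made-up sample input piped on stdin; "-" tells csplit to read stdin.
printf '%s\n' \
  'preamble' \
  '  "codeView" : {' \
  'first' \
  '  "codeView" : {' \
  'second' |
csplit -s -f _temp -b %d.tmp - '/codeView/' '{*}'

# Show what ended up where.
for f in _temp0.tmp _temp1.tmp _temp2.tmp; do
  echo "== $f"
  cat "$f"
done
```

Everything before the first match lands in _temp0.tmp, and each later file begins with a matching line. To get names ending in .txt, the suffix format can be changed to %d.txt.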
Answer 2 (score: 2)
awk to the rescue!
$ awk '/"codeView"/{c++} {print > ("_temp" (c+0) ".txt")}' file
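Everything up to the first matching header goes into the 0th temp file. If the key might also appear inside the content, you can replace the pattern match with a literal comparison on the first field: $1=="\"codeView\""
You can also pipe the data into the awk script instead of reading it from a file.
If the input produces many output files, awk may run out of open file descriptors; in that case, close each file once you are done writing to it.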
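The point about closing files can be sketched as follows. This is one possible variant of the command above, not the answerer's own code: it closes the previous output file whenever a new "codeView" line is seen, so at most one output file is open at a time. The file name sample_input.txt and its contents are made up for the demonstration.

```shell
# Hypothetical sample input; any text containing "codeView" lines would do.
printf '%s\n' \
  'header' \
  '  "codeView" : {' \
  'a' \
  '  "codeView" : {' \
  'b' > sample_input.txt

awk '
  # A new section starts: close the file written so far, then bump the counter.
  /"codeView"/ { close(out); c++ }
  # c+0 forces a numeric 0 before the first match, as in the original command.
  { out = "_temp" (c+0) ".txt"; print > out }
' sample_input.txt
```

Because each file name is used for one contiguous run of lines, a closed file is never reopened, so the plain > redirection does not truncate earlier output.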