I have a file with one JSON object per line, in the following format:
{"id":13, "url":"https://sub.domain.com/path", "dm":"-", "ip":"192.168.0.1"}
{"id":14, "url":"sub.domain2.com/?param=value", "dm":"-", "ip":"192.168.0.1"}
{"id":15, "url":"domain.com/path", "dm":"prefilled.com", "ip":"192.168.0.1"}
I need to replace "dm":"-" with the domain taken from the url on the same line, to get this output:
{"id":13, "url":"https://sub.domain.com/path", "dm":"sub.domain.com", "ip":"192.168.0.1"}
{"id":14, "url":"sub.domain2.com/?param=value", "dm":"sub.domain2.com", "ip":"192.168.0.1"}
{"id":15, "url":"domain.com/path", "dm":"prefilled.com", "ip":"192.168.0.1"}
Is there any bash command that acts only on the lines containing "dm":"-", in an optimized way? The file is more than 10k lines long.
Answer 0: (score: 4)
Using jq-1.5 (the latest version atm), you can do the following:
jq -c 'if .dm == "-" then .dm = (.url|sub("^https?://";"")|sub("/.*";"")) else . end' a.json
Explanation:
if .dm == "-" ... # Runs the following only if .dm exists and its value is "-"
.dm = (...) # Assigns the result to .dm
.url|sub("^https?://"; "") # Takes .url and removes a leading http:// or https://
...|sub("/.*"; "") # Removes everything from the first / onward (including the / itself)
else . end # Otherwise passes the object through unchanged
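To make the two sub() steps concrete, here is a minimal sketch of the same stripping done with plain POSIX shell parameter expansion. The function name url_to_domain is made up for illustration, and unlike the regex it only recognises the exact prefixes http:// and https://.

```shell
# url_to_domain is a hypothetical helper, not part of jq or any standard tool.
# It mirrors the two sub() calls: drop the scheme, then drop the path.
url_to_domain() {
  u=$1
  u=${u#http://}            # remove a leading http://, if present
  u=${u#https://}           # remove a leading https://, if present
  printf '%s\n' "${u%%/*}"  # remove everything from the first / onward
}
```

For example, `url_to_domain 'https://sub.domain.com/path'` prints `sub.domain.com`. Calling a shell function once per line would be slow on a large file, so this is meant to illustrate the logic rather than replace the jq filter.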
Answer 1: (score: 1)
With GNU or OSX sed, using -E for ERE support:
$ sed -E 's#(.*"url":"([^"]+\/\/)?([^"/]+).*"dm":")-"#\1\3"#' file
{"id":13, "url":"https://sub.domain.com/path", "dm":"sub.domain.com", "ip":"192.168.0.1"}
{"id":14, "url":"sub.domain2.com/?param=value", "dm":"sub.domain2.com", "ip":"192.168.0.1"}
{"id":15, "url":"domain.com/path", "dm":"prefilled.com", "ip":"192.168.0.1"}
With GNU awk, for the 3rd arg to match():
$ awk 'match($0,/(.*"url":"([^"]+\/\/)?([^"/]+).*"dm":")-(".*)/,a){$0=a[1] a[3] a[4]} 1' file
{"id":13, "url":"https://sub.domain.com/path", "dm":"sub.domain.com", "ip":"192.168.0.1"}
{"id":14, "url":"sub.domain2.com/?param=value", "dm":"sub.domain2.com", "ip":"192.168.0.1"}
{"id":15, "url":"domain.com/path", "dm":"prefilled.com", "ip":"192.168.0.1"}
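The three-argument match() is a gawk extension. For an awk without it, a rough sketch using only index(), sub() and substr() could look like the following. The wrapper name fill_dm is hypothetical, and the code assumes the keys appear exactly as "url":" and "dm":"-" with no extra spaces, as in the sample lines. Lines without the placeholder fail the index() test and are printed untouched, which is the behaviour the question asks for.

```shell
# fill_dm is a hypothetical name; it reads the files given as arguments, or stdin.
fill_dm() {
  awk '
    # i is the position of the placeholder, or 0 when the line has none.
    (i = index($0, "\"dm\":\"-\"")) {
      host = $0
      sub(/.*"url":"/, "", host)      # drop everything up to the url value
      sub(/^https?:\/\//, "", host)   # drop a leading http:// or https://
      sub(/".*/, "", host)            # cut at the closing quote of the url
      sub(/\/.*/, "", host)           # cut at the first slash, keeping the host
      # Rebuild the line around the dash, which sits at position i + 6.
      $0 = substr($0, 1, i + 5) host substr($0, i + 7)
    }
    { print }
  ' "$@"
}
```

It can be called as `fill_dm file > file.new`. Splicing with substr() instead of a final sub() avoids surprises if the host ever contains a character such as `&`, which sub() treats specially in its replacement string.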
Answer 2: (score: 0)
You can do this with sed, but if the format can vary at all, I would suggest using something that actually understands the data:
sed -i -r 's/^(.*"url":")(.*\/\/)?(.*)(\/.*)"-"/\1\2\3\4"\3"/g' your_file