I have several JSON files that look like the sample below:
#sample json
{"urlCurrent":"https://www.website1.com/inside/377/388/408/8002.html?utm_source=source&utm_medium=Click&utm_campaign=123","id":"00001"}
{"urlCurrent":"https://127.0.0.1/inside/414/756/765/34984.html","id":"00002"}
{"urlCurrent":"https://msdn.anything.com/en-us","id":"00002"}
{"urlCurrent":"https://web.something.com/","id":"00002"}
I want the JSON to become:
#result json
{"urlCurrent":"https://www.website1.com/","id":"00001"}
{"urlCurrent":"https://127.0.0.1/","id":"00002"}
{"urlCurrent":"https://msdn.anything.com/","id":"00002"}
{"urlCurrent":"https://web.something.com/","id":"00002"}
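I think something like
sed -i 's/{regular expression}/\ /g' sample.json
which replaces the unwanted trailing part of each URL with a blank, could achieve this. However, I don't know how to write a regular expression that matches the pattern I need, and I don't know which keywords to search for to find out.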
Is there a way to truncate urlCurrent to get the result I need? Thanks in advance!
Update 12/23: this works:
sed -E -i -r 's!(http|ftp|https)://([0-9a-zA-Z\.]+)([0-9a-zA-Z\/\.?#=_&%~+-]+)!\2!g' sample.json
Answer 0 (score: 1)
sed -i -r 's/(.*:\/\/?[^\/]+\/?)[^\"]*(.*)/\1\2/' sample.json
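To see what that expression does before editing files in place, here is a minimal sketch that applies the same substitution to standard input. The function name truncate_urls is hypothetical, and the sketch assumes a sed that accepts -E (GNU and BSD sed both do) and one JSON object per line, as in the sample. Using # as the delimiter avoids having to escape the slashes.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original answer): reads JSON lines on
# stdin and writes them to stdout with each URL cut back to scheme://host/.
truncate_urls() {
  # (.*://?[^/]+/?)  group 1: everything up to and including "scheme://host/"
  # [^"]*            the rest of the URL, up to the closing quote (dropped)
  # (.*)             group 2: the remainder of the line, from the closing quote
  sed -E 's#(.*://?[^/]+/?)[^"]*(.*)#\1\2#'
}

printf '%s\n' '{"urlCurrent":"https://msdn.anything.com/en-us","id":"00002"}' | truncate_urls
# prints {"urlCurrent":"https://msdn.anything.com/","id":"00002"}
```

Note that the pattern relies on a slash following the host: a URL such as https://example.com with no trailing slash is left unchanged, because [^/]+ then runs to the end of the line.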