Question

I am trying to parse a log file by pulling out two columns (the timestamp and the URL), where the file format is:
1470700748 foo="narf1" url="http://narf2.com" bar="narf3"

Apart from the timestamp, which always comes first, the columns are not guaranteed to appear in the same order.

Getting the timestamp is easy:
grep -Eo '^[^ ]+' test.txt
or
sed 's/ .*//' test.txt

I have never managed to extract the URL correctly, nor to extract both fields at once:
sed -n 's/.*url="\(.*\)".*/\1/p' test.txt

The above works when there are no blank lines, so I am also struggling to combine that sed command with:
sed -e /^$/d test.txt

Most other SO posts deal with a fixed column order, and I could not get them to work here. I have tried many permutations of grep, sed, awk, and cut.

Has anyone done something similar? Given 1470700748 foo="narf1" url="http://narf2.com" bar="narf3", I am trying to get: 1470700748 http://narf2.com

Answer 1

Here you go...

$ grep -oP '^[0-9]+|(?<=url=")[^"]+' file | xargs

1470700748 http://narf2.com
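
A note on running this against a file with more than one line: grep -o prints each match on its own line, and a bare xargs then joins every match onto a single line. The sketch below is one possible adjustment, not part of the original answer. It uses xargs -n2 to regroup the matches two at a time, and it assumes GNU grep (for -P) and that every non-blank line contains a url= field. The function name extract_pairs and the second sample line are made up for illustration.

```shell
# Hypothetical helper: reads log lines on stdin, prints "timestamp url" pairs.
# grep -o emits one match per line; xargs -n2 puts two matches on each output line.
# Assumes GNU grep (-P) and that every non-blank line has a url= field;
# a line without url= would shift the pairing for all lines after it.
extract_pairs() {
  grep -oP '^[0-9]+|(?<=url=")[^"]+' | xargs -n2
}

# Made-up input: two records with keys in different orders, plus a blank line.
printf '%s\n' \
  '1470700748 foo="narf1" url="http://narf2.com" bar="narf3"' \
  '' \
  '1470700749 url="http://narf4.com" bar="narf5" foo="narf6"' |
  extract_pairs
```

Blank lines produce no matches under grep -o, so they need no separate filtering here.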

Answer 2

$ sed -E -n 's/([^ ]+).* url="([^"]+).*/\1 \2/p' file
1470700748 http://narf2.com
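
As far as I can tell, this also covers the blank-line problem from the question without a separate /^$/d step: -n turns off the default output, and the p flag prints a line only when the substitution matched. The sketch below wraps the same command in a function (the name extract_with_sed and all input lines after the first are made up) to show what happens with a blank line, reordered keys, and a line that has no url field.

```shell
# Hypothetical helper: same substitution as above, reading from stdin.
# ([^ ]+) captures the leading timestamp, url="([^"]+) captures the URL value.
# With -n and the trailing p, only lines where the substitution matched are printed.
extract_with_sed() {
  sed -E -n 's/([^ ]+).* url="([^"]+).*/\1 \2/p'
}

# Made-up input: a blank line, reordered keys, and a line without url=.
printf '%s\n' \
  '1470700748 foo="narf1" url="http://narf2.com" bar="narf3"' \
  '' \
  '1470700749 url="http://narf4.com" bar="narf5" foo="narf6"' \
  '1470700750 foo="narf7"' |
  extract_with_sed
```

Only the first and third input lines yield output. The blank line and the line without url= are silently skipped, which may or may not be what you want if missing URLs need to be reported.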