awk: preserving the original field separators when using multiple delimiters

Time: 2017-06-28 16:46:40

Tags: python bash awk sed

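I am trying to renumber the line_id field in myfile1.txt, where each line contains multiple delimiters. The end goal is to get a Python list of dictionaries from this data. Each line will become a dictionary, so the delimiters ":" and "," are very important to me.

Here is a snippet of myfile.txt: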

"line_id":57,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Entered the room"
"line_id":58,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Left the room"
"line_id":59,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Quit the group"
"line_id":60,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Late to the party"
"line_id":61,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Not responding"

The following awk statement works fine, except that I lose all the delimiters. They are replaced by spaces.

awk -F [:,] '$2=$2-56' myfile1.json >> myfile2.txt

The result is:

"line_id" 1 "name" "Test File" "seq_number":26 "user" "user1" "text_entry" "Entered the room"
"line_id" 2 "name" "Test File" "seq_number":26 "user" "user1" "text_entry" "Left the room"
"line_id" 3 "name" "Test File" "seq_number":26 "user" "user1" "text_entry" "Quit the group"
"line_id" 4 "name" "Test File" "seq_number":1 "user" "user2" "text_entry" "Late to the party"
"line_id" 5 "name" "Test File" "seq_number":1 "user" "user2" "text_entry" "Not responding"

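Now I have to put the ":" and "," back in the appropriate places. I explored sed, but did not find an easy way to do subtraction on the second field.

I went through this link, which was not of much help for my requirement. Please advise.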

1 answer:

Answer 0: (score: 1)

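  1. Use a comma as both the input and output field separator, so that awk rejoins the fields with commas when it rebuilds the record after $1 is modified.
  2. Use awk's split function to split the first column on the colon.
  3. Repopulate $1 after subtracting 56 from the second element of the split array.

Code: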

    awk 'BEGIN{FS=OFS=","} {split($1, a, /:/); $1 = a[1] ":" a[2] - 56} 1' file
    
    "line_id":1,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Entered the room"
    "line_id":2,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Left the room"
    "line_id":3,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Quit the group"
    "line_id":4,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Late to the party"
    "line_id":5,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Not responding"
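
Since the stated end goal is a Python list of dictionaries, the renumbering could also be done on the Python side. The following is only a minimal sketch, assuming every line is the body of a valid JSON object as in the sample above; `parse_lines` and `offset` are hypothetical names chosen for illustration, not something from the question.

```python
import json


def parse_lines(lines, offset=56):
    """Convert lines such as '"line_id":57,"name":"Test File"' into dicts.

    Hypothetical helper: assumes each non-blank line is the inside of a
    JSON object, i.e. it becomes valid JSON once wrapped in braces.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Wrap the line in braces so the json module can parse it.
        record = json.loads("{" + line + "}")
        # Same subtraction as in the awk one-liner.
        record["line_id"] -= offset
        records.append(record)
    return records


# Two lines copied from the sample data in the question.
sample = [
    '"line_id":57,"name":"Test File","seq_number":26,"user":"user1","text_entry":"Entered the room"',
    '"line_id":60,"name":"Test File","seq_number":1,"user":"user2","text_entry":"Late to the party"',
]

records = parse_lines(sample)
print(records[0]["line_id"], records[1]["user"])  # prints: 1 user2
```

To process the real file, the same function can be given an open file object, for example `parse_lines(open("myfile1.txt"))`, because iterating over a file yields its lines.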