Parse a block of data to get the id and email in one go

Asked: 2013-01-03 19:55:16

Tags: awk grep

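I need some advice on whether this can be done. Running a command produces the data below (this is just one entry; there will be blocks of such entries). Is there a way to use grep and awk to parse each entry and get the number and the owner's email in one go, as shown below?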

Output:-

12345 mbarry@int.qualcomm.com

Input:-
change I5e55796844350e543f8460c53ec6e755ebe663d4
  project: platform/vendor/company-proprietary/chip
  branch: master
  id: I5e55796844350e543f8460c53ec6e755ebe663d4
  number: 12345
  subject: chip: changes to tl logging structure
  owner:
    name: Gord barry
    email: mbarry@int.qualcomm.com
    username: mbarry
  url: https://review-android.quicinc.com/12345
  commitMessage: chip: changes to tl logging structure

                 The existing TL logging has been divided into three distinct modules:
                 TL_BA (14), TL_HO (13) and TL (existing module). Thus the log with
                 loglevel 5 in file chip_qct_tl_hostsupport.c can be viewed by issuing
                 the following command - iwpriv chip0 setchipdbg 13 5 1.

                 Change-Id: I5e55796844350e543f8460c53ec6e755ebe663d4
  createdOn: 2012-08-09 15:40:57 PDT
  lastUpdated: 2012-08-21 16:43:08 PDT
  sortKey: 001f390f00023ead
  open: true
  status: NEW
  currentPatchSet:
    number: 3
    revision: 922872178946a712ab9f04483bc93216573cec6e
    parents:
 [ae259408e6ab530be62e02fdeafef34834d68709]
    ref: refs/changes/17/12345/3
    uploader:
      name: Gord barry
      email: mbarry@int.qualcomm.com
      username: mbarry
    createdOn: 2012-08-21 16:43:08 PDT
    files:
      file: /COMMIT_MSG
      type: ADDED
    files:
      file: rich/CORE/TL/inc/tlDebug.h
      type: MODIFIED
    files:
      file: rich/CORE/TL/inc/chip_qct_tl.h
      type: MODIFIED
    files:
      file: rich/CORE/TL/src/chip_qct_tl.c
      type: MODIFIED
    files:
      file: rich/CORE/TL/src/chip_qct_tl_ba.c
      type: MODIFIED
    files:
      file: rich/CORE/TL/src/chip_qct_tl_hosupport.c
      type: MODIFIED
    files:
      file: rich/CORE/VOSS/inc/vos_types.h
      type: MODIFIED
    files:
      file: rich/CORE/VOSS/src/vos_trace.c
      type: MODIFIED
    files:
      file: rich/CORE/WDA/src/chip_qct_wda_ds.c
      type: MODIFIED
    files:
      file: rich/CORE/WDI/TRP/DTS/src/chip_qct_wdi_dts.c
      type: MODIFIED

2 Answers:

Answer 0 (score: 3)

In one pass with grep:

$ grep -Po '(?<=(email|umber): )\S+' file
12345 
mbarry@int.qualcomm.com
3 
mbarry@int.qualcomm.com

Use xargs -n2 to put each pair on one line:

$ grep -Po '(?<=(email|umber): )\S+' file | xargs -n2
12345 mbarry@int.qualcomm.com
3 mbarry@int.qualcomm.com

Or use paste - - to join each pair with a tab:

$ grep -Po '(?<=(email|umber): )\S+' file | paste - -
12345   mbarry@int.qualcomm.com
3       mbarry@int.qualcomm.com

Explanation:

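A positive lookbehind '(?<=a)b' matches b only where it is preceded by a. In your case you want the string that follows email: or number:, but the lookbehind here must have a fixed length, so we drop the n from number so that both alternatives (email and umber) are five characters long. \S+ then matches one or more non-whitespace characters.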

(?<=   # Positive lookbehind 
(      # Group for alternation
email  # Literal string email
|      # Alternation (or)
umber  # Literal string umber
)      # Close group
:      # Literal colon followed by a single space
)      # Close positive lookbehind
\S+    # One or more non-whitespace characters
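
As a possible alternative to trimming the word to fit the fixed-length rule, GNU grep's -P mode also understands \K, which drops everything matched so far from the reported match. The sketch below runs both forms on a small made-up sample (the temporary file and its trimmed contents are assumptions for illustration, not part of the original answer):

```shell
# Hypothetical sample holding only the relevant lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
  number: 12345
  owner:
    name: Gord barry
    email: mbarry@int.qualcomm.com
EOF

# Lookbehind form from the answer: both alternatives are five characters long.
lookbehind_out=$(grep -Po '(?<=(email|umber): )\S+' "$sample" | xargs -n2)

# \K form: match "email: " or "number: ", then discard it from the output.
keep_out=$(grep -Po '\b(email|number): \K\S+' "$sample" | xargs -n2)

echo "$lookbehind_out"
echo "$keep_out"
rm -f "$sample"
```

Both commands should print the same pair. The \K form keeps the full word number in the pattern, which may be easier to read.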

Using awk:

$ awk -F: '/email|number/{print $2}' file | xargs -n2
12345 mbarry@int.qualcomm.com
3 mbarry@int.qualcomm.com
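
The awk command above leaves a leading space in $2 and relies on xargs to trim it and pair up the lines. A minimal sketch that does both inside awk, assuming every number: line is followed by exactly one email: line as in the posted sample (the temporary file and its trimmed contents are made up for illustration):

```shell
# Hypothetical sample: only the number/email lines and their parents.
sample=$(mktemp)
cat > "$sample" <<'EOF'
  number: 12345
  owner:
    name: Gord barry
    email: mbarry@int.qualcomm.com
  currentPatchSet:
    number: 3
    uploader:
      email: mbarry@int.qualcomm.com
EOF

# Split on a colon plus any following spaces so $2 has no leading blank.
# Remember each number, then print it next to the next email that appears.
pairs=$(awk -F': *' '
  /^ *number:/ { num = $2 }
  /^ *email:/  { print num, $2 }
' "$sample")

echo "$pairs"
rm -f "$sample"
```

Anchoring the patterns at the start of the line also avoids accidental matches if the words email or number ever appear inside a commit message.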

Answer 1 (score: 1)

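Try one of these. The first prints the last number and email found in the file, the second prints a pair at every email line, and the third prints the first pair and then exits: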

awk -F'[[:space:]:]+' '{a[$2]=$3} END{ print a["number"], a["email"] }' file
awk -F'[[:space:]:]+' '{a[$2]=$3} /email:/{ print a["number"], a["email"] }' file
awk -F'[[:space:]:]+' '{a[$2]=$3} /email:/{ print a["number"], a["email"]; exit }' file

If none of these is what you're looking for, please provide more details about what you want.

Here is how the last script above behaves with the posted sample input:

$ head -15 file
change I5e55796844350e543f8460c53ec6e755ebe663d4
  project: platform/vendor/company-proprietary/chip
  branch: master
  id: I5e55796844350e543f8460c53ec6e755ebe663d4
  number: 12345
  subject: chip: changes to tl logging structure
  owner:
    name: Gord barry
    email: mbarry@int.qualcomm.com
    username: mbarry
  url: https://review-android.quicinc.com/12345
  commitMessage: chip: changes to tl logging structure

                 The existing TL logging has been divided into three distinct modules:
                 TL_BA (14), TL_HO (13) and TL (existing module). Thus the log with

$ awk -F'[[:space:]:]+' '{a[$2]=$3} /email:/{ print a["number"], a["email"]; exit }' file
12345 mbarry@int.qualcomm.com
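
Since the question mentions that the real output contains several such blocks, the exit in the last script would stop after the first one. A possible per-block variant is sketched below. It assumes each block starts with a line beginning with "change " and that the first number and first email inside a block are the change number and the owner's email. The second block, its id, and the example.com addresses are invented for illustration:

```shell
# Hypothetical two-block sample; the second block is made up.
sample=$(mktemp)
cat > "$sample" <<'EOF'
change Iaaaa
  number: 12345
  owner:
    email: mbarry@int.qualcomm.com
  currentPatchSet:
    number: 3
    uploader:
      email: mbarry@int.qualcomm.com
change Ibbbb
  number: 67890
  owner:
    email: someone@example.com
  currentPatchSet:
    number: 1
    uploader:
      email: other@example.com
EOF

# Same field separator as the answer: $2 is the key, $3 is the value.
per_change=$(awk -F'[[:space:]:]+' '
  /^change /                 { num = ""; done = 0 }        # new block: reset
  $2 == "number" && num == "" { num = $3 }                 # first number only
  $2 == "email" && !done      { print num, $3; done = 1 }  # first email only
' "$sample")

echo "$per_change"
rm -f "$sample"
```

The patch set number and the uploader's email are skipped because they come after the first number and email in each block.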