Parse a JSON file and output the string that follows a given sequence of letters and characters

Asked: 2017-05-25 07:26:55

Tags: python json regex shell awk

The file looks like this:

{"project":"platform/xxxxx/xxxxxx/build/repo","branch":"xxxxx_xx.xxxxx.xxx.1.0-dev","id":"T19797TIE76757IT78689899G","number":"1917095","subject":"xxxxx-2.0: blah blah blah","owner":{"name":"David","email":"david@xxxx.com","username":"david"},"url":"https://link_to_repo.com/1917095","createdOn":1493282302,"lastUpdated":1493813064,"sortKey":"000899786887","open":false,"status":"MERGED"

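I need the number 1917095, which comes after the string "number":" or after the string "https://link_to_repo.com/". The output should contain only this number, even if the position of the fields changes.

I tried to achieve this with: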

awk -F'[,:"]' '{ print $23 }' file_name

That gives me the result, but I need a better solution.

So I need help achieving this with Python (I am new to it) or with any tool available in bash.

3 Answers:

Answer 0: (score: 0)

You need something like this.

If the file contains only JSON:

Create a file named, for example, "string.py" and put the following code in it:

#!/usr/bin/python

import sys, json

# argv[1]: path to the JSON file, argv[2]: name of the top-level attribute.
# json.load() needs valid JSON, so the closing } must be present in the file.
with open(sys.argv[1]) as data_file:
    data = json.load(data_file)
print(data[sys.argv[2]])

Then call it like this:

python [file_name] [json_file] [json_attribute]

Example:

python string.py test.json number

Using a string search:

#!/usr/bin/python

import sys

with open(sys.argv[1]) as data_file:
    data = data_file.readlines()

a = '"number":"'  # text immediately before the value
b = '"'           # closing quote of the value, so the next field does not matter
d = ''.join(data)
result = d.split(a)[-1].split(b)[0]
print(result)

Then call it with the file as its only argument (the script reads just sys.argv[1]):

python [file_name] [json_file]
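
Both approaches can be combined. The sample in the question, as posted, lacks its closing brace, so a JSON parser would reject it while a string search still works. The following is only a minimal sketch, assuming you want real JSON parsing first and the string search as a fallback; the helper name `get_number` and the shortened sample strings are hypothetical and not part of the answer.

```python
import json


def get_number(text):
    """Return the value of the "number" field, or None if it cannot be found."""
    try:
        data = json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        data = None
    if isinstance(data, dict):
        # Preferred path: real JSON parsing, independent of field order.
        return data.get("number")
    # Fallback for malformed input: take the text between the marker
    # and the next double quote.
    marker = '"number":"'
    if marker not in text:
        return None
    return text.split(marker, 1)[1].split('"', 1)[0]


# Shortened, hypothetical samples modelled on the question's data.
valid = '{"number":"1917095","subject":"xxxxx-2.0: blah blah blah"}'
truncated = '{"subject":"xxxxx-2.0: blah blah blah","number":"1917095","open":false'

print(get_number(valid))
print(get_number(truncated))
```

The fallback is deliberately crude: it assumes the value is a quoted string with no escaped quotes inside it, which holds for the sample but not for arbitrary JSON.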

Answer 1: (score: 0)

A regular expression can be used to select just the digits.

$ awk '{ if (match($0,/number":"([0-9]*)"/,m)) print m[1] }' af.txt
1917095

If you need to get it from the URL instead:

$ awk '{ if (match($0,/link_to_repo.com\/([0-9]*)"/,m)) print m[1] }' af.txt
1917095

$ awk '{ if (match($0,/link_to_repo.com\/([[:digit:]]*)"/,m)) print m[1] }' af.txt
1917095

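I tried hard to use \d to match the digits, but it is a Perl extension and is not available in awk regular expressions.

If these don't work, which awk version are you using? Check with awk --version. The three-argument form of match() used above is a GNU awk (gawk) extension.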
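
For comparison, the same two patterns can be tried in Python, where `\d` is available in the `re` module. This is a sketch rather than a translation of the awk commands: the function name `find_number` and the shortened sample line are assumptions made for illustration.

```python
import re

# Pattern for the "number" field and, as an alternative, for the digits
# that end the repository URL.
NUMBER_RE = re.compile(r'"number":"(\d+)"')
URL_RE = re.compile(r'link_to_repo\.com/(\d+)"')


def find_number(line):
    """Return the first run of digits matched by either pattern, else None."""
    for pattern in (NUMBER_RE, URL_RE):
        match = pattern.search(line)
        if match:
            return match.group(1)
    return None


# Shortened, hypothetical sample with the fields in a different order.
line = '{"url":"https://link_to_repo.com/1917095","number":"1917095","open":false'
print(find_number(line))
```

Because each pattern is searched anywhere in the line, the result does not depend on where the field appears.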

Answer 2: (score: 0)

I changed your data slightly because the closing } was missing at the end. This code was tested under Python 3.6.

import json

data = {"project":"platform/xxxxx/xxxxxx/build/repo","branch":"xxxxx_xx.xxxxx.xxx.1.0-dev","id":"T19797TIE76757IT78689899G",
    "number":"1917095","subject":"xxxxx-2.0: blah blah blah","owner":{"name":"David","email":"david@xxxx.com","username":"david"},
    "url":"https://link_to_repo.com/1917095","createdOn":"1493282302","lastUpdated":"1493813064","sortKey":"000899786887","open":"false","status":"MERGED"}

# dumps() serializes the dict to a JSON string; loads() parses it back into a dict
jsonobject = json.dumps(data)
#print(jsonobject)
jsonobjectToString = json.loads(jsonobject)
#print(jsonobjectToString)

print(jsonobjectToString["number"])

=====
1917095
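
The round trip above starts from a Python dict. When the data arrives as text, as it does when read from a file, `json.loads` alone is enough. Below is a small sketch under that assumption, using a shortened copy of the question's record with the closing brace restored. Reading the number back out of the URL with `urllib.parse` is an optional cross-check added here, not something the answer does.

```python
import json
from urllib.parse import urlparse

# Shortened, hypothetical copy of the record, with the closing brace restored.
raw = ('{"number":"1917095","url":"https://link_to_repo.com/1917095",'
       '"open":false,"status":"MERGED"}')

record = json.loads(raw)  # parse the JSON text into a dict
number = record["number"]

# The last path segment of the URL carries the same number.
from_url = urlparse(record["url"]).path.rsplit("/", 1)[-1]

print(number, from_url)
```

Parsing the text directly also keeps the original types: `"open":false` becomes the Python value `False` instead of the string `"false"`.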