How to extract specific data from JSON output based on today's date

Time: 2019-06-19 22:47:20

Tags: json python-2.7 dictionary

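I'm accessing an API that lets me pull employees' clock-in and clock-out times for a given pay period. The JSON output contains a lot of data I don't need. All I really need is the employee ID number, the employee's full name, and the time punches for the current day. However, the smallest data set I can pull is a full week, so to get today's data I need to filter out the other days of the week.

My approach so far has been to extract the data I need into a dictionary, date_dict, based on specific key names. The problem is that the dictionary ends up filled with the punch data for every day, and I only want the punches that contain today's date. Below is the code I'm using and a sample of the output it produces, followed by the raw JSON data returned by the API.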

import datetime

today = datetime.date.today().strftime("%Y-%m-%d")
parsed = "json data is here"  # placeholder for the decoded JSON list shown below

for item in parsed:
    # Copy only the fields of interest into a fresh dict for each punch
    date_dict={}
    date_dict['Employee']=item.get('Employee').get('EmployeeId')
    date_dict['EmployeeName']=item.get('Employee').get('FullName')
    date_dict['PunchInDateTime']=item.get('PunchInDateTime')
    date_dict['PunchOutDateTime']=item.get('PunchOutDateTime')
    print date_dict

Output:

{'Employee': u'080097', 'PunchInDateTime': u'2019-06-18T08:43:00', 'PunchOutDateTime': u'2019-06-18T13:43:00', 'EmployeeName': u'Peter Quill'}
{'Employee': u'080097', 'PunchInDateTime': u'2019-06-19T08:00:00', 'PunchOutDateTime': u'2019-06-19T09:16:00', 'EmployeeName': u'Peter Quill'}

Raw JSON data from the API:

[{
  "Id": 12970292,
  "Employee": {
    "Id": 346968,
    "Username": "starlord",
    "FirstName": "Peter",
    "LastName": "Quill",
    "Email": "starlord@email.com",
    "EmployeeId": "080097",
    "IsActive": true,
    "FullName": "Peter Quill",
    "ProfileMiniImageUrl": "https://buddypunchapp.blob.core.windows.net/profileminipics/new_employee_face2.jpg"
  },
  "LocationId": null,
  "LocationName": "",
  "JobCodeId": null,
  "JobCodeName": "",
  "PunchInDateTime": "2019-06-18T08:43:00",
  "PunchOutDateTime": "2019-06-18T13:43:00",
  "PunchInApprovalStatusId": 4,
  "PunchInApprovalStatusName": "Changed By Manager",
  "PunchOutApprovalStatusId": 4,
  "PunchOutApprovalStatusName": "Changed By Manager",
  "PunchInIpAddress": "50.194.130.13",
  "PunchOutIpAddress": "50.194.130.13",
  "PunchInImageUrl": "",
  "PunchOutImageUrl": "",
  "Hours": 5.0,
  "RegularHours": 5.0,
  "OverTimeHours": 0.0,
  "DoubleTimeHours": 0.0,
  "PTOHours": null,
  "Duration": "05:00:00",
  "PTOEarningCodeId": null,
  "PTOEarningCodeAbbr": "",
  "BreakMinutes": 0,
  "BreakApprovalStatusId": null,
  "BreakApprovalStatusName": null,
  "PunchOutLongitude": null,
  "PunchInLongitude": null,
  "PunchOutLatitude": null,
  "PunchInLatitude": null,
  "PunchInNotes": "",
  "PunchOutNotes": ""
}, {
  "Id": 12983841,
  "Employee": {
    "Id": 346968,
    "Username": "starlord",
    "FirstName": "Peter",
    "LastName": "Quill",
    "Email": "starlord@email.com",
    "EmployeeId": "080097",
    "IsActive": true,
    "FullName": "Peter Quill",
    "ProfileMiniImageUrl": "https://buddypunchapp.blob.core.windows.net/profileminipics/new_employee_face2.jpg"
  },
  "LocationId": null,
  "LocationName": "",
  "JobCodeId": null,
  "JobCodeName": "",
  "PunchInDateTime": "2019-06-19T08:00:00",
  "PunchOutDateTime": "2019-06-19T09:16:00",
  "PunchInApprovalStatusId": 4,
  "PunchInApprovalStatusName": "Changed By Manager",
  "PunchOutApprovalStatusId": 4,
  "PunchOutApprovalStatusName": "Changed By Manager",
  "PunchInIpAddress": "50.194.130.13",
  "PunchOutIpAddress": "50.194.130.13",
  "PunchInImageUrl": "",
  "PunchOutImageUrl": "",
  "Hours": 1.267,
  "RegularHours": 1.267,
  "OverTimeHours": 0.0,
  "DoubleTimeHours": 0.0,
  "PTOHours": null,
  "Duration": "01:16:00",
  "PTOEarningCodeId": null,
  "PTOEarningCodeAbbr": "",
  "BreakMinutes": 0,
  "BreakApprovalStatusId": null,
  "BreakApprovalStatusName": null,
  "PunchOutLongitude": null,
  "PunchInLongitude": null,
  "PunchOutLatitude": null,
  "PunchInLatitude": null,
  "PunchInNotes": "",
  "PunchOutNotes": ""
}]

Basically, this is what my output looks like:

{'Employee': u'080097', 'PunchInDateTime': u'2019-06-18T08:43:00', 'PunchOutDateTime': u'2019-06-18T13:43:00', 'EmployeeName': u'Peter Quill'}
{'Employee': u'080097', 'PunchInDateTime': u'2019-06-19T08:00:00', 'PunchOutDateTime': u'2019-06-19T09:16:00', 'EmployeeName': u'Peter Quill'}

What I want is only the punches that contain today's date:

{'Employee': u'080097', 'PunchInDateTime': u'2019-06-19T08:00:00', 'PunchOutDateTime': u'2019-06-19T09:16:00', 'EmployeeName': u'Peter Quill'}

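I just don't know how to get there. I can't simply delete keys or values, because there are duplicates and I do need the EmployeeId and FullName values. I also want to be able to do this for all employees on a given day, so I can't hard-code specific values either.

Edit: here is some code from later in the script that loops over the dictionary and helps prepare the SQL statements to be inserted into the database. For now I'm just printing, so I can verify that it works. However, when I run the new date_dict through it, I get this error: TypeError: string indices must be integers, not str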

timepunches_dict = date_dict

for i, punch in enumerate(timepunches_dict):
    punch_in = punch['PunchInDateTime']
    punch_out = punch['PunchOutDateTime']
    punch_in_sql = punch_in.replace('T', ' ')
    punch_out_sql = punch_out.replace('T', ' ')

    emp_id = punch['Employee']['EmployeeId']
    emp_name = punch['Employee']['FullName']

    if today in punch_in_sql:
        if i == 0:
            # ONLY RUN FOR FIRST ITERATION
            print(emp_id, today, emp_name)

        # RUN FOR ALL ITERATIONS
        print(emp_id, today, i+1, punch_in_sql, punch_out_sql)
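
For reference, the TypeError can be reproduced in isolation. In the loop above, date_dict is a single dict, and iterating over a dict (directly or through enumerate) yields its keys, which are strings, so punch['PunchInDateTime'] ends up indexing a string with a string. The sketch below assumes a date_dict shaped like the sample output; the names keys_seen and error_raised are illustrative only and are not part of the original script.

```python
# A single flattened punch, shaped like the sample output above.
date_dict = {
    'Employee': '080097',
    'EmployeeName': 'Peter Quill',
    'PunchInDateTime': '2019-06-19T08:00:00',
    'PunchOutDateTime': '2019-06-19T09:16:00',
}

keys_seen = []        # hypothetical helper: records what the loop iterates over
error_raised = False  # hypothetical helper: records whether indexing failed

for i, punch in enumerate(date_dict):
    # punch is a key such as 'Employee', not a punch record
    keys_seen.append(punch)
    try:
        punch['PunchInDateTime']
    except TypeError:
        # Indexing a string with a string is what triggers the error
        error_raised = True

print(sorted(keys_seen))
print(error_raised)
```

The same flattened dict also has no nested 'Employee' dict any more, so punch['Employee']['EmployeeId'] would fail even for a real punch record; the ID is stored directly under 'Employee'.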

1 Answer:

Answer 0 (score: 1)

If you don't care about the time of day, you can split 'PunchInDateTime' on 'T' and use the date part in an if condition:

for item in parsed:
    date_dict = {}
    # Keep only the date part (everything before the 'T') for the comparison
    date_dict['PunchInDateTime'] = item.get('PunchInDateTime').split('T', 1)[0]

    if date_dict['PunchInDateTime'] == today:
        date_dict['Employee'] = item.get('Employee').get('EmployeeId')
        date_dict['EmployeeName'] = item.get('Employee').get('FullName')
        date_dict['PunchInDateTime'] = item.get('PunchInDateTime')
        date_dict['PunchOutDateTime'] = item.get('PunchOutDateTime')
        print date_dict

Output:

{'Employee': u'080097', 'PunchInDateTime': u'2019-06-19T08:00:00', 'PunchOutDateTime': u'2019-06-19T09:16:00', 'EmployeeName': u'Peter Quill'}
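
To feed the SQL-preparation loop from the question's edit, one option is to collect the matching punches into a list of flat dicts rather than overwriting a single date_dict on each pass. This is only a sketch under assumptions: the helper name punches_for_day, the abridged sample list (cut down from the question's JSON), and the fixed date are all illustrative, and the real API data is assumed to carry the same keys. It is written to run on both Python 2.7 and Python 3.

```python
import datetime
import json

# Abridged sample of the API output from the question
# (assumption: the real data has the same keys).
sample = json.loads('''
[{"Employee": {"EmployeeId": "080097", "FullName": "Peter Quill"},
  "PunchInDateTime": "2019-06-18T08:43:00",
  "PunchOutDateTime": "2019-06-18T13:43:00"},
 {"Employee": {"EmployeeId": "080097", "FullName": "Peter Quill"},
  "PunchInDateTime": "2019-06-19T08:00:00",
  "PunchOutDateTime": "2019-06-19T09:16:00"}]
''')


def punches_for_day(parsed, day):
    """Return flat dicts for punches whose punch-in date equals day (YYYY-MM-DD)."""
    result = []
    for item in parsed:
        punch_in = item.get('PunchInDateTime') or ''
        # Compare only the date part, as in the answer above
        if punch_in.split('T', 1)[0] != day:
            continue
        employee = item.get('Employee') or {}
        result.append({
            'Employee': employee.get('EmployeeId'),
            'EmployeeName': employee.get('FullName'),
            'PunchInDateTime': punch_in,
            'PunchOutDateTime': item.get('PunchOutDateTime'),
        })
    return result


# In the real script this would be datetime.date.today().strftime("%Y-%m-%d").
day = datetime.date(2019, 6, 19).strftime("%Y-%m-%d")
todays_punches = punches_for_day(sample, day)

for i, punch in enumerate(todays_punches):
    # Each punch is a flat dict, so the keys are looked up directly.
    punch_in_sql = punch['PunchInDateTime'].replace('T', ' ')
    punch_out_sql = (punch['PunchOutDateTime'] or '').replace('T', ' ')
    print('%s %s %d %s %s' % (punch['Employee'], day, i + 1,
                              punch_in_sql, punch_out_sql))
```

Because the loop now iterates over a list of dicts, each punch is a record rather than a key name. The empty-string fallback for a missing punch-out time is a guess at how an open punch might look, not something the question's data confirms.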