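I am accessing an API that lets me pull employees' punch-in and punch-out times for a given pay period. The JSON output contains a lot of data I don't need. All I really need is the employee ID number, the employee's full name, and that day's punches. However, the smallest data set I can pull is a whole week, so if I want today's data I have to filter out the other days of the week.
My approach so far has been to extract the data I need into a dictionary, date_dict, based on specific key names. The problem is that the dictionary gets filled with the key data for every day, and I only want the punches that contain today's date. Below are the code I use and a sample of the output it generates, followed by the raw JSON data returned by the API.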
import datetime

today = datetime.date.today().strftime("%Y-%m-%d")
parsed = "json data is here"  # placeholder for the list decoded from the API's JSON response
for item in parsed:
    date_dict = {}
    date_dict['Employee'] = item.get('Employee').get('EmployeeId')
    date_dict['EmployeeName'] = item.get('Employee').get('FullName')
    date_dict['PunchInDateTime'] = item.get('PunchInDateTime')
    date_dict['PunchOutDateTime'] = item.get('PunchOutDateTime')
    print(date_dict)
{'Employee': u'080097', 'PunchInDateTime': u'2019-06-18T08:43:00', 'PunchOutDateTime': u'2019-06-18T13:43:00', 'EmployeeName': u'Peter Quill'}
{'Employee': u'080097', 'PunchInDateTime': u'2019-06-19T08:00:00', 'PunchOutDateTime': u'2019-06-19T09:16:00', 'EmployeeName': u'Peter Quill'}
[{
"Id": 12970292,
"Employee": {
"Id": 346968,
"Username": "starlord",
"FirstName": "Peter",
"LastName": "Quill",
"Email": "starlord@email.com",
"EmployeeId": "080097",
"IsActive": true,
"FullName": "Peter Quill",
"ProfileMiniImageUrl": "https://buddypunchapp.blob.core.windows.net/profileminipics/new_employee_face2.jpg"
},
"LocationId": null,
"LocationName": "",
"JobCodeId": null,
"JobCodeName": "",
"PunchInDateTime": "2019-06-18T08:43:00",
"PunchOutDateTime": "2019-06-18T13:43:00",
"PunchInApprovalStatusId": 4,
"PunchInApprovalStatusName": "Changed By Manager",
"PunchOutApprovalStatusId": 4,
"PunchOutApprovalStatusName": "Changed By Manager",
"PunchInIpAddress": "50.194.130.13",
"PunchOutIpAddress": "50.194.130.13",
"PunchInImageUrl": "",
"PunchOutImageUrl": "",
"Hours": 5.0,
"RegularHours": 5.0,
"OverTimeHours": 0.0,
"DoubleTimeHours": 0.0,
"PTOHours": null,
"Duration": "05:00:00",
"PTOEarningCodeId": null,
"PTOEarningCodeAbbr": "",
"BreakMinutes": 0,
"BreakApprovalStatusId": null,
"BreakApprovalStatusName": null,
"PunchOutLongitude": null,
"PunchInLongitude": null,
"PunchOutLatitude": null,
"PunchInLatitude": null,
"PunchInNotes": "",
"PunchOutNotes": ""
}, {
"Id": 12983841,
"Employee": {
"Id": 346968,
"Username": "starlord",
"FirstName": "Peter",
"LastName": "Quill",
"Email": "starlord@email.com",
"EmployeeId": "080097",
"IsActive": true,
"FullName": "Peter Quill",
"ProfileMiniImageUrl": "https://buddypunchapp.blob.core.windows.net/profileminipics/new_employee_face2.jpg"
},
"LocationId": null,
"LocationName": "",
"JobCodeId": null,
"JobCodeName": "",
"PunchInDateTime": "2019-06-19T08:00:00",
"PunchOutDateTime": "2019-06-19T09:16:00",
"PunchInApprovalStatusId": 4,
"PunchInApprovalStatusName": "Changed By Manager",
"PunchOutApprovalStatusId": 4,
"PunchOutApprovalStatusName": "Changed By Manager",
"PunchInIpAddress": "50.194.130.13",
"PunchOutIpAddress": "50.194.130.13",
"PunchInImageUrl": "",
"PunchOutImageUrl": "",
"Hours": 1.267,
"RegularHours": 1.267,
"OverTimeHours": 0.0,
"DoubleTimeHours": 0.0,
"PTOHours": null,
"Duration": "01:16:00",
"PTOEarningCodeId": null,
"PTOEarningCodeAbbr": "",
"BreakMinutes": 0,
"BreakApprovalStatusId": null,
"BreakApprovalStatusName": null,
"PunchOutLongitude": null,
"PunchInLongitude": null,
"PunchOutLatitude": null,
"PunchInLatitude": null,
"PunchInNotes": "",
"PunchOutNotes": ""
}]
Basically, this is what my output looks like:
{'Employee': u'080097', 'PunchInDateTime': u'2019-06-18T08:43:00', 'PunchOutDateTime': u'2019-06-18T13:43:00', 'EmployeeName': u'Peter Quill'}
{'Employee': u'080097', 'PunchInDateTime': u'2019-06-19T08:00:00', 'PunchOutDateTime': u'2019-06-19T09:16:00', 'EmployeeName': u'Peter Quill'}
What I want to get is only the punches that contain today's date:
{'Employee': u'080097', 'PunchInDateTime': u'2019-06-19T08:00:00', 'PunchOutDateTime': u'2019-06-19T09:16:00', 'EmployeeName': u'Peter Quill'}
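I just can't figure out how to get there. I can't simply delete keys or values, because there are duplicates and I do need the EmployeeId and FullName values. I also want to be able to do this for all employees on a given day, so I can't hard-code specific values.
Edit: here is some code from later in the script that loops over the dictionary and helps prepare the SQL statements to insert into the database. For now I am just printing so I can verify that it works. However, when I run the new date_dict through it, I get the error: TypeError: string indices must be integers, not str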
timepunches_dict = date_dict
for i, punch in enumerate(timepunches_dict):
    punch_in = punch['PunchInDateTime']
    punch_out = punch['PunchOutDateTime']
    punch_in_sql = punch_in.replace('T', ' ')
    punch_out_sql = punch_out.replace('T', ' ')
    emp_id = punch['Employee']['EmployeeId']
    emp_name = punch['Employee']['FullName']
    if today in punch_in_sql:
        if i == 0:
            # ONLY RUN FOR FIRST ITERATION
            print(emp_id, today, emp_name)
        # RUN FOR ALL ITERATIONS
        print(emp_id, today, i+1, punch_in_sql, punch_out_sql)
Answer 0 (score: 1)
If you don't care about the time, you can split 'PunchInDateTime' on the 'T' and use the date part in an if condition:
for item in parsed:
    date_dict = {}
    # Keep only the date part (the text before the 'T') for the comparison.
    date_dict['PunchInDateTime'] = item.get('PunchInDateTime').split('T', 1)[0]
    if date_dict['PunchInDateTime'] == today:
        date_dict['Employee'] = item.get('Employee').get('EmployeeId')
        date_dict['EmployeeName'] = item.get('Employee').get('FullName')
        date_dict['PunchInDateTime'] = item.get('PunchInDateTime')
        date_dict['PunchOutDateTime'] = item.get('PunchOutDateTime')
        print(date_dict)
Output:
{'Employee': u'080097', 'PunchInDateTime': u'2019-06-19T08:00:00', 'PunchOutDateTime': u'2019-06-19T09:16:00', 'EmployeeName': u'Peter Quill'}
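As for the TypeError in the edit: date_dict is a single dictionary, so enumerate() over it yields its keys (plain strings), and punch['PunchInDateTime'] then tries to index a string with a string. One possible way around this, sketched below, is to collect the filtered punches in a list and loop over that list instead. The helper name punches_for_day and the trimmed-down sample records are my own assumptions for illustration and are not part of the API; the date is fixed here only so the example is reproducible.

```python
import datetime


def punches_for_day(parsed, day):
    """Return one flat dict per punch whose punch-in date equals day (YYYY-MM-DD)."""
    result = []
    for item in parsed:
        punch_in = item.get('PunchInDateTime') or ''
        # Timestamps look like '2019-06-19T08:00:00'; the text before 'T' is the date.
        if punch_in.split('T', 1)[0] != day:
            continue
        employee = item.get('Employee') or {}
        result.append({
            'Employee': employee.get('EmployeeId'),
            'EmployeeName': employee.get('FullName'),
            'PunchInDateTime': punch_in,
            'PunchOutDateTime': item.get('PunchOutDateTime'),
        })
    return result


# Two trimmed-down records shaped like the JSON in the question (assumed sample data).
sample = [
    {'Employee': {'EmployeeId': '080097', 'FullName': 'Peter Quill'},
     'PunchInDateTime': '2019-06-18T08:43:00',
     'PunchOutDateTime': '2019-06-18T13:43:00'},
    {'Employee': {'EmployeeId': '080097', 'FullName': 'Peter Quill'},
     'PunchInDateTime': '2019-06-19T08:00:00',
     'PunchOutDateTime': '2019-06-19T09:16:00'},
]

# Fixed date for the example; the real script would use datetime.date.today().
today = datetime.date(2019, 6, 19).strftime("%Y-%m-%d")
timepunches = punches_for_day(sample, today)

for i, punch in enumerate(timepunches):
    # Each punch is already flat, so the ID is punch['Employee'], not punch['Employee']['EmployeeId'].
    punch_in_sql = punch['PunchInDateTime'].replace('T', ' ')
    punch_out_sql = (punch['PunchOutDateTime'] or '').replace('T', ' ')
    print('%s %s %d %s %s' % (punch['Employee'], today, i + 1, punch_in_sql, punch_out_sql))
```

Because the list holds one dictionary per punch, the same loop should work for any number of employees without hard-coding values. The `or ''` guards are there on the assumption that an open punch might come back with a null PunchOutDateTime.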