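I have an Excel file in which every row is an event. From this file I want to create a dictionary for each row, for example:
{ 'summary' : '223051-0011 Advanced Macroeconomics I',
  'location' : 'C5-d'
}
and so on.
The problem is that I don't know how to iterate over each row and create a separate dictionary for each one. This is how I have tried to solve it so far: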
import pandas as pd

excel_file = pd.read_excel(
    'http://administracja.sgh.waw.pl/en/dsm/schedules/session/Documents/SMMB%2020172%20-%20changes%2017.05.18.xls',
    encoding='utf-8', header=1)
n = 0
while n < len(excel_file.index) - 1:
    events = excel_file.iloc[n]
    n += 1
    event_summary = {}
    event_start = {}
    event_end = {}
    location = {}
    event = {}
    for row in events:
        event['summary'] = events.iloc[0]
        event_start['dateTime'] = events[6].replace(';', ' ') + events[3]
        print(event_start)
        event_end['dateTime'] = events[6].replace(';', ' ') + events[4]
Answer 0 (score: 0)
Quick approach:
import pandas as pd

excel_file = pd.read_excel('http://administracja.sgh.waw.pl/en/dsm/schedules/session/Documents/SMMB%2020172%20-%20changes%2017.05.18.xls',
                           encoding='utf-8')
# note header=1 was removed as this infers the incorrect header
index_dict = dict()
# map each row index to a small dictionary holding only the fields of interest
for index, row in excel_file.iterrows():
    index_dict[index] = {'subject': row['Subject'],
                         'place': row['Place']}
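This gives you a master dictionary that maps each row index to a summary dictionary holding the information you are interested in. For example, index_dict[0]
gives the summary dictionary for the first row, and so on. You can do any additional preprocessing inside the for loop above.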
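As a rough illustration of preprocessing inside that loop, the sketch below builds the event-style dictionaries from the question. It runs on a small in-memory DataFrame copied from the sample output in this answer rather than on the real file, and the nested 'start' / 'end' layout is an assumption about the shape the asker wants, not something the question confirms.

```python
import pandas as pd

# Two sample rows copied from the output shown in this answer; the real data
# would come from pd.read_excel(...) instead.
sample = pd.DataFrame({
    'Subject': ['223051-0011 Advanced Macroeconomics I',
                '223471-0131 Event History Analysis With SAS'],
    'Place': ['A-210', 'C-5d comp.'],
    'Date': ['11-06-18;', '18-06-18;'],
    'Start': ['13:30', '13:30'],
    'End': ['15:10', '15:10'],
})

events = {}
for index, row in sample.iterrows():
    # Preprocessing step: drop the trailing ';' from the date string.
    date = row['Date'].replace(';', '').strip()
    events[index] = {
        'summary': row['Subject'],
        'location': row['Place'],
        # Assumed layout: one nested dict each for the start and the end.
        'start': {'dateTime': date + ' ' + row['Start']},
        'end': {'dateTime': date + ' ' + row['End']},
    }

print(events[0])
```

Each row gets its own dictionary because a new one is created on every pass through the loop, instead of reusing dictionaries defined once and overwritten.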
To do this more efficiently for the whole DataFrame in one go:
index_dict = excel_file.to_dict(orient='index')
This gives a master dictionary with entries like:
{0: {'Date': '11-06-18;',
'Day': 'Monday',
'End': '15:10',
'Place': 'A-210',
'Start': '13:30',
'Subject': '223051-0011 Advanced Macroeconomics I',
'Teacher': 'Adamowicz Elżbieta - 0011'},
1: {'Date': '18-06-18;',
'Day': 'Monday',
'End': '15:10',
'Place': 'C-5d comp.',
'Start': '13:30',
'Subject': '223471-0131 Event History Analysis With SAS',
'Teacher': 'Frątczak Ewa-0131'}, ...
}
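If the goal is the 'summary' / 'location' shape from the question, the records above can be reshaped with plain Python. The following is only a sketch: to_event is a hypothetical helper, the nested 'start' / 'end' keys are an assumed layout, and the dates are assumed to be day-month-year with 24-hour times.

```python
from datetime import datetime

# The two entries shown above, as produced by to_dict(orient='index').
index_dict = {
    0: {'Date': '11-06-18;', 'Day': 'Monday', 'End': '15:10',
        'Place': 'A-210', 'Start': '13:30',
        'Subject': '223051-0011 Advanced Macroeconomics I',
        'Teacher': 'Adamowicz Elżbieta - 0011'},
    1: {'Date': '18-06-18;', 'Day': 'Monday', 'End': '15:10',
        'Place': 'C-5d comp.', 'Start': '13:30',
        'Subject': '223471-0131 Event History Analysis With SAS',
        'Teacher': 'Frątczak Ewa-0131'},
}


def to_event(record):
    """Build one event dict from one row record (hypothetical helper)."""
    date = record['Date'].rstrip(';')

    def stamp(time_text):
        # Assumes day-month-year dates and 24-hour times.
        parsed = datetime.strptime(date + ' ' + time_text, '%d-%m-%y %H:%M')
        return parsed.isoformat()

    return {
        'summary': record['Subject'],
        'location': record['Place'],
        'start': {'dateTime': stamp(record['Start'])},
        'end': {'dateTime': stamp(record['End'])},
    }


# One separate event dictionary per row.
events = [to_event(record) for record in index_dict.values()]
print(events[0])
```

Parsing into datetime objects first, rather than concatenating strings, makes a malformed date or time fail loudly with a ValueError instead of slipping through.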