How do I correctly convert windows-1255 data from an Excel file to UTF-8 in Python?

Time: 2018-10-12 13:25:26

Tags: python excel encoding character-encoding

I am writing a script that converts a schedule exported from ibasketball.co.il into an ICS calendar that can be shared.

Source file: https://github.com/gilzellner/ibasketball_to_ics/blob/master/exported_data_2018-10-12_12-53.xls

The code is here: github repo

The dates and times come out fine, but the Hebrew characters show up like this: u'\u05d1\u05d9\u05ea"\u05e8\u05db\u05e4\u05e8\u05d9\u05d5\u05e0\u05d4\u05e9\u05dc\u05d5\u05dd' at u'\u05de\u05db\u05d1\u05d9\u05d3\u05d4\u05e9\u05e8\u05d5\u05df'
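For context, escapes of this shape look like the Python 2 repr of a unicode string rather than corrupted data. A minimal sketch, using a sample Hebrew word rather than the actual spreadsheet contents, of how one piece of text appears as an escaped repr, as UTF-8 bytes, and as windows-1255 bytes (`cp1255` is Python's codec name for that code page):

```python
# Sample text only (a Hebrew word); not read from the spreadsheet.
text = u'\u05e9\u05dc\u05d5\u05dd'

# The escaped form is just a display of the string, not its content.
escaped = text.encode('unicode_escape').decode('ascii')

# Encoding turns the same text into bytes in a chosen charset.
utf8_bytes = text.encode('utf-8')
cp1255_bytes = text.encode('cp1255')

# Converting windows-1255 bytes to UTF-8: decode first, then encode.
converted = cp1255_bytes.decode('cp1255').encode('utf-8')
```

If the values are already unicode strings, no windows-1255 step should be needed; only a final encode to UTF-8 when writing.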

Here is my code:

import xlrd
from ics import Calendar, Event
from datetime import datetime, timedelta
import dateutil.tz  # import the tz submodule explicitly; a bare "import dateutil" may not load it

workbook = xlrd.open_workbook('/home/gilzellner/Downloads/exported_data_2018-10-12_12-53.xls')
sheet = workbook.sheet_by_index(0)
schedule = []
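for rx in range(sheet.nrows):
    # str() of an xlrd cell gives its repr, e.g. "number:1.0" or "text:u'...'";
    # rows whose first cell is numeric are treated as games.
    if 'number' in str(sheet.row(rx)[0]):
        game = {}
        # Strip the "text:" prefix, the u marker and the quotes from the repr.
        game['date'] = str(sheet.row(rx)[2]).replace('text:', '').replace('u', '').replace('\'', '')
        game['time'] = str(sheet.row(rx)[3]).replace('text:', '').replace('u', '').replace('\'', '')
        # Team and location names are the Hebrew text fields.
        game['home'] = str(sheet.row(rx)[4]).replace('text:', '').encode('utf-8')
        game['away'] = str(sheet.row(rx)[5]).replace('text:', '').encode('utf-8')
        game['location'] = str(sheet.row(rx)[6]).replace('text:', '').encode('utf-8')
        schedule.append(game)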

for game in schedule:
    game['start'] = datetime.strptime(game['date'] + ' ' + game['time'], '%d-%m-%Y %H:%M:%S')\
        .replace(tzinfo=dateutil.tz.tzoffset('IST', 3*60*60))
    game['end'] = game['start'] + timedelta(hours=2)

c = Calendar()
for game in schedule:
    e = Event()
    e.name = game['away'] + ' at ' + game['home']
    e.begin = game['start']
    e.end = game['end']
    e.location = game['location']
    c.events.add(e)

with open('my.ics', 'w') as my_file:
    my_file.writelines(c)
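
One possible direction, sketched under assumptions: xlrd cells carry a `ctype` and a `value` attribute, and reading `value` directly would avoid parsing the `text:u'...'` string form. The `FakeCell` tuple, the `row_to_game` helper, and the sample rows below are hypothetical stand-ins so the sketch runs without the spreadsheet or the library; the two constants mirror `xlrd.XL_CELL_TEXT` and `xlrd.XL_CELL_NUMBER`.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for an xlrd cell; real cells expose the same
# .ctype and .value attributes.
FakeCell = namedtuple('FakeCell', ['ctype', 'value'])
XL_CELL_TEXT = 1    # same value as xlrd.XL_CELL_TEXT
XL_CELL_NUMBER = 2  # same value as xlrd.XL_CELL_NUMBER


def row_to_game(row):
    """Return a game dict for a numbered row, or None for any other row."""
    if row[0].ctype != XL_CELL_NUMBER:
        return None
    # .value is already a unicode string for text cells, so no repr
    # parsing and no manual encoding is needed here.
    return {
        'date': row[2].value,
        'time': row[3].value,
        'home': row[4].value,
        'away': row[5].value,
        'location': row[6].value,
    }


# Made-up sample rows in the column layout the script assumes.
header_row = [FakeCell(XL_CELL_TEXT, u'#')] * 7
data_row = [
    FakeCell(XL_CELL_NUMBER, 1.0),
    FakeCell(XL_CELL_TEXT, u''),
    FakeCell(XL_CELL_TEXT, u'12-10-2018'),
    FakeCell(XL_CELL_TEXT, u'18:00:00'),
    FakeCell(XL_CELL_TEXT, u'\u05de\u05db\u05d1\u05d9'),
    FakeCell(XL_CELL_TEXT, u'\u05e9\u05dc\u05d5\u05dd'),
    FakeCell(XL_CELL_TEXT, u'\u05db\u05e4\u05e8'),
]

game = row_to_game(data_row)
start = datetime.strptime(game['date'] + ' ' + game['time'],
                          '%d-%m-%Y %H:%M:%S')
```

With real data the rows would come from `sheet.row(rx)`. In Python 2 the output file would probably also need an explicit encoding, for example `io.open('my.ics', 'w', encoding='utf-8')`, so that the Hebrew text is written as UTF-8.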

0 answers:

No answers