Python program will not read a text file with 16-bit characters

Date: 2018-04-10 16:59:09

Tags: python-3.x file

My question is how to get Python to read a file whose text is stored as 16-bit characters. The rest of this post describes the situation.

I have a text file that is a playlist exported from iTunes. Here is a short section, including the header row:
Name    Artist  Composer    Album   Grouping    Work    Movement Number Movement Count  Movement Name   Genre   Size    Time    Disc Number Disc Count  Track Number    Track Count Year    Date Modified   Date Added  Bit Rate    Sample Rate Volume Adjustment   Kind    Equalizer   Comments    Plays   Last Played Skips   Last Skipped    My Rating
Keyboard Works of the Masters   Randolph Hokanson       Pan125b                         2054816 64                      03/11/2017, 18:00   03/11/2017, 17:01   256 44100       MPEG audio file         1   03/11/2017, 17:02   4   08/03/2018, 16:07   
08 Traccia 08                                       11159905    464                     03/11/2017, 17:39   03/11/2017, 16:59   192 48000       MPEG audio file                 1   03/11/2017, 16:59   
09 Traccia 09                                       17787361    741                     03/11/2017, 17:39   03/11/2017, 16:58   192 48000       MPEG audio file                 5   08/03/2018, 10:58   
10 Traccia 10                                       10128290    421                     03/11/2017, 17:39   03/11/2017, 16:58   192 48000       MPEG audio file                 1   03/11/2017, 16:58   

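When I read it with the code below, the program hangs. (I have stored the number of lines in the file.) The hex dump further down seems to show that the iTunes export uses 16-bit characters.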

The complete code for reading the text file is:

file_name="full path to file goes here"
f = open(file_name, "r")
i=227
for x in range(0, i):
        line = f.readline()

When I open the file in a text editor, select all of the text, and paste it into a new document, the code works fine on the new file.

A partial hex dump of the original file is shown below, followed by the start of the new file.
00000000: FF FE 4E 00 61 00 6D 00 65 00 09 00 41 00 72 00   ..N.a.m.e...A.r.
00000010: 74 00 69 00 73 00 74 00 09 00 43 00 6F 00 6D 00   t.i.s.t...C.o.m.
00000020: 70 00 6F 00 73 00 65 00 72 00 09 00 41 00 6C 00   p.o.s.e.r...A.l.
00000030: 62 00 75 00 6D 00 09 00 47 00 72 00 6F 00 75 00   b.u.m...G.r.o.u.
00000040: 70 00 69 00 6E 00 67 00 09 00 57 00 6F 00 72 00   p.i.n.g...W.o.r.
00000050: 6B 00 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00   k...M.o.v.e.m.e.
00000060: 6E 00 74 00 20 00 4E 00 75 00 6D 00 62 00 65 00   n.t. .N.u.m.b.e.
00000070: 72 00 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00   r...M.o.v.e.m.e.
00000080: 6E 00 74 00 20 00 43 00 6F 00 75 00 6E 00 74 00   n.t. .C.o.u.n.t.
00000090: 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00 6E 00   ..M.o.v.e.m.e.n.
000000A0: 74 00 20 00 4E 00 61 00 6D 00 65 00 09 00 47 00   t. .N.a.m.e...G.
000000B0: 65 00 6E 00 72 00 65 00 09 00 53 00 69 00 7A 00   e.n.r.e...S.i.z.
000000C0: 65 00 09 00 54 00 69 00 6D 00 65 00 09 00 44 00   e...T.i.m.e...D.
000000D0: 69 00 73 00 63 00 20 00 4E 00 75 00 6D 00 62 00   i.s.c. .N.u.m.b.
000000E0: 65 00 72 00 09 00 44 00 69 00 73 00 63 00 20 00   e.r...D.i.s.c. .
000000F0: 43 00 6F 00 75 00 6E 00 74 00 09 00 54 00 72 00   C.o.u.n.t...T.r.

New file

0000: 4E 61 6D 65 09 41 72 74 69 73 74 09 43 6F 6D 70   Name.Artist.Comp
0010: 6F 73 65 72 09 41 6C 62 75 6D 09 47 72 6F 75 70   oser.Album.Group
0020: 69 6E 67 09 57 6F 72 6B 09 4D 6F 76 65 6D 65 6E   ing.Work.Movemen
0030: 74 20 4E 75 6D 62 65 72 09 4D 6F 76 65 6D 65 6E   t Number.Movemen
0040: 74 20 43 6F 75 6E 74 09 4D 6F 76 65 6D 65 6E 74   t Count.Movement
0050: 20 4E 61 6D 65 09 47 65 6E 72 65 09 53 69 7A 65    Name.Genre.Size

1 Answer:

Answer 0 (score: 2)

The start of your file looks like UTF-16: the first two bytes, FF FE, are the little-endian UTF-16 byte order mark (see Byte order mark, Wikipedia).
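To check a file for this before choosing an encoding, one can look at its first two bytes. The following is only a minimal sketch: `sniff_utf16_bom` and the temporary `playlist.txt` are hypothetical names made up for illustration, and the check does not rule out UTF-32, whose little-endian byte order mark also begins with FF FE.

```python
import codecs
import os
import tempfile


def sniff_utf16_bom(path):
    """Return 'utf-16' if the file starts with a UTF-16 byte order mark, else None."""
    with open(path, "rb") as f:  # binary mode: inspect raw bytes, no decoding
        head = f.read(2)
    if head in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE):
        return "utf-16"  # this codec reads the BOM and picks the byte order itself
    return None


# Demo on a throwaway file that starts with the same bytes as the dump in the question.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "playlist.txt")  # hypothetical file name
    with open(sample, "wb") as f:
        f.write(b"\xff\xfeN\x00a\x00m\x00e\x00")
    detected = sniff_utf16_bom(sample)

print(detected)
```

The returned name can be passed straight to the `encoding` argument of `open()`.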

Use

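import io

file_name = "full path to file goes here"

# 'utf-16' reads the BOM (FF FE) and strips it; 'utf-16-le' would leave U+FEFF at the start of the first line
with io.open(file_name, 'r', encoding='utf-16') as f:
    for line in f:
        pass  # do something with line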

when opening the file.
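One detail worth knowing when picking the codec name: `utf-16-le` fixes the byte order and therefore decodes the byte order mark as an ordinary character (U+FEFF), while plain `utf-16` uses the mark to choose the byte order and then drops it. A small sketch on the first bytes of the dump from the question (the variable names are made up for illustration):

```python
# First bytes of the original file, as shown in the hex dump in the question.
data = b"\xff\xfeN\x00a\x00m\x00e\x00"

with_bom_kept = data.decode("utf-16-le")  # byte order fixed; BOM survives as U+FEFF
bom_consumed = data.decode("utf-16")      # BOM selects the byte order and is removed

print(repr(with_bom_kept))
print(repr(bom_consumed))
```

With `utf-16-le`, the first column header would compare unequal to `"Name"` unless the leading U+FEFF is stripped by hand.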

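When reading line by line there is no need for range() or readline(); iterating over the file object yields one line at a time. If you really need the line number: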

    for lineNr, line in enumerate(f):
        pass  # lineNr starts at 0; do something with lineNr and line
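Putting the pieces together, a self-contained sketch might look like the following. It assumes the export is tab-separated, as the hex dump (byte 09 between column names) suggests. The temporary file merely stands in for the real iTunes export, and `rows` and `line_nr` are illustrative names.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    file_name = os.path.join(tmp, "playlist.txt")  # hypothetical stand-in for the export

    # Write a tiny UTF-16 file (with BOM) so the example can run on its own.
    with open(file_name, "w", encoding="utf-16") as f:
        f.write("Name\tArtist\n08 Traccia 08\t\n")

    rows = []
    with open(file_name, "r", encoding="utf-16") as f:
        # start=1 gives human-style line numbers instead of 0-based ones
        for line_nr, line in enumerate(f, start=1):
            fields = line.rstrip("\n").split("\t")  # one list entry per column
            rows.append((line_nr, fields))

print(rows)
```

Because the loop stops by itself at the end of the file, the line count does not have to be known in advance.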