我的问题是如何让python读取文本为16位字符的文件。帖子的其余部分描述了这种情况。
我有一个文本文件,它是从iTunes导出的播放列表。 这是一个包含标题
的简短部分Name Artist Composer Album Grouping Work Movement Number Movement Count Movement Name Genre Size Time Disc Number Disc Count Track Number Track Count Year Date Modified Date Added Bit Rate Sample Rate Volume Adjustment Kind Equalizer Comments Plays Last Played Skips Last Skipped My Rating
Keyboard Works of the Masters Randolph Hokanson Pan125b 2054816 64 03/11/2017, 18:00 03/11/2017, 17:01 256 44100 MPEG audio file 1 03/11/2017, 17:02 4 08/03/2018, 16:07
08 Traccia 08 11159905 464 03/11/2017, 17:39 03/11/2017, 16:59 192 48000 MPEG audio file 1 03/11/2017, 16:59
09 Traccia 09 17787361 741 03/11/2017, 17:39 03/11/2017, 16:58 192 48000 MPEG audio file 5 08/03/2018, 10:58
10 Traccia 10 10128290 421 03/11/2017, 17:39 03/11/2017, 16:58 192 48000 MPEG audio file 1 03/11/2017, 16:58
当我使用此代码读取它时,程序挂起。 (我保存文件中的行数)。后面的十六进制转储似乎显示iTunes的导出是16位字符。
阅读文本文件的完整代码是
file_name="full path to file goes here"
f = open(file_name, "r")
i=227
for x in range(0, i):
line = f.readline()
当我将代码读入文本管理器时,选择所有文本,并将其粘贴到新文档中。代码工作正常。
原始文件的一部分文本转储如下所示,以
之后的新文件开头00000000: FF FE 4E 00 61 00 6D 00 65 00 09 00 41 00 72 00 ..N.a.m.e...A.r.
00000010: 74 00 69 00 73 00 74 00 09 00 43 00 6F 00 6D 00 t.i.s.t...C.o.m.
00000020: 70 00 6F 00 73 00 65 00 72 00 09 00 41 00 6C 00 p.o.s.e.r...A.l.
00000030: 62 00 75 00 6D 00 09 00 47 00 72 00 6F 00 75 00 b.u.m...G.r.o.u.
00000040: 70 00 69 00 6E 00 67 00 09 00 57 00 6F 00 72 00 p.i.n.g...W.o.r.
00000050: 6B 00 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00 k...M.o.v.e.m.e.
00000060: 6E 00 74 00 20 00 4E 00 75 00 6D 00 62 00 65 00 n.t. .N.u.m.b.e.
00000070: 72 00 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00 r...M.o.v.e.m.e.
00000080: 6E 00 74 00 20 00 43 00 6F 00 75 00 6E 00 74 00 n.t. .C.o.u.n.t.
00000090: 09 00 4D 00 6F 00 76 00 65 00 6D 00 65 00 6E 00 ..M.o.v.e.m.e.n.
000000A0: 74 00 20 00 4E 00 61 00 6D 00 65 00 09 00 47 00 t. .N.a.m.e...G.
000000B0: 65 00 6E 00 72 00 65 00 09 00 53 00 69 00 7A 00 e.n.r.e...S.i.z.
000000C0: 65 00 09 00 54 00 69 00 6D 00 65 00 09 00 44 00 e...T.i.m.e...D.
000000D0: 69 00 73 00 63 00 20 00 4E 00 75 00 6D 00 62 00 i.s.c. .N.u.m.b.
000000E0: 65 00 72 00 09 00 44 00 69 00 73 00 63 00 20 00 e.r...D.i.s.c. .
000000F0: 43 00 6F 00 75 00 6E 00 74 00 09 00 54 00 72 00 C.o.u.n.t...T.r.
新文件
0000: 4E 61 6D 65 09 41 72 74 69 73 74 09 43 6F 6D 70 Name.Artist.Comp
0010: 6F 73 65 72 09 41 6C 62 75 6D 09 47 72 6F 75 70 oser.Album.Group
0020: 69 6E 67 09 57 6F 72 6B 09 4D 6F 76 65 6D 65 6E ing.Work.Movemen
0030: 74 20 4E 75 6D 62 65 72 09 4D 6F 76 65 6D 65 6E t Number.Movemen
0040: 74 20 43 6F 75 6E 74 09 4D 6F 76 65 6D 65 6E 74 t Count.Movement
0050: 20 4E 61 6D 65 09 47 65 6E 72 65 09 53 69 7A 65 Name.Genre.Size
答案 0 :(得分:2)
您的文件开头看起来像UTF-16 - 请参阅Byte order marks - Wikipedia
使用
file_name="full path to file goes here"
with io.open(file_name,'r', encoding='utf-16-le') as f:
for line in f:
# do something with line
打开时。
逐行阅读时无需使用range()或readlines()。如果你真的需要使用行数:
for lineNr,line in enumerate(f):