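I have a .TSV file of movie titles and movie data that I am analyzing with the pydot package. The file is linked here. The file containing the JSON used to create it is linked here.
The file was written from parsed JSON using utf-8 encoding. Although the file is written correctly, when I read it back into Python the interpreter consistently seems to stop at the second line below: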
'Taken\t["Liam Neeson", " Maggie Grace", " Jon Gries", " David Warshofsky"]\n'
'The Walking Dead\t["Andrew Lincoln", " Steven Yeun", " Chandler Riggs",'
The output should look like the following, which is what is written in the file:
Taken ["Liam Neeson", " Maggie Grace", " Jon Gries", " David Warshofsky"]
The Walking Dead ["Andrew Lincoln", " Steven Yeun", " Chandler Riggs", " Norman Reedus"]
Toy Story 3 ["Tom Hanks", " Tim Allen", " Joan Cusack", " Ned Beatty"]
Here is the code used to create the text file:
import codecs
import json

step3v2=open('step3.txt', 'rU')
step4=codecs.open('step4.txt', mode='w', encoding='utf-8')
data=[]
merged=''
# parse one JSON object per input line
for line in step3v2:
    data.append(json.loads(line))
# build one "Title<TAB>actors" row per movie
for row in data:
    moviename=row[u'Title']
    row[u'Actors']=row[u'Actors'].split(',')
    actors=json.dumps(row[u'Actors']) + '\r\n'
    merged+=moviename + '\t'
    merged+=actors
step4.write(merged)
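For comparison, here is a minimal Python 3 sketch of the same write step. It assumes each input line is a JSON object with `Title` and `Actors` keys, as in the code above. The function name `write_tsv` and its parameters are hypothetical, chosen only for illustration. The `with` block closes the output file, so buffered text is flushed to disk before anything reads the file back.

```python
import json


def write_tsv(json_lines, out_path):
    """Write one 'Title<TAB>JSON list of actors' row per input JSON line."""
    # newline='' stops Python from translating the explicit '\r\n' below.
    with open(out_path, 'w', encoding='utf-8', newline='') as out:
        for line in json_lines:
            row = json.loads(line)
            actors = row['Actors'].split(',')
            out.write(row['Title'] + '\t' + json.dumps(actors) + '\r\n')
    # Leaving the with block closes the file and flushes buffered text.
```

It could be called as `write_tsv(open('step3.txt', encoding='utf-8'), 'step4.txt')`, assuming the input file is also utf-8 encoded.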
Here is the code that reads the file:
import pydot

graph=pydot.Dot(graph_type='graph', charset='utf8')
step4v2=open('step4.txt', 'rU')
textfile=step4v2.readlines()
for line in textfile:
    print repr(line)
Answer 0 (score: 1)
step4v2=open('step4.txt', 'rU') #this means universal newlines
should be
step4v2=open('step4.txt', 'rb') #this means read the binary data
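To make the difference between the two modes concrete, here is a small Python 3 sketch. The question's code is Python 2, where `'rU'` plays roughly the role of Python 3's default text mode. The helper name `read_both_ways` is hypothetical, and utf-8 is assumed because that is how the file was written.

```python
def read_both_ways(path):
    """Return the file's lines as (text-mode lines, binary-mode lines)."""
    # Text mode with default newline handling: '\r\n' is translated to '\n'.
    with open(path, 'r', encoding='utf-8') as f:
        text_lines = f.readlines()
    # Binary mode: bytes come back exactly as stored on disk.
    with open(path, 'rb') as f:
        raw_lines = f.readlines()
    return text_lines, raw_lines
```

Comparing the two lists shows whether newline translation is changing what comes back from the file.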
Using the file you linked on Dropbox:
>>> import os
>>> f = open(os.path.expanduser("~\\Downloads\\step4.txt"), "rb")
>>> for line in f: print repr(line)
seems to work fine.
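Once the lines read back intact, each one can be split into a title and an actor list. The following is a minimal Python 3 sketch, assuming every non-blank line has the form title, tab, JSON list, as in the expected output shown in the question. The name `parse_rows` is hypothetical.

```python
import json


def parse_rows(path):
    """Yield (title, actors) pairs from a 'Title<TAB>JSON list' file."""
    with open(path, 'rb') as f:
        for raw in f:
            line = raw.decode('utf-8').rstrip('\r\n')
            if not line:
                continue  # skip blank lines
            title, actors_json = line.split('\t', 1)
            # Names keep the leading space left by split(','), so strip it.
            yield title, [name.strip() for name in json.loads(actors_json)]
```

Each yielded pair could then be used to add nodes and edges to the pydot graph.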