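I want to find the subway station nearest to each Craigslist apartment listing, along with the distance from the listing in miles. I then want to export the result to a .csv file for further analysis.
I wrote the following in Python:
Code: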
import csv
from geopy.distance import vincenty
from operator import itemgetter

with open('coord.csv') as csvfile:
    #skip first line in csv
    next(csvfile)
    #read csv
    readCSV = csv.reader(csvfile, delimiter=',')
    #store results in a dictionary
    subwayCoords = {}
    #loop through each row in csv
    for row in readCSV:
        subway = row[1]
        s_coord = row[0],row[3]
        subwayCoords[subway] = s_coord

with open('items.csv') as csvfile:
    next(csvfile)
    readCSV = csv.reader(csvfile, delimiter=',')
    craigCoords = {}
    for row in readCSV:
        craigID = row[1]
        c_coord = row[11]
        craigCoords[craigID] = c_coord

craigDist = {} #dictionary: distance between each listing and subway
craigMin = {} #dictionary: nearest subway to each listing
#get each listing's coordinates (key=listing, value=coordinates)
for craigID, c_coord in craigCoords.items():
    #get each subway's coordinates (key=subway, value=coordinates)
    for subway, s_coord in subwayCoords.items():
        #calculate distance between each listing and subway
        dist = vincenty(s_coord, c_coord).miles
        print "distance between " + ''.join(str(craigID)) + " and " + ''.join(str(subway)) + " = " + str(dist)
        craigDist[subway] = dist
    #for each listing, calculate closest subway; returns subway, distance as a tuple
    minPair = min(craigDist.iteritems(), key=itemgetter(1))
    craigMin[craigID] = minPair
print craigMin

#export craigMin dictionary
with open('mycsvfile.csv','wb') as csvfile:
    w = csv.writer(csvfile)
    w.writerows(craigMin.items())
I now have a dictionary of key-value pairs that looks like this:
{listing: (nearest subway station, distance), ...}
Output when run:
{
'6022151897': ('Kew Gardens\xe2\x80\x93Union Turnpike (IND Queens Boulevard Line)', 1.1243919326522678),
'6022258759': ('Forest Hills\xe2\x80\x9371st Avenue (IND Queens Boulevard Line)', 0.20148597888760844),
'6022892363': ('Vernon Boulevard\xe2\x80\x93Jackson Avenue (IRT Flushing Line)', 0.37261054608700767)
}
.csv output:
6022151897,"('Kew Gardens\xe2\x80\x93Union Turnpike (IND Queens Boulevard Line)', 1.1243919326522678)"
6022258759,"('Forest Hills\xe2\x80\x9371st Avenue (IND Queens Boulevard Line)', 0.20148597888760844)"
6022892363,"('Vernon Boulevard\xe2\x80\x93Jackson Avenue (IRT Flushing Line)', 0.37261054608700767)"
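Note that each value is a tuple holding two items (station and distance), so both end up in a single quoted column of the .csv.
How can I split the value into two separate values so that I can export them to .csv? Any other tips for making the script more efficient would also be appreciated.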
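On the request for tips: a minimal Python 3 sketch of the nearest-station step that drops the shared craigDist dictionary and the per-pair print, and calls min() once per listing. It is tidier rather than asymptotically faster, since every listing is still compared with every station. The names haversine_miles and nearest_station and the sample coordinates are made up for illustration. The haversine formula is a stdlib stand-in for the geopy call in the question, and coordinates are assumed to be already parsed into (lat, lon) float pairs.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation

def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))

def nearest_station(listing_coord, stations, distance=haversine_miles):
    """Return (station_name, miles) for the station closest to listing_coord."""
    return min(
        ((name, distance(coord, listing_coord)) for name, coord in stations.items()),
        key=lambda pair: pair[1],
    )

# Made-up coordinates, for illustration only
stations = {'A': (40.70, -73.80), 'B': (40.75, -73.90)}
listings = {'1': (40.71, -73.81), '2': (40.74, -73.91)}

# One (station, miles) tuple per listing, same shape as craigMin in the question
nearest = {lid: nearest_station(coord, stations) for lid, coord in listings.items()}
print(nearest)
```

Any other function that takes two coordinate pairs and returns miles could be passed as the distance argument instead.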
Answer 0: (score: 0)
Try this; it should flatten the value and write your csv:
with open('mycsvfile.csv','wb') as csvfile:
    w = csv.writer(csvfile)
    for key, value in craigMin.items():
        #writerow (singular) writes one flat row: listing, station, distance
        w.writerow([key, value[0], value[1]])
The idea is that you have to split the items in the dictionary apart before writing them.
Output of this approach:
6022151897, Kew Gardens–Union Turnpike (IND Queens Boulevard Line), 1.124391933
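For anyone on Python 3, here is a hedged sketch of the same flattening idea. The helper name write_flat and the header names are assumptions added for illustration, and the sample dictionary reuses two entries from the question. In Python 3 the file is opened in text mode, so the en dash that Python 2 displayed as the UTF-8 bytes \xe2\x80\x93 is written as a normal character.

```python
import csv
import io

# Sample data in the same shape as craigMin: {listing_id: (station, miles)}
craig_min = {
    '6022151897': ('Kew Gardens\u2013Union Turnpike (IND Queens Boulevard Line)',
                   1.1243919326522678),
    '6022258759': ('Forest Hills\u201371st Avenue (IND Queens Boulevard Line)',
                   0.20148597888760844),
}

def write_flat(rows, out):
    """Write one CSV row per listing: id, station, distance."""
    w = csv.writer(out)
    # Header names are assumptions, not part of the original script
    w.writerow(['listing_id', 'nearest_station', 'distance_miles'])
    for listing_id, (station, miles) in rows.items():
        w.writerow([listing_id, station, miles])

# Write to an in-memory buffer so the result can be inspected
buf = io.StringIO()
write_flat(craig_min, buf)
flat_csv = buf.getvalue()
print(flat_csv)

# To write a real file in Python 3, open it in text mode:
# with open('mycsvfile.csv', 'w', newline='', encoding='utf-8') as f:
#     write_flat(craig_min, f)
```

Unpacking the tuple in the for statement does the same job as value[0] and value[1] in the snippet above.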
After reading through your comment, this is what you said:
I tried it, and the output of my .csv looks the same:
6022151897,"('Kew Gardens\xe2\x80\x93Union Turnpike (IND Queens Boulevard Line)', 1.1243919326522678)"
6022258759,"('Forest Hills\xe2\x80\x9371st Avenue (IND Queens Boulevard Line)', 0.20148597888760844)"
6022892363,"('Vernon Boulevard\xe2\x80\x93Jackson Avenue (IRT Flushing Line)', 0.37261054608700767)"
What I'm looking for is clean text in value[0] and value[1]. For example, value[0] = Kew Gardens\xe2\x80\x93Union Turnpike (IND Queens Boulevard Line), with no extra (), " or '. Likewise, value[1] = 1.1243919326522678
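What I'm trying to get across is that the approach I gave you is quite different from the code in your question: it has a whole loop that unpacks the dictionary instead of pushing the .items() tuples straight into the csv.
I ran your code and got your results. If that is the output you are getting, then as far as I can tell you can do what I suggested.
Can you show exactly what you ran when you say "I tried this..."?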
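To make the difference between the two approaches concrete, here is a small Python 3 sketch that writes the same one-entry dictionary both ways into in-memory buffers and reads the rows back. The station name and distance are shortened, made-up values, and the variable names are chosen only for this illustration.

```python
import csv
import io

craig_min = {'6022151897': ('Kew Gardens (IND Queens Boulevard Line)', 1.12)}

# Approach from the question: each item is (key, (station, miles)),
# so the csv writer sees two fields and stringifies the inner tuple.
nested = io.StringIO()
csv.writer(nested).writerows(craig_min.items())

# Approach from the answer: unpack so each row has three scalar fields.
flat = io.StringIO()
csv.writer(flat).writerows(
    [key, station, miles] for key, (station, miles) in craig_min.items()
)

# Read both results back to compare the number of columns per row
nested_rows = list(csv.reader(io.StringIO(nested.getvalue())))
flat_rows = list(csv.reader(io.StringIO(flat.getvalue())))
print(nested_rows)
print(flat_rows)
```

The first version yields two columns, the second of which is the text form of the tuple, parentheses and quotes included. The second version yields three plain columns. If the .csv still shows the parentheses, the file was most likely written by the first form.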