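I want to use the Google API to get the latitude and longitude for the 'location' column of a CSV file. I can get 'lat' and 'lng' with the Google API module, but I cannot save them back to the original file, inserted after 'location'.
My original file looks like this: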
date time location birdName count birdName count birdName count
1990-02-10 0900:1200 balabala bird1 15 bird2 10 bird3 20
1990-02-28 1300:1500 balabala bird4 40 bird5 10 bird6 25
1990-03-01 0900-1200 balabala bird7 45 bird8 15 bird9 30
... ... ... ... ... ... ... ... ...
I want to insert 'lat' and 'lng' columns after 'location', like this:
date time location lat lng birdName count birdName count birdName count
1990-02-10 0900:1200 balabala xxx xxx bird1 15 bird2 10 bird3 20
1990-02-28 1300:1500 balabala xxx xxx bird4 40 bird5 10 bird6 25
1990-03-01 0900-1200 balabala xxx xxx bird7 45 bird8 15 bird9 30
... ... ... ... ... ... ... ... ... ... ...
Google API module: https://drive.google.com/open?id=0B6SUWnrBmDwSb3BabFdEcXV3LUU&authuser=0
My code:
# -*- coding: utf-8 -*-
import pandas as pd
from geocodequery import GeocodeQuery

def addrs(location):
    for addrs in location:
        addr = addrs
        gq = GeocodeQuery("zh-tw", "tw")
        gq.get_geocode(addr)
        lng = gq.get_lng()
        lat = gq.get_lat()
        df['lat'] = lat
        df['lng'] = lng
        df.to_csv('./birdsIwant.csv')

df = pd.read_csv('./birdsIwant.csv', low_memory=False)
addrs(df['location'])
What should I do?
Answer 0 (score: 0)
You can change the column order using fancy indexing:
In [179]:
# add the columns
df['lat'] = np.random.randn(len(df))
df['lng'] = np.random.randn(len(df))
df
Out[179]:
date time location birdName count birdName.1 count.1 \
0 1990-02-10 0900:1200 balabala bird1 15 bird2 10
1 1990-02-28 1300:1500 balabala bird4 40 bird5 10
2 1990-03-01 0900-1200 balabala bird7 45 bird8 15
birdName.2 count.2 lat lng
0 bird3 20 -0.915371 -1.508814
1 bird6 25 -0.716439 1.008078
2 bird9 30 0.609510 -1.185927
In [185]:
# get a list of the columns
col_list = list(df)
# insert column names at new positions
col_list.insert(3,'lat')
col_list.insert(4,'lng')
# slice off the last 2 values
col_list=col_list[:-2]
print(col_list)
['date', 'time', 'location', 'lat', 'lng', 'birdName', 'count', 'birdName.1', 'count.1', 'birdName.2', 'count.2']
In [187]:
# use loc (ix was removed in pandas 1.0) and pass the new column order to reorder the columns
df = df.loc[:, col_list]
df
Out[187]:
date time location lat lng birdName count \
0 1990-02-10 0900:1200 balabala -0.915371 -1.508814 bird1 15
1 1990-02-28 1300:1500 balabala -0.716439 1.008078 bird4 40
2 1990-03-01 0900-1200 balabala 0.609510 -1.185927 bird7 45
birdName.1 count.1 birdName.2 count.2
0 bird2 10 bird3 20
1 bird5 10 bird6 25
2 bird8 15 bird9 30
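As an alternative to rebuilding the column list, `DataFrame.insert` can place a new column at a given position directly. The following is a minimal sketch, assuming a small made-up frame and placeholder coordinate values rather than real geocoding results:

```python
import pandas as pd

# Small stand-in for the bird CSV; the values are made up for illustration.
df = pd.DataFrame({
    "date": ["1990-02-10", "1990-02-28"],
    "time": ["0900:1200", "1300:1500"],
    "location": ["balabala", "balabala"],
    "birdName": ["bird1", "bird4"],
    "count": [15, 40],
})

# Position immediately after the 'location' column.
pos = df.columns.get_loc("location") + 1

# Placeholder coordinates; in practice these would come from the geocoder.
df.insert(pos, "lat", [0.0, 0.0])
df.insert(pos + 1, "lng", [0.0, 0.0])

print(list(df.columns))
```

This modifies `df` in place, so no column reordering step is needed afterwards.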
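EDIT
Your code writes the CSV on every iteration, and each iteration also overwrites the whole 'lat' and 'lng' columns with a single value, so even when a lookup returns the correct coordinates they are overwritten on the next pass. You should write the CSV outside the function. In any case, the following is clearer and should work: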
def addrs(location):
    gq = GeocodeQuery("zh-tw", "tw")
    gq.get_geocode(location)
    return pd.Series([gq.get_lat(), gq.get_lng()])

df[['lat','lng']] = df['location'].apply(addrs)
df.to_csv('./birdsIwant.csv')
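If many rows share the same location, geocoding each distinct value once avoids repeated API calls. The sketch below is only an illustration: `geocode` and `FAKE_COORDS` are hypothetical stand-ins for the `GeocodeQuery` calls, the coordinates are made up, and the output goes to an in-memory buffer instead of the real file. Passing `index=False` to `to_csv` keeps pandas from writing the row index as an extra column, which matters when the same file is read and rewritten.

```python
import io
import pandas as pd

# Hypothetical stand-in for GeocodeQuery; the coordinates are made up.
FAKE_COORDS = {"placeA": (25.0, 121.5), "placeB": (24.1, 120.7)}

def geocode(location):
    """Return (lat, lng) for a location name (illustrative stub)."""
    return FAKE_COORDS[location]

df = pd.DataFrame({
    "date": ["1990-02-10", "1990-02-28", "1990-03-01"],
    "location": ["placeA", "placeB", "placeA"],
    "birdName": ["bird1", "bird4", "bird7"],
})

# Geocode each distinct location once, then map the results back to every row.
coords = {loc: geocode(loc) for loc in df["location"].unique()}
pos = df.columns.get_loc("location") + 1
df.insert(pos, "lat", df["location"].map(lambda loc: coords[loc][0]))
df.insert(pos + 1, "lng", df["location"].map(lambda loc: coords[loc][1]))

# Write once, after all rows are filled; index=False omits the row index.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

With a real file, the buffer would be replaced by the CSV path, and the stub by a function wrapping the actual geocoding module.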