I have the following script:
import pandas as pd
gdf = pd.read_csv('Geolocation_main')
print gdf['Geolocation'][:5]
which outputs:
0 (50.673675, -120.298973)
1 (50.678354, -120.329258)
2 (50.672496, -120.333317)
3 (50.673359, -120.332912)
4 (50.673411, -120.32978)
print type(gdf['Geolocation'][0])
<type 'str'>
I need to swap the geographic coordinates in each cell, e.g. (-120.298973, 50.673675).
To do this, I wrote the following script:
correct = []
for u in gdf['Geolocation']:
    u = u.replace('(', '')
    u = u.replace(')', '')
    a, b = u.split(',')
    correct = b, a
gdf['Geolocation_correct'] = correct
print gdf['Geolocation_correct']
But it gives me an error: ValueError: Length of values does not match length of index. What am I doing wrong here?
Answer 0 (score: 1)
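I think the error you are getting comes from this line:
correct = b, a
It does not add anything to the list. On every iteration it rebinds correct to a two-element tuple, so after the loop
gdf['Geolocation_correct'] = correct
tries to assign 2 values to a column that has one row per record, and the lengths do not match.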
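To make the cause concrete, here is a minimal sketch of the original loop with the list kept intact. The small sample DataFrame stands in for the CSV file and is an assumption for illustration only; the one real change to the loop is calling `correct.append(...)` instead of reassigning `correct`. It uses `print(...)` with a single argument so it runs on both Python 2 and Python 3.

```python
import pandas as pd

# Hypothetical sample data standing in for the 'Geolocation_main' file.
gdf = pd.DataFrame({'Geolocation': ['(50.673675, -120.298973)',
                                    '(50.678354, -120.329258)']})

correct = []
for u in gdf['Geolocation']:
    u = u.replace('(', '').replace(')', '')
    a, b = u.split(',')
    # Append one swapped string per row instead of rebinding `correct`.
    correct.append('(%s, %s)' % (b.strip(), a.strip()))

# The list now has exactly one entry per row, so the assignment succeeds.
gdf['Geolocation_correct'] = correct
print(gdf['Geolocation_correct'])
```

The `strip()` calls drop the space that `split(',')` leaves in front of the second value.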
You can do it like this:
def fix_geo_location(u):
    # Strip the parentheses, then swap the two comma-separated values.
    u = u.replace('(', '')
    u = u.replace(')', '')
    a, b = u.split(',')
    correct = "(%s, %s)" % (b.strip(), a.strip())
    return correct
gdf["Geolocation_correct"] = gdf["Geolocation"].map(fix_geo_location)
Answer 1 (score: 0)
Or you could also do this:
>>> gdf['Geolocation'].map(lambda a: str(tuple(map(float, a.strip('()').split(',')))[::-1]))
0 (-120.298973, 50.673675)
1 (-120.329258, 50.678354)
2 (-120.333317, 50.672496)
3 (-120.332912, 50.673359)
4 (-120.32978, 50.673411)
Name: Geolocation, dtype: object
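A further option, offered only as a sketch, is to let pandas do the parsing with its vectorized string methods rather than a Python function per cell. The sample DataFrame and the regular expression below are assumptions for illustration; the pattern expects every cell to look like `(number, number)`. Unlike the `float` round trip above, this version keeps the values as text, so the digits are not reformatted.

```python
import pandas as pd

# Hypothetical sample data standing in for the 'Geolocation_main' file.
gdf = pd.DataFrame({'Geolocation': ['(50.673675, -120.298973)',
                                    '(50.672496, -120.333317)']})

# Capture the two values as text: column 0 is the first value, column 1 the second.
parts = gdf['Geolocation'].str.extract(
    r'\(\s*([^,]+?)\s*,\s*([^)]+?)\s*\)', expand=True)

# Rebuild each cell with the two values in swapped order.
gdf['Geolocation_correct'] = '(' + parts[1] + ', ' + parts[0] + ')'
print(gdf['Geolocation_correct'])
```

Cells that do not match the pattern come back as NaN, which makes malformed rows easy to spot before they are used.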