Swapping string elements in each column in Pandas

Time: 2018-08-06 09:02:48

Tags: python pandas

I have the following script:

import pandas as pd

gdf = pd.read_csv('Geolocation_main')
print gdf['Geolocation'][:5]

which outputs:

0    (50.673675, -120.298973)
1    (50.678354, -120.329258)
2    (50.672496, -120.333317)
3    (50.673359, -120.332912)
4     (50.673411, -120.32978)

print type(gdf['Geolocation'][0])
<type 'str'>

I need to swap the geographic coordinates in each cell, e.g. (-120.298973, 50.673675).

To do this, I wrote the following script:

correct = []

for u in gdf['Geolocation']:
    u = u.replace('(', '')
    u = u.replace(')', '')
    a, b = u.split(',')
    correct = b, a
    gdf['Geolocation_correct'] = correct
    print gdf['Geolocation_correct']

But it gives me an error: ValueError: Length of values does not match length of index. What am I doing wrong here?

2 answers:

Answer 0: (score: 1)

I think the error you are getting is caused by

correct = gdf['Geolocation_correct']

gdf['Geolocation_correct'] is not defined at that point.
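For what it is worth, the ValueError can also be reproduced from the assignment in the question's loop, gdf['Geolocation_correct'] = correct, where correct has been rebound to a 2-tuple. The sketch below is only an illustration: the five-row frame is a hypothetical stand-in for the asker's CSV, and the exact wording of the message may differ between pandas versions.

```python
import pandas as pd

# Hypothetical five-row frame standing in for the asker's CSV file.
gdf = pd.DataFrame({"Geolocation": [
    "(50.673675, -120.298973)",
    "(50.678354, -120.329258)",
    "(50.672496, -120.333317)",
    "(50.673359, -120.332912)",
    "(50.673411, -120.32978)",
]})

# Inside the loop, `correct` is rebound to a 2-tuple of strings ...
a, b = gdf["Geolocation"][0].strip("()").split(",")
correct = b, a

# ... and assigning a 2-element sequence to a 5-row column is rejected.
try:
    gdf["Geolocation_correct"] = correct
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

So the loop fails on its first pass: pandas tries to use the two swapped strings as the values of the whole column, and two values cannot fill five rows.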

You can do this instead:

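def fix_geo_location(u):
    # Drop the surrounding parentheses, then split into the two coordinates.
    a, b = u.strip('()').split(',')
    # Strip stray whitespace and emit the coordinates in swapped order.
    return "(%s, %s)" % (b.strip(), a.strip())

# Apply the function to every cell of the column.
gdf["Geolocation_correct"] = gdf["Geolocation"].map(fix_geo_location)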
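The same swap can probably be done without a Python-level function, using the vectorized string methods of the column. This is a minimal sketch, assuming every cell has exactly the "(lat, lon)" shape shown in the question; the sample frame and the regular expression are illustrative and not part of the original post.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows shown in the question.
gdf = pd.DataFrame({"Geolocation": [
    "(50.673675, -120.298973)",
    "(50.678354, -120.329258)",
    "(50.673411, -120.32978)",
]})

# Capture the two comma-separated parts and write them back in swapped order.
pattern = r"\(\s*([^,]+?)\s*,\s*([^)]+?)\s*\)"
gdf["Geolocation_correct"] = gdf["Geolocation"].str.replace(
    pattern, r"(\2, \1)", regex=True
)

print(gdf["Geolocation_correct"])
```

Cells that do not match the pattern are left unchanged by str.replace, so malformed rows would pass through silently rather than raise.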

Answer 1: (score: 0)

Alternatively, you can do this:

>>> df['Geolocation'].map(lambda a: str(tuple(map(float, a.strip('()').split(',')))[::-1]))
0    (-120.298973, 50.673675)
1    (-120.329258, 50.678354)
2    (-120.333317, 50.672496)
3    (-120.332912, 50.673359)
4     (-120.32978, 50.673411)
Name: Geolocation, dtype: object
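If numeric tuples are more useful downstream than strings, one possible variation is to parse each cell with ast.literal_eval from the standard library and reverse the resulting tuple. This is a sketch under the assumption that every cell is a valid Python tuple literal; the sample frame is hypothetical.

```python
import ast

import pandas as pd

# Hypothetical sample mirroring the first rows shown in the question.
gdf = pd.DataFrame({"Geolocation": [
    "(50.673675, -120.298973)",
    "(50.678354, -120.329258)",
]})

# Parse each string into a real (lat, lon) tuple of floats.
parsed = gdf["Geolocation"].map(ast.literal_eval)

# Reverse every tuple so that it becomes (lon, lat).
gdf["Geolocation_correct"] = parsed.map(lambda pair: pair[::-1])

print(gdf["Geolocation_correct"])
```

Unlike the string-based versions, the new column then holds tuples of floats, so the individual coordinates can be used in arithmetic without further parsing.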