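I have a data frame in which a single column (Coordinates) holds latitude, longitude and altitude. I would like to split the Coordinates column into three columns (latitude, longitude and altitude).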
df:
ID Coordinates Region
1 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
2 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
3 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
4 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
5 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
Expected output:
ID lat lon alt Region
1 52.00755721100514 12.565129548994266 185.23616827199143 Europe
2 52.00755721100514 12.565129548994266 185.23616827199143 Europe
3 52.00755721100514 12.565129548994266 185.23616827199143 Europe
4 52.00755721100514 12.565129548994266 185.23616827199143 Europe
5 52.00755721100514 12.565129548994266 185.23616827199143 Europe
What I have tried:
I first tried to split the column on ":", but it did not work:
df.loc[df['Coordinates'].isin(["latitude_degrees", "longitude_degrees"])]= ""
I also tried replacing the text, but that did not work either:
df.Coordinates.replace(to_replace=['latitude_degrees','longitude_degrees'],value='')
Answer 0: (score: 0)
Let us use extractall to extract lat, long and alt from the Coordinates column, then unstack to reshape, and finally join with the ID and Region columns:
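# Pull every run of digits and dots out of each cell, then pivot the matches into columns
c = df['Coordinates'].str.extractall(r'([\d.]+)')[0].unstack()
# Name the new columns and attach them to ID and Region (recent pandas versions require axis as a keyword)
d = df[['ID', 'Region']].join(c.set_axis(['lat', 'long', 'alt'], axis=1))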
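
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same extractall / unstack / join approach. The sample frame is rebuilt by hand from the question and assumes the cells contain real newline characters. The conversion to float and the final column reordering are additions of mine, not part of the answer above. Note that the pattern [\d.]+ does not capture a minus sign, so negative coordinates would need something like -?[\d.]+ instead.

```python
import pandas as pd

# Sample data reconstructed from the question (assumption: "\n" is a real newline).
coords = (
    "latitude_degrees: 52.00755721100514\n"
    "longitude_degrees: 12.565129548994266\n"
    "altitude_meters: 185.23616827199143\n"
)
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Coordinates": [coords] * 3,
    "Region": ["Europe"] * 3,
})

# extractall returns one row per regex match, indexed by (original row, match number).
# Selecting column 0 gives the captured text; unstack turns the match number into columns.
c = df["Coordinates"].str.extractall(r"([\d.]+)")[0].unstack()

# Label the three matches and convert the captured strings to floats (my addition).
c = c.set_axis(["lat", "long", "alt"], axis=1).astype(float)

# Join back on the row index, then reorder to match the expected output layout.
d = df[["ID", "Region"]].join(c)[["ID", "lat", "long", "alt", "Region"]]

print(d)
```

The join works because unstack leaves the original row index in place, so the extracted columns line up with ID and Region without needing an explicit key.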