Hi all, is there a clean way to organize this data into the correct columns so that I can geolocate it later?
import pandas as pd
coordinates = {'event': ['1', '2', '3', '4'],
'direction': ['E', 'E,N', 'N,E', 'N'],
'location': ['316904', '314798,5812040', '5811316,314766', '5811309']}
df = pd.DataFrame.from_dict(coordinates)
df
What it should look like:
Event direction location easting northing
1 E 316904 316904 NA
2 E,N 314798,5812040 314798 5812040
3 N,E 5811316,314766 314766 5811316
4 N 5811309 NA 5811309
I can split the location using:
df[['easting', 'northing']] = df['location'].str.split(',', n=1, expand=True)
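But I need conditions: when the direction starts with E, the first value is the easting and the second is the northing; when it starts with N, the first value is the northing; and so on.
Any ideas would be greatly appreciated!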
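Answer 0 (score: 3)
Solution 1:
First split the location column into two new columns, then swap the values in the rows selected by a boolean mask created with startswith: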
# n must be passed by keyword in pandas 2.0 and later
df[['easting','northing']] = df['location'].str.split(',', n=1, expand=True)
mask = df['direction'].str.startswith('N')
df.loc[mask, ['easting','northing']] = df.loc[mask, ['northing','easting']].values
print (df)
event direction location easting northing
0 1 E 316904 316904 None
1 2 E,N 314798,5812040 314798 5812040
2 3 N,E 5811316,314766 314766 5811316
3 4 N 5811309 None 5811309
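The split-and-swap step above can be wrapped in a reusable function. This is only a minimal sketch: the helper name `split_coordinates` is hypothetical, and the final conversion with `pd.to_numeric` is an addition that the answer itself does not perform (it assumes numeric columns are wanted for later geolocation).

```python
import pandas as pd

def split_coordinates(df):
    """Hypothetical helper: split 'location' into easting/northing columns."""
    out = df.copy()
    # Split at the first comma only; missing second parts become None/NaN
    parts = out['location'].str.split(',', n=1, expand=True)
    # Guarantee two columns even if no row contains a comma
    parts = parts.reindex(columns=[0, 1])
    out['easting'] = parts[0]
    out['northing'] = parts[1]
    # Rows whose direction starts with 'N' list the northing first, so swap
    mask = out['direction'].str.startswith('N')
    out.loc[mask, ['easting', 'northing']] = out.loc[mask, ['northing', 'easting']].values
    # Assumption: numeric values are wanted downstream
    out['easting'] = pd.to_numeric(out['easting'])
    out['northing'] = pd.to_numeric(out['northing'])
    return out

coordinates = {'event': ['1', '2', '3', '4'],
               'direction': ['E', 'E,N', 'N,E', 'N'],
               'location': ['316904', '314798,5812040', '5811316,314766', '5811309']}
df = pd.DataFrame(coordinates)
result = split_coordinates(df)
print(result)
```

Because the function works on a copy, the original DataFrame is left unchanged.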
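Solution 2:
First flatten the values into a helper DataFrame, then reshape it with pivot, and finally attach the result to the original DataFrame with join: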
from itertools import chain
direc = df['direction'].str.split(',')
loc = df['location'].str.split(',')
lens = loc.str.len()
df1 = pd.DataFrame({
'direction' : list(chain.from_iterable(direc.tolist())),
'loc' : list(chain.from_iterable(loc.tolist())),
'event' : df['event'].repeat(lens)
})
# pivot arguments are keyword-only in pandas 2.0 and later
df2 = df1.pivot(index='event', columns='direction', values='loc').rename(columns={'E':'easting','N':'northing'})
print (df2)
direction easting northing
event
1 316904 NaN
2 314798 5812040
3 314766 5811316
4 NaN 5811309
df = df.join(df2, on='event')
print (df)
event direction location easting northing
0 1 E 316904 316904 NaN
1 2 E,N 314798,5812040 314798 5812040
2 3 N,E 5811316,314766 314766 5811316
3 4 N 5811309 NaN 5811309
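The flattening step in Solution 2 can also be written with `DataFrame.explode` instead of `itertools.chain`. This is a sketch of an alternative, not the answer's own method; it assumes pandas 1.3 or later (where `explode` accepts a list of columns) and that every row has the same number of directions as location values. The column name `value` is chosen here for illustration.

```python
import pandas as pd

coordinates = {'event': ['1', '2', '3', '4'],
               'direction': ['E', 'E,N', 'N,E', 'N'],
               'location': ['316904', '314798,5812040', '5811316,314766', '5811309']}
df = pd.DataFrame(coordinates)

# Turn both comma-separated columns into lists, then expand them in parallel
# so that each (direction, value) pair gets its own row
long = df.assign(
    direction=df['direction'].str.split(','),
    value=df['location'].str.split(','),
)[['event', 'direction', 'value']].explode(['direction', 'value'])

# One row per event, one column per direction letter
wide = (long.pivot(index='event', columns='direction', values='value')
            .rename(columns={'E': 'easting', 'N': 'northing'}))

# Attach the new columns to the original rows via the event key
result = df.join(wide, on='event')
print(result)
```

As in Solution 2, the resulting easting and northing values are still strings; they can be converted with `pd.to_numeric` if numbers are needed.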