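I have a few polygons, and some points lie far away from them. I am trying to write a CSV with pandas in which each point-to-polygon distance gets its own row. What I get: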
poly total inside outside dist
1000 2 0 2 [16015,5678]
1100 1 0 1 [5267]
What I want:
poly total inside outside dist
1000 2 0 2 16015
1000 2 0 2 5678
1100 1 0 1 5267
After looking at a previous question [How to write nth value of list into csv file], I tried the following:
distance = []
for row in arcpy.da.SearchCursor(outSide, ["SHAPE@XY"]):
    px, py = row[0]
    zipPoint = point(px, py)
    distance.append(int(zipPoint.calDist(PolyCenter)))
for i in distance:
    df.loc[polygon, "distance"] = distance
df.loc[zipCode, "Total"] = count
df.loc[zipCode, "Inside"] = insideNum
df.loc[zipCode, "Outside"] = outsideNum
But it gives me the same result in the CSV. Any help is appreciated.
Answer 0 (score: 1)
Create the DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'poly': [1000, 1100],
    'total': [2, 1],
    'inside': [0, 0],
    'outside': [2, 1],
    'dist': [[16015, 5678], [5267]]
})
# fix the column order
df = df[['poly', 'total', 'inside', 'outside', 'dist']]
df
Out[]:
poly total inside outside dist
0 1000 2 0 2 [16015, 5678]
1 1100 1 0 1 [5267]
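Processing:
# repeat every non-list column once per element of the matching 'dist' list,
# then attach the flattened distances and restore the original column order
new_df = pd.DataFrame({
    col: np.repeat(df[col].values, df['dist'].str.len())
    for col in df.columns.difference(['dist'])
}).assign(**{'dist': np.concatenate(df['dist'].values)})[df.columns.tolist()]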
new_df
Out[]:
poly total inside outside dist
0 1000 2 0 2 16015
1 1000 2 0 2 5678
2 1100 1 0 1 5267
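On pandas 0.25 or later, DataFrame.explode may be a shorter way to get the same result. A minimal sketch, assuming the dist column holds real Python lists (not strings); the frame is rebuilt here only so the snippet runs on its own, and the name exploded is just for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'poly': [1000, 1100],
    'total': [2, 1],
    'inside': [0, 0],
    'outside': [2, 1],
    'dist': [[16015, 5678], [5267]],
})

# explode() emits one row per list element and repeats the other columns.
# The exploded column comes back as object dtype, so cast it to int.
exploded = df.explode('dist').reset_index(drop=True)
exploded['dist'] = exploded['dist'].astype(int)
print(exploded)
```

reset_index(drop=True) is there because explode keeps the original row labels, so the two rows from polygon 1000 would otherwise both be labelled 0.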
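Answer 1 (score: 1)
You can use str.len to get the lengths of the lists, numpy.repeat to repeat the index by those lengths, flatten the lists, and then join the result back to the original columns: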
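from itertools import chain
import numpy as np
import pandas as pd
# note: df.pop('dist') removes the 'dist' column from df in place
s = pd.Series(list(chain.from_iterable(df.dist)),
              index=np.repeat(df.index.values, df.pop('dist').str.len())).rename('dist')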
print (s)
0 16015
0 5678
1 5267
Name: dist, dtype: int64
print (df.join(s).reset_index(drop=True))
poly total inside outside dist
0 1000 2 0 2 16015
1 1000 2 0 2 5678
2 1100 1 0 1 5267
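Another solution, using a MultiIndex:
# start again from the original df here, since df.pop('dist') above removed the column
names = ['poly', 'total', 'inside', 'outside']
df = df.set_index(names)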
mux = pd.MultiIndex.from_tuples(np.repeat(df.index.values, df.dist.str.len()), names=names)
df2 = pd.DataFrame({'dist':list(chain.from_iterable(df.dist))}, index=mux).reset_index()
print (df2)
poly total inside outside dist
0 1000 2 0 2 16015
1 1000 2 0 2 5678
2 1100 1 0 1 5267
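If the NumPy tricks feel opaque, the same flattening can also be written with plain Python loops, which is closer to what the question was attempting. A rough sketch, assuming the column names from the example above; rows, flat and csv_text are illustrative names, and the CSV goes to an in-memory buffer here rather than a real file:

```python
import io
import pandas as pd

df = pd.DataFrame({
    'poly': [1000, 1100],
    'total': [2, 1],
    'inside': [0, 0],
    'outside': [2, 1],
    'dist': [[16015, 5678], [5267]],
})

# Build one record per (polygon row, distance) pair.
rows = [
    {'poly': r.poly, 'total': r.total, 'inside': r.inside,
     'outside': r.outside, 'dist': d}
    for r in df.itertuples(index=False)
    for d in r.dist
]
flat = pd.DataFrame(rows, columns=['poly', 'total', 'inside', 'outside', 'dist'])

# Write to a buffer for demonstration; pass a file path instead to get a CSV on disk.
buf = io.StringIO()
flat.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```

This is slower than the vectorized versions on large frames, but it makes the one-row-per-distance logic explicit.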
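Answer 2 (score: 0)
I have a solution:
df2 = df.fillna(0)
# expand each list into its own columns, then stack them into one long Series
s = df2.apply(lambda x: pd.Series(x['Distances']), axis=1).stack().reset_index(level=1, drop=True)
s.name = "Distance"
# drop the list column and join the flattened distances back on the index
df3 = df2.drop("Distances", axis=1).join(s)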
32811 253 221 32 20
32811 253 221 32 3015
32811 253 221 32 2010
Thanks for your help. I would also appreciate a solution for this error: 'ValueError: invalid literal for long() with base 10: '[16015,5678]''.
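On the ValueError: the message suggests that something tried to convert the text '[16015,5678]' to an integer, which can happen when the list is stored as a string (for example after a round trip through a CSV file). That is only a guess from the error text. A sketch, assuming that is the cause and that pandas 0.25 or later is available, which parses the strings with ast.literal_eval before flattening; the small frame below is hypothetical:

```python
import ast
import pandas as pd

# Hypothetical frame in which the lists arrived as strings
df = pd.DataFrame({
    'poly': [1000, 1100],
    'dist': ['[16015,5678]', '[5267]'],
})

# literal_eval safely turns the text '[16015,5678]' into the list [16015, 5678]
df['dist'] = df['dist'].apply(ast.literal_eval)

# one row per distance, then cast the object column to int
flat = df.explode('dist').reset_index(drop=True)
flat['dist'] = flat['dist'].astype(int)
print(flat)
```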