Elements of a list in a row

Time: 2017-03-29 04:20:10

Tags: python csv pandas

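I have a few polygons and some points that lie at various distances from them. I am trying to write a CSV with pandas in which each point-to-polygon distance gets its own row. This is what I get: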

poly total inside outside dist
1000   2     0      2     [16015,5678]
1100   1     0      1     [5267]

I would like to get:

poly total inside outside dist
1000   2     0      2     16015
1000   2     0      2     5678
1100   1     0      1     5267

After looking at a previous question, "How to write nth value of list into csv file", I tried the following:
distance = []
for row in arcpy.da.SearchCursor(outSide, ["SHAPE@XY"]):
    px, py = row[0]
    zipPoint = point(px, py)
    distance.append(int(zipPoint.calDist(PolyCenter)))
for i in distance:
    df.loc[polygon, "distance"] = distance
    df.loc[zipCode, "Total"] = count
    df.loc[zipCode, "Inside"] = insideNum
    df.loc[zipCode, "Outside"] = outsideNum

But it gives me the same result in the CSV. Any help is appreciated.

3 answers:

Answer 0: (score: 1)

Create the dataframe:

import pandas as pd
import numpy as np

df= pd.DataFrame({
        'poly':[1000,1100],
        'total':[2,1],
        'inside':[0,0],
        'outside':[2,1],
        'dist':[[16015,5678],[5267]]
        })

df = df[['poly','total','inside','outside','dist']]

df
Out[]: 
   poly  total  inside  outside           dist
0  1000      2       0        2  [16015, 5678]
1  1100      1       0        1         [5267]

Processing:

new_df = pd.DataFrame({
    col: np.repeat(df[col].values, df['dist'].str.len())
    for col in df.columns.difference(['dist'])
}).assign(**{'dist': np.concatenate(df['dist'].values)})[df.columns.tolist()]


new_df
Out[]: 
   poly  total  inside  outside   dist
0  1000      2       0        2  16015
1  1000      2       0        2   5678
2  1100      1       0        1   5267
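
To make the one-liner above easier to follow, here is a minimal sketch of its building blocks on a trimmed-down frame. The variable names (lengths, repeated_poly, flat_dist) are illustrative only and are not part of the answer.

```python
import numpy as np
import pandas as pd

# Trimmed-down version of the frame above (only two columns, for brevity).
df = pd.DataFrame({
    'poly': [1000, 1100],
    'dist': [[16015, 5678], [5267]],
})

# Number of elements in each list: one count per row.
lengths = df['dist'].str.len()

# Repeat each scalar value once per element of the list in the same row.
repeated_poly = np.repeat(df['poly'].values, lengths)

# Flatten the lists into a single 1-D array, keeping row order.
flat_dist = np.concatenate(df['dist'].values)

print(lengths.tolist())        # [2, 1]
print(repeated_poly.tolist())  # [1000, 1000, 1100]
print(flat_dist.tolist())      # [16015, 5678, 5267]
```

The repeated columns and the flattened column have the same length, which is why they can be assembled into one new frame.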

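Answer 1: (score: 1)

You can use str.len to get the lengths of the lists, repeat the index values by those lengths with numpy.repeat, flatten the lists, and then join the result back to the original columns: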

from  itertools import chain

s = pd.Series(list(chain.from_iterable(df.dist)),
                   index=np.repeat(df.index.values, df.pop('dist').str.len())).rename('dist')
print (s)
0    16015
0     5678
1     5267
Name: dist, dtype: int64

print (df.join(s).reset_index(drop=True))
   poly  total  inside  outside   dist
0  1000      2       0        2  16015
1  1000      2       0        2   5678
2  1100      1       0        1   5267
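
The join step works because the Series carries duplicate index labels, so each original row is matched once per label. A small sketch of just that behavior, with hard-coded values standing in for the computed ones (the names left, s and joined are illustrative):

```python
import pandas as pd

left = pd.DataFrame({'poly': [1000, 1100]})

# Index label 0 appears twice, so row 0 of `left` is matched twice.
s = pd.Series([16015, 5678, 5267], index=[0, 0, 1], name='dist')

# join aligns on the index; reset_index drops the duplicated labels.
joined = left.join(s).reset_index(drop=True)
print(joined)
```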

Another solution with MultiIndex:

names = ['poly','total', 'inside','outside']
df = df.set_index(names)
mux = pd.MultiIndex.from_tuples(np.repeat(df.index.values, df.dist.str.len()), names=names)
df2 = pd.DataFrame({'dist':list(chain.from_iterable(df.dist))}, index=mux).reset_index()
print (df2)
   poly  total  inside  outside   dist
0  1000      2       0        2  16015
1  1000      2       0        2   5678
2  1100      1       0        1   5267
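
As a different technique from the ones in this answer: pandas 0.25 and later ship DataFrame.explode, which does the same expansion in one call. This is a sketch assuming such a pandas version is available; it was not an option when the question was asked. The frame is rebuilt here because the snippets above modify df.

```python
import pandas as pd

df = pd.DataFrame({
    'poly': [1000, 1100],
    'total': [2, 1],
    'inside': [0, 0],
    'outside': [2, 1],
    'dist': [[16015, 5678], [5267]],
})

# explode() emits one row per list element and repeats the other columns.
exploded = df.explode('dist').reset_index(drop=True)

# The exploded column has object dtype, so cast it back to integers.
exploded['dist'] = exploded['dist'].astype(int)

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = exploded.to_csv(index=False)
print(csv_text)
```

Passing a file name to to_csv instead would write the same rows to disk.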

Answer 2: (score: 0)

I have a solution:

df2 = df.fillna(0)
s = df2.apply(lambda x: pd.Series(x['Distances']), axis=1).stack().reset_index(level=1, drop=True)
s.name = "Distance"
df3 = df2.drop("Distances", axis=1).join(s)

It looks like:

32811  253  221  32    20
32811  253  221  32  3015
32811  253  221  32  2010

Thanks for your help. I would also appreciate a solution for this error: 'ValueError: invalid literal for long() with base 10: '[16015,5678]''.
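
The message suggests that the text '[16015,5678]' is being passed to an integer conversion, which would happen if the lists are stored as strings (for example after a round trip through a CSV file). That is an assumption about the setup, not something the thread confirms. Under that assumption, a minimal sketch is to parse the strings into real lists first and only then expand them:

```python
import ast
import pandas as pd

# Assumption: the distances arrive as strings, not as Python lists.
df = pd.DataFrame({
    'poly': [1000, 1100],
    'dist': ['[16015,5678]', '[5267]'],
})

# int('[16015,5678]') raises ValueError, so parse each string into a list first.
df['dist'] = df['dist'].apply(ast.literal_eval)

# Build one (poly, distance) record per list element.
rows = pd.DataFrame(
    [(poly, d) for poly, dists in zip(df['poly'], df['dist']) for d in dists],
    columns=['poly', 'dist'],
)
print(rows)
```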