Question

I have a question.

I have a table like this:

TAC | Latitude | Longitude
1 | 50.4 | -1.5

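In pandas, I want to say:

For each TAC, give me a zipped list of latitudes and longitudes (each TAC can have many rows).

I've tried the following, but I'm doing something wrong! Can you help?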

df1['coordinates'] = list(zip(df1.Lat, df1.Long))
new_df = df1.iloc[ : , : ].groupby('TAC').agg(df1['coordinates'])

For reference, df1 is created as follows:

df = pd.read_csv('tacs.csv')
df1 = df[['magnet.tac','magnet.latitude', 'magnet.longitude']]
df1.columns = ['TAC','Lat','Long']

Answer 1

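First, to avoid a SettingWithCopyWarning, pass the usecols parameter to read_csv so that df1 is not a slice of another DataFrame, then use GroupBy.apply with a lambda function: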

df1 = pd.read_csv('tacs.csv', usecols=['magnet.tac','magnet.latitude', 'magnet.longitude'])
df1.columns = ['TAC','Lat','Long']

#sample data
print (df1)
   TAC   Lat  Long
0    1  50.4  -1.5
1    1  50.1  -1.4
2    2  50.2  -1.8
3    2  50.9  -1.3

new_df = df1.groupby('TAC').apply(lambda x: list(zip(x.Lat, x.Long))).reset_index(name='coord')
print (new_df)
   TAC                         coord
0    1  [(50.4, -1.5), (50.1, -1.4)]
1    2  [(50.2, -1.8), (50.9, -1.3)]
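A self-contained sketch of the same approach, in case you want to try it without the real file. The CSV text below is a made-up stand-in for tacs.csv (the "other" column and all values are assumptions). Since read_csv returns columns in file order regardless of the order given in usecols, this version renames by name instead of assigning df1.columns positionally:

```python
import io
import pandas as pd

# Hypothetical stand-in for tacs.csv; the "other" column and the values are made up.
csv_text = """magnet.tac,magnet.latitude,magnet.longitude,other
1,50.4,-1.5,a
1,50.1,-1.4,b
2,50.2,-1.8,c
2,50.9,-1.3,d
"""

# Read only the needed columns, then rename by name (not by position).
df1 = pd.read_csv(
    io.StringIO(csv_text),
    usecols=['magnet.tac', 'magnet.latitude', 'magnet.longitude'],
).rename(columns={
    'magnet.tac': 'TAC',
    'magnet.latitude': 'Lat',
    'magnet.longitude': 'Long',
})

# Select the coordinate columns first so the lambda only sees Lat and Long.
new_df = (
    df1.groupby('TAC')[['Lat', 'Long']]
       .apply(lambda g: list(zip(g['Lat'], g['Long'])))
       .reset_index(name='coord')
)
print(new_df)
```

Selecting [['Lat', 'Long']] before apply is optional here, but it keeps the grouping column out of the frame passed to the lambda.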

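Your own solution should be changed as follows (add .copy() when creating df1, select the 'coordinates' column after groupby, and pass list to agg):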

df = pd.read_csv('tacs.csv')
df1 = df[['magnet.tac','magnet.latitude', 'magnet.longitude']].copy()
df1.columns = ['TAC','Lat','Long']

df1['coordinates'] = list(zip(df1.Lat, df1.Long))
new_df = df1.groupby('TAC')['coordinates'].agg(list).reset_index()
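The original attempt most likely failed because agg expects a function (or a function name), not the Series df1['coordinates'] itself. Here is a minimal sketch of the corrected version with inline sample data in place of the CSV; the coords_by_tac dictionary at the end is just an illustrative extra, not part of the question:

```python
import pandas as pd

# Sample rows mirroring the sample data shown above.
df1 = pd.DataFrame({
    'TAC': [1, 1, 2, 2],
    'Lat': [50.4, 50.1, 50.2, 50.9],
    'Long': [-1.5, -1.4, -1.8, -1.3],
})

# Build one (lat, long) tuple per row.
df1['coordinates'] = list(zip(df1['Lat'], df1['Long']))

# agg takes a callable; list collects each group's tuples into one list.
new_df = df1.groupby('TAC')['coordinates'].agg(list).reset_index()

# Optional: a plain dict keyed by TAC for quick lookups (hypothetical helper name).
coords_by_tac = new_df.set_index('TAC')['coordinates'].to_dict()
print(coords_by_tac[1])
```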
