Python: combine multiple rows into one vector row by a specific column value

Time: 2017-03-19 16:23:37

Tags: python pandas

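How can I combine multiple rows into a single row of vectors, grouped by the value of a specific column? For example, I have this data: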

| Latitude  | Longitude  | group |
|-----------|------------|-------|
| 46.852397 | -72.02586  | A     |
| 47.059016 | -70.907962 | A     |
| 46.897785 | -71.140082 | A     |
| 46.99328  | -70.986152 | A     |
| 46.64613  | -71.934034 | A     |
| 46.622638 | -71.994857 | A     |
| 46.968093 | -71.284281 | B     |
| 47.422739 | -70.32361  | B     |
| 46.878963 | -71.717918 | B     |
| 46.91002  | -71.108395 | C     |
| 47.465175 | -70.337958 | C     |
| 46.6936   | -71.862257 | C     |
| 47.40885  | -70.390739 | C     |
| 47.00737  | -71.232117 | C     |
| 47.013901 | -70.965815 | C     |
| 46.824111 | -71.554997 | C     |
| 47.003765 | -71.193865 | C     |
| 46.665319 | -72.15102  | C     |
| 47.129865 | -70.842406 | C     |
| 46.932361 | -71.994677 | C     |

I want to convert it into this:

| group | Latitude                                                                                                    | Longitude                                                                                                                 |
|-------|-------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|
| A     | [46.852397,47.059016,46.897785,46.99328,46.64613,46.622638]                                                 | [-72.02586,-70.907962,-71.140082,-70.986152,-71.934034,-71.994857]                                                        |
| B     | [46.968093,47.422739,46.878963]                                                                             | [-71.284281,-70.32361,-71.717918]                                                                                         |
| C     | [46.91002,47.465175,46.6936,47.40885,47.00737,47.013901,46.824111,47.003765,46.665319,47.129865,46.932361]  | [-71.108395,-70.337958,-71.862257,-70.390739,-71.232117,-70.965815,-71.554997,-71.193865,-72.15102,-70.842406,-71.994677] |

1 answer:

Answer 0 (score: 4)

Suppose you have a DataFrame that looks like this:

>>> df
   v1  v2 v3
0   1   2  a
1   3   4  a
2   1   2  b
3   3   4  b

Then you can get what you want by grouping on `v3` and collecting each remaining column into a list:

>>> df.groupby('v3').agg(lambda m: list(m)).reset_index()
  v3      v1      v2
0  a  [1, 3]  [2, 4]
1  b  [1, 3]  [2, 4]
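Applied to the column names from the question, the same idea might look like the sketch below. It uses only a handful of rows from the question's table for brevity, and the variable name `result` is just an illustrative choice.

```python
import pandas as pd

# A small subset of the rows from the question (not the full table).
df = pd.DataFrame({
    "Latitude": [46.852397, 47.059016, 46.968093, 47.422739, 46.91002],
    "Longitude": [-72.02586, -70.907962, -71.284281, -70.32361, -71.108395],
    "group": ["A", "A", "B", "B", "C"],
})

# Collect each column into one Python list per group; reset_index()
# turns the group key back into an ordinary column.
result = df.groupby("group").agg(lambda s: list(s)).reset_index()
print(result)
```

Each cell of `Latitude` and `Longitude` in `result` is then a plain Python list holding that group's values in their original row order.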

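However, this is a bad idea, because pandas does not handle lists as cell values well; it was not designed for that. Still, if it works for you, go ahead and use it.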
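If list-valued cells do turn out to be awkward, one possible alternative (not part of the original answer, only a sketch) is to leave the DataFrame in its long form and build the per-group vectors outside of it, for instance as a dictionary of NumPy arrays. The name `coords` and the sample rows below are assumptions made for illustration.

```python
import pandas as pd

# A small subset of the rows from the question (not the full table).
df = pd.DataFrame({
    "Latitude": [46.852397, 47.059016, 46.968093, 47.422739, 46.91002],
    "Longitude": [-72.02586, -70.907962, -71.284281, -70.32361, -71.108395],
    "group": ["A", "A", "B", "B", "C"],
})

# Map each group label to an (n_rows, 2) array of [Latitude, Longitude]
# pairs, leaving the original DataFrame untouched.
coords = {
    name: part[["Latitude", "Longitude"]].to_numpy()
    for name, part in df.groupby("group")
}

print(coords["A"])
```

This keeps every DataFrame cell a scalar, so ordinary pandas operations keep working, while the grouped vectors remain available through `coords`.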