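How can I combine multiple rows into a single row of vectors (lists), grouped by the value of a specific column? For example, I have this data: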
| Latitude | Longitude | group |
|-----------|------------|-------|
| 46.852397 | -72.02586 | A |
| 47.059016 | -70.907962 | A |
| 46.897785 | -71.140082 | A |
| 46.99328 | -70.986152 | A |
| 46.64613 | -71.934034 | A |
| 46.622638 | -71.994857 | A |
| 46.968093 | -71.284281 | B |
| 47.422739 | -70.32361 | B |
| 46.878963 | -71.717918 | B |
| 46.91002 | -71.108395 | C |
| 47.465175 | -70.337958 | C |
| 46.6936 | -71.862257 | C |
| 47.40885 | -70.390739 | C |
| 47.00737 | -71.232117 | C |
| 47.013901 | -70.965815 | C |
| 46.824111 | -71.554997 | C |
| 47.003765 | -71.193865 | C |
| 46.665319 | -72.15102 | C |
| 47.129865 | -70.842406 | C |
| 46.932361 | -71.994677 | C |
I would like to convert it into this:
| group | Latitude | Longitude |
|-------|-------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|
| A | [46.852397,47.059016,46.897785,46.99328,46.64613,46.622638] | [-72.02586,-70.907962,-71.140082,-70.986152,-71.934034,-71.994857] |
| B | [46.968093,47.422739,46.878963] | [-71.284281,-70.32361,-71.717918] |
| C | [46.91002,47.465175,46.6936,47.40885,47.00737,47.013901,46.824111,47.003765,46.665319,47.129865,46.932361]  | [-71.108395,-70.337958,-71.862257,-70.390739,-71.232117,-70.965815,-71.554997,-71.193865,-72.15102,-70.842406,-71.994677] |
Answer 0 (score: 4)
Suppose you have a data frame like this:
>>> df
v1 v2 v3
0 1 2 a
1 3 4 a
2 1 2 b
3 3 4 b
Then you can get what you want with:
>>> df.groupby('v3').agg(lambda m: list(m)).reset_index()
v3 v1 v2
0 a [1, 3] [2, 4]
1 b [1, 3] [2, 4]
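However, this is a bad idea, because pandas does not handle lists as cell values well; it was not designed for that. But if it works for you, go ahead and use it.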
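Applied to the question's columns, the same approach might look like the sketch below. It uses only a shortened subset of the rows from the question to keep the example small, and it passes the built-in `list` to `agg` directly, which should behave the same as the `lambda m: list(m)` above. The variable names `df` and `result` are just illustrative.

```python
import pandas as pd

# A shortened subset of the question's data (assumption: the real
# frame has the same three columns, just more rows).
df = pd.DataFrame({
    "Latitude": [46.852397, 47.059016, 46.897785, 46.968093, 47.422739, 46.91002],
    "Longitude": [-72.02586, -70.907962, -71.140082, -71.284281, -70.32361, -71.108395],
    "group": ["A", "A", "A", "B", "B", "C"],
})

# Collect each group's values into a Python list, one row per group,
# then turn the group key back into an ordinary column.
result = (
    df.groupby("group")[["Latitude", "Longitude"]]
      .agg(list)
      .reset_index()
)

print(result)
```

If the list-valued cells become awkward to work with later, `DataFrame.explode` can turn them back into one row per value (it accepts a list of columns in pandas 1.3 and later).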