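How do I combine different cells based on customer ID?

Asked: 2016-06-01 19:09:35

Tags: python r excel

I have a transaction dataset that I want to transform by customer ID. A sample is shown below.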

CustomerID    Description

17850         WHITE HANGING HEART T-LIGHT HOLDER
17850         WHITE METAL LANTERN
13047         ASSORTED COLOUR BIRD ORNAMENT
13047         POPPY'S PLAYHOUSE BEDROOM
13047         POPPY'S PLAYHOUSE KITCHEN

I want the dataset rearranged as follows:

17850         WHITE HANGING HEART T-LIGHT HOLDER, WHITE METAL LANTERN
13047         ASSORTED COLOUR BIRD ORNAMENT, POPPY'S PLAYHOUSE BEDROOM, POPPY'S PLAYHOUSE KITCHEN

The dataset is in CSV format, with each value in its own cell. Can anyone suggest a way to do this in Excel, R, or Python?

3 answers:

Answer 0: (score: 0)

In Python, you can use pandas.

Install it, then try:

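import pandas as pd

# Read the CSV file
df = pd.read_csv('yourFileName.csv')

# Group by CustomerID and join each group's Descriptions with commas.
# sort=False keeps customers in order of first appearance;
# reset_index() turns the grouped Series back into a two-column DataFrame.
result = (df.groupby('CustomerID', sort=False)['Description']
            .apply(','.join)
            .reset_index())

# Save the grouped result (not the original df) to a CSV file
result.to_csv('resultFileName.csv', index=False)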
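
If pandas is not available, the same grouping can be sketched with only the Python standard library. This is a minimal illustration, not a definitive implementation: it assumes the CSV has a header row with the columns CustomerID and Description as in the question, and the names SAMPLE_CSV and combine_descriptions are hypothetical, introduced here for the example.

```python
import csv
import io
from collections import defaultdict

# Sample rows from the question, inlined so the sketch runs on its own.
# Assumption: a real file would instead be opened with
# open('yourFileName.csv', newline='') and passed to the function below.
SAMPLE_CSV = """CustomerID,Description
17850,WHITE HANGING HEART T-LIGHT HOLDER
17850,WHITE METAL LANTERN
13047,ASSORTED COLOUR BIRD ORNAMENT
13047,POPPY'S PLAYHOUSE BEDROOM
13047,POPPY'S PLAYHOUSE KITCHEN
"""


def combine_descriptions(csv_file, sep=", "):
    """Return {CustomerID: joined descriptions}, in first-appearance order."""
    grouped = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Strip stray whitespace so the joined text is evenly spaced.
        grouped[row["CustomerID"]].append(row["Description"].strip())
    # Dicts preserve insertion order, so customers stay in file order.
    return {customer: sep.join(items) for customer, items in grouped.items()}


combined = combine_descriptions(io.StringIO(SAMPLE_CSV))
for customer, descriptions in combined.items():
    print(customer, descriptions)
```

Because the dictionary keeps insertion order, customer 17850 comes before 13047, matching the layout requested in the question. The customer IDs are kept as strings here, since csv does not convert types.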

Answer 1: (score: 0)

You can use the aggregate() function. I created my own data; you can do the same for your data frame above. The Texts are concatenated based on the Customer number.

Answer 2: (score: 0)

Other approaches include plyr and data.table. data.table is likely more efficient and simpler, and it gives you more control.

library(plyr)
ddply(df, .(ID), summarize, Text = paste(Text, collapse = ","))

library(data.table)  # the data.table package; DT is a different package
DT <- data.table(df)
# group the table by ID and then add a new column by pasting the list 
# of values in each group together. 

DT[, list(Text = paste(Text, collapse = ",")), by = ID]

   ID                                                                                Text
1: 17850                              WHITE HANGING HEART T-LIGHT HOLDER,WHITE METAL LANTERN
2: 13047  ASSORTED COLOUR BIRD ORNAMENT,POPPY'S PLAYHOUSE BEDROOM, POPPY'S PLAYHOUSE KITCHEN

Data

df <- data.frame(ID = c(17850,17850,13047,13047,13047),
      Text = c("WHITE HANGING HEART T-LIGHT HOLDER","WHITE METAL LANTERN",
              " ASSORTED COLOUR BIRD ORNAMENT","POPPY'S PLAYHOUSE BEDROOM",
              " POPPY'S PLAYHOUSE KITCHEN"))