Question

I have data in CSV format.

The data looks like this, with the receipt numbers in one column and the products in the corresponding column:

Receipt_no Product
A1  Apple
A1  Banana
A1  Orange
A2  Pineapple
A2  Jackfruit
A3  Cola
A3  Tea

I want to rearrange them as

A1 ,  Apple, Banana, Orange
A2 , Pineapple, Jackfruit
A3 , Cola, Tea

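That is, the receipt number and its product names on a single line, separated by commas. Since the data is large, I would like to do this rearrangement in R.

Please help.

Thanks.

Regards, Nithish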

Answer 1

Base R:

Use aggregate(Product ~ Receipt_no, df, paste, collapse = ',')
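
As a minimal sketch of how that call behaves, the following rebuilds the sample data from the question and applies it. The character columns (stringsAsFactors = FALSE) and the name `baskets` are assumptions made for illustration, and the separator is ", " rather than "," to match the spacing in the desired output.

```r
# Sample data from the question; plain character columns are an assumption
df <- data.frame(
  Receipt_no = c("A1", "A1", "A1", "A2", "A2", "A3", "A3"),
  Product = c("Apple", "Banana", "Orange", "Pineapple", "Jackfruit", "Cola", "Tea"),
  stringsAsFactors = FALSE
)

# One row per receipt; paste() collapses each group's products into one string
baskets <- aggregate(Product ~ Receipt_no, df, paste, collapse = ", ")
print(baskets)
```

The result is a data frame with one row per Receipt_no, so it can be passed on to write.csv or similar if a file is needed.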

dplyr

Answer 2

Using base R:

u <- as.vector(unique(df$Receipt_no))
as.list(sapply(u, function(x) paste0(x, ", ", paste0(subset(df$Product, df$Receipt_no==x), collapse = ", "))))

# $A1
# [1] "A1, Apple, Banana, Orange"

# $A2
# [1] "A2, Pineapple, Jackfruit"

# $A3
# [1] "A3, Cola, Tea"
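
If the goal is a text file with one receipt per line, as in the question, the same idea could be sketched with split() and writeLines(). This is only an illustration under stated assumptions: the data is rebuilt with character columns, and `by_receipt`, `lines` and `out_file` are hypothetical names (a temporary file stands in for a real output path).

```r
# Sample data from the question; plain character columns are an assumption
df <- data.frame(
  Receipt_no = c("A1", "A1", "A1", "A2", "A2", "A3", "A3"),
  Product = c("Apple", "Banana", "Orange", "Pineapple", "Jackfruit", "Cola", "Tea"),
  stringsAsFactors = FALSE
)

# Group products by receipt; the list names are the receipt numbers
by_receipt <- split(df$Product, df$Receipt_no)

# Build one comma-separated line per receipt: "<receipt>, <product>, <product>, ..."
lines <- paste(names(by_receipt),
               sapply(by_receipt, paste, collapse = ", "),
               sep = ", ")

# Write the lines out; tempfile() is a placeholder for a real path
out_file <- tempfile(fileext = ".csv")
writeLines(lines, out_file)
```

Because rows have different numbers of products, the file is not a rectangular CSV; it is closer to the "basket" text layout that market basket tools commonly read.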

Data

df <- structure(list(Receipt_no = structure(c(1L, 1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A1", "A2", "A3"), class = "factor"), Product = structure(c(1L, 2L, 5L, 6L, 4L, 3L, 7L), .Label = c("Apple", "Banana", "Cola", "Jackfruit", "Orange", "Pineapple", "Tea"), class = "factor")), .Names = c("Receipt_no", "Product"), class = "data.frame", row.names = c(NA, -7L))

Rearranging data in R for market basket analysis

2 answers: