I have data that needs to be grouped as shown below, and I am struggling to do it.
V1    | V2  | V3    | V4  | V5    | V6
----------------------------------------
P     | PNR | Model | abd | SUB   | 2
----------------------------------------
Model | abc | SUB   | 1   | Place | C
----------------------------------------
Model | abc | SUB   | 1   | Place | C
----------------------------------------
P     | PNR | Model | abc | SUB   | 1
----------------------------------------
The data above should be grouped as follows:
P   | Model | SUB | Place
-------------------------
PNR | abd   | 2   |
-------------------------
    | abc   | 1   | C
-------------------------
    | abc   | 1   | C
-------------------------
PNR | abc   | 1   |
-------------------------
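Can anyone help me solve this? The data is the strsplit_fixed output of association rules. Is there a way to get it into the expected form shown above? Any help would be much appreciated.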
Answer 0 (score: 0)
There may be a more efficient way, but here is one option: use a for loop to fill in the values based on the matched column indices.
# Data preparation
dt <- data.frame(V1 = c("P", "Model", "Model", "P"),
                 V2 = c("PNR", "abc", "abc", "PNR"),
                 V3 = c("Model", "SUB", "SUB", "Model"),
                 V4 = c("abd", "1", "1", "abc"),
                 V5 = c("SUB", "Place", "Place", "SUB"),
                 V6 = c("2", "C", "C", "1"),
                 stringsAsFactors = FALSE)
# Get the output column names from the key columns V1, V3, V5
cols <- unique(c(dt$V1, dt$V3, dt$V5))
# For each row, find the position in cols of the keys in V1, V3, V5.
# match() keeps the keys in row order, so the j-th index always belongs
# to the j-th value even if the keys appear in a different order than cols.
m_index <- t(apply(dt[, c("V1", "V3", "V5")], 1, function(x) match(x, cols)))
# Create an empty matrix with one row per input row and one column per key
m_fill <- matrix(NA, nrow = nrow(dt), ncol = length(cols))
# Subset the original data frame to keep only the value columns V2, V4, V6
dt2 <- dt[, c("V2", "V4", "V6")]
# Fill the empty matrix based on the indices in m_index
for (i in seq_len(nrow(dt2))) {
  for (j in seq_len(ncol(dt2))) {
    m_fill[i, m_index[i, j]] <- dt2[i, j]
  }
}
# Convert m_fill to a data frame
dt3 <- as.data.frame(m_fill)
colnames(dt3) <- cols
dt3
     P Model SUB Place
1  PNR   abd   2  <NA>
2 <NA>   abc   1     C
3 <NA>   abc   1     C
4  PNR   abc   1  <NA>
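As for a more efficient way, one possibility is to drop the double loop and fill the matrix in a single step with base R matrix indexing. This is only a sketch under the same assumptions as above (keys in V1, V3, V5 and their values in V2, V4, V6); the names keys, values, out, idx and res are made up for illustration.

```r
# Same sample data as in the answer above
dt <- data.frame(V1 = c("P", "Model", "Model", "P"),
                 V2 = c("PNR", "abc", "abc", "PNR"),
                 V3 = c("Model", "SUB", "SUB", "Model"),
                 V4 = c("abd", "1", "1", "abc"),
                 V5 = c("SUB", "Place", "Place", "SUB"),
                 V6 = c("2", "C", "C", "1"),
                 stringsAsFactors = FALSE)

# Split the data into a matrix of keys and a matrix of matching values
keys   <- as.matrix(dt[, c("V1", "V3", "V5")])
values <- as.matrix(dt[, c("V2", "V4", "V6")])

# Output columns, in order of first appearance (V1, then V3, then V5)
cols <- unique(as.vector(keys))

# Empty character matrix: one row per input row, one column per key
out <- matrix(NA_character_, nrow = nrow(dt), ncol = length(cols),
              dimnames = list(NULL, cols))

# Two-column index matrix: (input row, target column) for every key cell
idx <- cbind(as.vector(row(keys)), match(as.vector(keys), cols))

# Assign all values at once
out[idx] <- as.vector(values)

res <- as.data.frame(out, stringsAsFactors = FALSE)
res
```

With the sample data, res should contain the same rows as dt3 above. If a key occurs twice in one row, the later value silently overwrites the earlier one, so this only fits data where each key appears at most once per row.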