Question


I have data that needs to be grouped as shown below, and I am struggling to do it.

V1   | V2  | V3   | V4  |V5   | V6
----------------------------------
P    | PNR |Model | abd |SUB  | 2
----------------------------------
Model| abc | SUB  |  1  |Place| C
----------------------------------
Model| abc |SUB   |  1  |Place| C
----------------------------------
P    | PNR |Model | abc |SUB  | 1
----------------------------------

The data above should be grouped as follows:

P   |Model  |SUB|Place
-----------------------
PNR |abd    |2  |
-----------------------
    |abc    |1  |C
-----------------------
    |abc    |1  |C
-----------------------
PNR |abc    |1  |
-----------------------

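Can anyone help me with this? The data comes from applying strsplit_fixed to association rules. Is there a way to get it into the expected form shown above? Any help would be much appreciated.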

Answer 1

There may be a more efficient way, but here is one option: use a for loop to fill in the values based on the matched indices.

# Data Preparation
dt <- data.frame(V1 = c("P", "Model", "Model", "P"),
                 V2 = c("PNR", "abc", "abc", "PNR"),
                 V3 = c("Model", "SUB", "SUB", "Model"),
                 V4 = c("abd", "1", "1", "abc"),
                 V5 = c("SUB", "Place", "Place", "SUB"),
                 V6 = c("2", "C", "C", "1"),
                 stringsAsFactors = FALSE)

# Get the column names
cols <- unique(c(dt$V1, dt$V3, dt$V5))

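# Get the column index of each label in V1, V3, V5.
# match() keeps the labels in row order, so each value lands under its own label
# even if a row lists its labels in a different order than cols.
m_index <- t(apply(dt[, c("V1", "V3", "V5")], 1, function(x) match(x, cols)))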

# Create an empty matrix: one row per input row, one column per label
m_fill <- matrix(NA, nrow = nrow(dt), ncol = length(cols))

# Subset the original data frame to only keep V2, V4, V6
dt2 <- dt[, c("V2", "V4", "V6")]

# Fill the empty matrix based on the indices in m_index
for (i in seq_len(nrow(dt2))){
  for (j in seq_len(ncol(dt2))){
    m_fill[i, m_index[i, j]] <- dt2[i, j]
  }
}

# Convert m_fill to a data frame
dt3 <- as.data.frame(m_fill)
colnames(dt3) <- cols

dt3
     P Model SUB Place
1  PNR   abd   2  <NA>
2 <NA>   abc   1     C
3 <NA>   abc   1     C
4  PNR   abc   1  <NA>
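
As a possible "more efficient way", the nested loop above can be replaced by a single matrix-index assignment. This is only a sketch in base R, assuming the same layout as the example (labels in V1, V3, V5 and their values in V2, V4, V6). The names keys, values, out and wide are made up for illustration.

```r
# Same example data as in the answer above
dt <- data.frame(V1 = c("P", "Model", "Model", "P"),
                 V2 = c("PNR", "abc", "abc", "PNR"),
                 V3 = c("Model", "SUB", "SUB", "Model"),
                 V4 = c("abd", "1", "1", "abc"),
                 V5 = c("SUB", "Place", "Place", "SUB"),
                 V6 = c("2", "C", "C", "1"),
                 stringsAsFactors = FALSE)

# Assumption: odd columns hold the labels, even columns hold the values
keys   <- as.matrix(dt[, c("V1", "V3", "V5")])
values <- as.matrix(dt[, c("V2", "V4", "V6")])

# Output columns, in order of first appearance
cols <- unique(as.vector(keys))

# Empty character matrix: one row per input row, one column per label
out <- matrix(NA_character_, nrow = nrow(dt), ncol = length(cols),
              dimnames = list(NULL, cols))

# A two-column (row, column) index matrix fills every cell in one step.
# row(keys), match(keys, cols) and values are all read in column-major
# order, so each value is paired with its own row and label.
idx <- cbind(as.vector(row(keys)), match(keys, cols))
out[idx] <- as.vector(values)

wide <- as.data.frame(out, stringsAsFactors = FALSE)
wide
```

For the example data, wide should contain the same values as dt3. The idea is that indexing a matrix with a two-column matrix addresses individual cells, which avoids looping over rows and columns explicitly.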

Grouping columns in R (data)
