I have the following data.
a <- structure(list(Title = c("AAADE", "BBBCF", "NBNJHB", "TTTTT", "VVVFF",
"AASFE", "DDDFFF", "ERFRR", "AAAAAA", "ERERE"),
Year = c("2004", "2004", "2004", "2004", "2004", "2004", "2005", "2005", "2005", "2005")),
.Names = c("Title", "Year"), row.names = c(NA, -10L), class = "data.frame")
a
    Title Year
1   AAADE 2004
2   BBBCF 2004
3  NBNJHB 2004
4   TTTTT 2004
5   VVVFF 2004
6   AASFE 2004
7  DDDFFF 2005
8   ERFRR 2005
9  AAAAAA 2005
10  ERERE 2005
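I want to concatenate the rows that share the same year. I have been trying functions from the 'tm' package, but they do not get me to the following result.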
Title Year
AAADE BBBCF NBNJHB TTTTT VVVFF AASFE 2004
DDDFFF ERFRR AAAAAA ERERE 2005
Answer 0 (score: 3)
A more direct approach is to use aggregate:
aggregate(Title ~ Year, a, paste, collapse = " ")
# Year Title
# 1 2004 AAADE BBBCF NBNJHB TTTTT VVVFF AASFE
# 2 2005 DDDFFF ERFRR AAAAAA ERERE
If the column order matters to you, you can use aggregate(Title ~ Year, a, paste, collapse = " ")[names(a)].
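As a quick check of the column-order point, here is a minimal sketch. It rebuilds the question's data with a plain data.frame() call (an assumption made for brevity; the contents are the same) and compares the column names before and after indexing with names(a). The names res and res_ordered are illustrative only.

```r
# Rebuild the sample data from the question (shorter constructor, same contents).
a <- data.frame(
  Title = c("AAADE", "BBBCF", "NBNJHB", "TTTTT", "VVVFF",
            "AASFE", "DDDFFF", "ERFRR", "AAAAAA", "ERERE"),
  Year = c("2004", "2004", "2004", "2004", "2004", "2004",
           "2005", "2005", "2005", "2005"),
  stringsAsFactors = FALSE
)

# aggregate() places the grouping column first: Year, then Title.
res <- aggregate(Title ~ Year, a, paste, collapse = " ")

# Indexing by names(a) restores the original order: Title, then Year.
res_ordered <- res[names(a)]

names(res)          # "Year"  "Title"
names(res_ordered)  # "Title" "Year"
```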
Stepping up from aggregate, you can look at "data.table" and "dplyr", both of which will be more efficient on larger datasets.
Here is "dplyr":
library(dplyr)
a %>% group_by(Year) %>% summarise(Title = paste(Title, collapse = " "))
# Source: local data frame [2 x 2]
#
# Year Title
# 1 2004 AAADE BBBCF NBNJHB TTTTT VVVFF AASFE
# 2 2005 DDDFFF ERFRR AAAAAA ERERE
Here is "data.table":
library(data.table)
A <- as.data.table(a)
A[, list(Title = paste(Title, collapse = " ")), by = Year]
# Year Title
# 1: 2004 AAADE BBBCF NBNJHB TTTTT VVVFF AASFE
# 2: 2005 DDDFFF ERFRR AAAAAA ERERE
Answer 1 (score: 2)
with(a, data.frame(Title = tapply(Title, Year, paste, collapse = ' '), Year = unique(Year)))
Result:
Title Year
AAADE BBBCF NBNJHB TTTTT VVVFF AASFE 2004
DDDFFF ERFRR AAAAAA ERERE 2005
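One caveat with the tapply approach: tapply orders its groups by sorted group label, while unique(Year) keeps order of first appearance. The two agree for the sample data because it is already sorted by year, but they could be paired incorrectly on unsorted input. A minimal sketch of a variant that takes the year from the names of the tapply result instead; the data frame b and the names titles and res are hypothetical, made up for illustration:

```r
# Hypothetical unsorted input: 2005 appears before 2004.
b <- data.frame(Title = c("X1", "Y1", "X2", "Y2"),
                Year = c("2005", "2004", "2005", "2004"),
                stringsAsFactors = FALSE)

# tapply() returns one element per group, named by the sorted group labels.
titles <- with(b, tapply(Title, Year, paste, collapse = " "))

# Take Year from the names of the result so labels and values stay paired.
res <- data.frame(Title = as.vector(titles), Year = names(titles),
                  stringsAsFactors = FALSE)
res
```

With unique(b$Year) in place of names(titles), the years would come out as 2005, 2004 and would no longer line up with the concatenated titles.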