I have a dataset containing a list of keywords (one keyword per row).
I would like to add a column in which the words of each keyword are sorted alphabetically, like this:
| KEYWORD | ALPHABETICAL |
| house blue | blue house |
| blue house | blue house |
| my blue house | blue house my |
| this house is blue | blue house is this |
| sky orange | orange sky |
| orange sky | orange sky |
| the orange sky | orange sky the |
Thanks for your help!
Answer 0 (score: 6)
Iterate over the rows, split each one on " " (strsplit), sort the pieces, and collapse them back into a single string:
# Generate data
df <- data.frame(KEYWORD = c(paste(sample(letters, 3), collapse = " "),
paste(sample(letters, 3), collapse = " ")))
# KEYWORD
# z e s
# d a u
df$ALPHABETICAL <- apply(df, 1, function(x) paste(sort(unlist(strsplit(x, " "))),
collapse = " "))
# KEYWORD ALPHABETICAL
# z e s e s z
# d a u a d u
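One caveat with the apply() approach above: apply(df, 1, ...) hands every column of a row to the function, so it gives the intended result only while df holds nothing but the KEYWORD column. A minimal sketch that targets the column directly with vapply is shown below; the helper name sort_words and the extra column OTHER are hypothetical and only serve the illustration.

```r
# Hypothetical helper: sort the space-separated words of a single string
sort_words <- function(s) {
  paste(sort(strsplit(s, " ", fixed = TRUE)[[1]]), collapse = " ")
}

# Small example data; OTHER is an illustrative extra column
df <- data.frame(KEYWORD = c("house blue", "my blue house"),
                 OTHER = c("x", "y"),
                 stringsAsFactors = FALSE)

# Apply the helper to the KEYWORD column only; OTHER is left untouched
df$ALPHABETICAL <- vapply(df$KEYWORD, sort_words, character(1),
                          USE.NAMES = FALSE)
```

vapply also checks that each result is a single character string, which makes failures easier to spot than with sapply.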
Answer 1 (score: 1)
df$ALPHABETICAL <- sapply(strsplit(df$KEYWORD," "),function(x) paste(sort(x),collapse=" "))
df
# KEYWORD ALPHABETICAL
# 1 house blue blue house
# 2 blue house blue house
# 3 my blue house blue house my
# 4 this house is blue blue house is this
# 5 sky orange orange sky
# 6 orange sky orange sky
# 7 the orange sky orange sky the
Data
df <- data.frame(KEYWORD = c(
'house blue',
'blue house',
'my blue house',
'this house is blue',
'sky orange',
'orange sky',
'the orange sky'),stringsAsFactors = FALSE)
Answer 2 (score: 0)
A solution using dplyr + stringr:
library(dplyr)
library(stringr)
KEYWORDS <- c('house blue','blue house','my blue house','this house is blue','sky orange','orange sky','the orange sky')
ALPHABETICAL <- KEYWORDS %>% str_split(., ' ') %>% lapply(., 'sort') %>% lapply(., 'paste', collapse=' ') %>% unlist(.)
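The last line uses str_split() to split KEYWORDS into a list of character vectors, applies sort to each list element, joins each sorted vector back into one string with paste, and finally flattens the list into a vector with unlist.
The result is: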
> cbind(KEYWORDS, ALPHABETICAL)
KEYWORDS ALPHABETICAL
[1,] "house blue" "blue house"
[2,] "blue house" "blue house"
[3,] "my blue house" "blue house my"
[4,] "this house is blue" "blue house is this"
[5,] "sky orange" "orange sky"
[6,] "orange sky" "orange sky"
[7,] "the orange sky" "orange sky the"
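The pipeline in this answer can also be unpacked into named intermediate steps. The sketch below does that in base R, with strsplit standing in for str_split (a substitution, not the answer's exact code) so that no packages are required; the intermediate names words, sorted and joined are hypothetical.

```r
KEYWORDS <- c("house blue", "this house is blue")

# Step 1: split each string on spaces, giving a list of character vectors
words <- strsplit(KEYWORDS, " ", fixed = TRUE)

# Step 2: sort the words within each list element
sorted <- lapply(words, sort)

# Step 3: join each sorted vector back into a single string
joined <- lapply(sorted, paste, collapse = " ")

# Step 4: flatten the list into a plain character vector
ALPHABETICAL <- unlist(joined)
```

Inspecting words, sorted and joined one at a time can help when a keyword contains unexpected whitespace or punctuation.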