I have some poll data like the following:
Freetime_activities
1 Travelling, On the PC, Clubbing
2 Sports, On the PC, Clubbing
3 Clubbing
4 On the PC
5 Travelling, On the PC, Clubbing
6 On the PC
7 Watching TV, Travelling
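I want to get a count for each value (how many times On the PC, Travelling, etc. occur), but I'm having trouble splitting the values. Is there a function in R that can do something like: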
split("A,B,C") ->
1 A
2 B
3 C
Or is there a direct solution for counting the values straight from the column?
Answer 0 (score: 5)
We can use strsplit to split the column by the delimiter ", ", unlist the resulting list, and then use table to get the frequencies:
tbl <- table(unlist(strsplit(as.character(df1$Freetime_activities),
", ")))
as.data.frame(tbl)
# Var1 Freq
#1 Clubbing 4
#2 On the PC 5
#3 Sports 1
#4 Travelling 3
#5 Watching TV 1
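To see what each step contributes, the one-liner can be unrolled as below. This is only a minimal sketch that rebuilds the sample data inline; the intermediate names `parts`, `flat` and `freq` are made up for illustration, and the final sort is an optional extra that is not part of the original answer.

```r
# Sample data from the question, rebuilt inline so the sketch is self-contained
df1 <- data.frame(
  Freetime_activities = c(
    "Travelling, On the PC, Clubbing",
    "Sports, On the PC, Clubbing",
    "Clubbing",
    "On the PC",
    "Travelling, On the PC, Clubbing",
    "On the PC",
    "Watching TV, Travelling"
  ),
  stringsAsFactors = FALSE
)

# Step 1: split every row on ", " -> a list with one character vector per row
parts <- strsplit(df1$Freetime_activities, ", ", fixed = TRUE)

# Step 2: flatten the list into a single character vector
flat <- unlist(parts)

# Step 3: count occurrences, then sort by decreasing frequency (optional)
freq <- sort(table(flat), decreasing = TRUE)
print(freq)
```

Using `fixed = TRUE` treats the delimiter as a literal string rather than a regular expression, which is all that is needed here.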
Note: as.character is used in case the column is a factor, because strsplit only accepts character vectors.
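The point about factors can be illustrated with a small sketch. It assumes the column arrived as a factor (for instance because strings were converted to factors on import); the variable names `f`, `res` and `parts` are hypothetical.

```r
# A factor column, as produced when strings are converted to factors on import
f <- factor(c("Travelling, On the PC, Clubbing", "Clubbing", "On the PC"))

# strsplit() rejects non-character input, so calling it on the factor fails;
# tryCatch() captures the error message instead of stopping the script
res <- tryCatch(strsplit(f, ", "), error = function(e) conditionMessage(e))
print(res)

# Converting with as.character() first makes the split work
parts <- strsplit(as.character(f), ", ")
print(table(unlist(parts)))
```

If the column is already character, the as.character call is harmless, so keeping it in is a cheap safeguard.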
Another option is to use scan to extract the elements, and then table to get the frequencies:
table(trimws(scan(text = as.character(df1$Freetime_activities),
what = "", sep = ",")))
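The trimws call matters because scan, when splitting on "," alone, keeps the space that follows each comma. The sketch below shows the difference on a shortened vector; `x` is a hypothetical stand-in for the column, and `quiet = TRUE` is added only to suppress scan's "Read N items" message.

```r
# Shortened stand-in for df1$Freetime_activities
x <- c("Travelling, On the PC, Clubbing", "On the PC", "Watching TV, Travelling")

# Splitting on "," alone leaves a leading space on every item after a comma
raw <- scan(text = x, what = "", sep = ",", quiet = TRUE)

# Without trimming, "On the PC" and " On the PC" are counted separately
print(table(raw))

# trimws() removes the surrounding whitespace so identical items are merged
clean <- trimws(raw)
print(table(clean))
```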
Or read.table together with unlist and table:
table(unlist(read.table(text = as.character(df1$Freetime_activities),
sep = ",", fill = TRUE, strip.white = TRUE)))
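One thing worth checking with this variant: with fill = TRUE, rows that have fewer items are padded out to the full number of columns, and for character columns those padded cells may come back as empty strings, which table would then count as a category of its own. The sketch below is a hedged illustration on a shortened vector; `x`, `wide` and `vals` are hypothetical names, and the explicit filtering step is an addition that is not part of the original answer.

```r
# Shortened stand-in for df1$Freetime_activities
x <- c("Travelling, On the PC, Clubbing", "Clubbing", "Watching TV, Travelling")

# Parse each element as one comma-separated row; short rows are padded
wide <- read.table(text = x, sep = ",", fill = TRUE,
                   strip.white = TRUE, stringsAsFactors = FALSE)
print(wide)

# Flatten all columns into one character vector
vals <- unlist(wide, use.names = FALSE)

# Drop padding cells, whether they arrive as NA or as empty strings
vals <- vals[!is.na(vals) & nzchar(vals)]
print(table(vals))
```

Filtering explicitly keeps the counts comparable with the strsplit and scan variants regardless of how the padding is represented.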
Edit: based on @David Arenburg's comments.
df1 <- structure(list(Freetime_activities = c("Travelling, On the PC, Clubbing",
    "Sports, On the PC, Clubbing", "Clubbing", "On the PC",
    "Travelling, On the PC, Clubbing", "On the PC", "Watching TV, Travelling")),
    .Names = "Freetime_activities", class = "data.frame",
    row.names = c("1", "2", "3", "4", "5", "6", "7"))