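I have an R data frame that I imported from a CSV file of survey data.
One of my columns is called "NewsMethods"; respondents were asked to list the ways they get their news. The data in that column looks like this: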
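Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;Social Media websites or apps;Word of mouth
Television;Social Media websites or apps
... and so on.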
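I want to replace each entry in this column with the number of elements it contains. For example, the first list should be replaced with the number 5.
If anyone has ideas on how to do this, I would be very grateful. TIA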
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
Radio;Television;Social Media websites or apps;Word of mouth
Television;Social Media websites or apps
Newspaper;Radio;Television;News websites (such as BBC News)
Television
Radio;Television;Word of mouth
Television;Social Media websites or apps;Word of mouth
Television;Word of mouth
Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth
I would like this to become: 5 5 4 2 4 1 3 3 2 6
Answer 0 (score: 2)
We can use str_count from stringr:
library(stringr)
df1$Count <- str_count(df1$NewsMethods, ";")+1
df1$Count
#[1] 5 5 4 2 4 1 3 3 2 6
Or a base R option with gregexpr:
lengths(lapply(gregexpr(";", df1$NewsMethods), function(x) x[x>0]) )+1
#[1] 5 5 4 2 4 1 3 3 2 6
Data used:
df1 <- structure(list(NewsMethods = c('Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;Social Media websites or apps;Word of mouth',
'Television;Social Media websites or apps',
'Newspaper;Radio;Television;News websites (such as BBC News)',
'Television',
'Radio;Television;Word of mouth',
'Television;Social Media websites or apps;Word of mouth',
'Television;Word of mouth',
'Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth')),
.Names = "NewsMethods", row.names = c(NA, -10L), class = "data.frame")
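The separator-counting idea can also be sketched without any package by comparing string lengths before and after removing the separator. This is an illustrative alternative, not part of the answer above; the vector x is made-up sample data standing in for df1$NewsMethods.

```r
# Hypothetical sample values standing in for df1$NewsMethods.
x <- c("Radio;Television;Word of mouth", "Television", "")

# Number of ";" characters = length with separators minus length without them.
n_sep <- nchar(x) - nchar(gsub(";", "", x, fixed = TRUE))

# Items = separators + 1. Note that this also counts an empty answer as 1 item.
counts <- n_sep + 1L
counts
```

Any "separators + 1" approach assumes every row holds at least one item, so empty answers may need to be handled separately.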
Answer 1 (score: 2)
With strsplit and lengths:
lengths(strsplit(dfr$NewsMethods, split = ';'))
which gives:
> lengths(strsplit(dfr$NewsMethods, split = ';'))
[1] 5 5 4 2 4 1 3 3 2 6
To assign the result to a count variable in the data frame:
dfr$count <- lengths(strsplit(dfr$NewsMethods, split = ';'))
Your data frame now looks like this:
> dfr
NewsMethods count
1 Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 5
2 Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 5
3 Radio;Television;Social Media websites or apps;Word of mouth 4
4 Television;Social Media websites or apps 2
5 Newspaper;Radio;Television;News websites (such as BBC News) 4
6 Television 1
7 Radio;Television;Word of mouth 3
8 Television;Social Media websites or apps;Word of mouth 3
9 Television;Word of mouth 2
10 Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth 6
Data used:
dfr <- structure(list(NewsMethods = c('Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth',
'Radio;Television;Social Media websites or apps;Word of mouth',
'Television;Social Media websites or apps',
'Newspaper;Radio;Television;News websites (such as BBC News)',
'Television',
'Radio;Television;Word of mouth',
'Television;Social Media websites or apps;Word of mouth',
'Television;Word of mouth',
'Newspaper;Radio;Television;News websites (such as BBC News);Social Media websites or apps;Word of mouth')),
.Names = "NewsMethods", row.names = c(NA, -10L), class = "data.frame")
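If the column came from read.csv, it may be a factor (the default before R 4.0) or contain missing or empty answers, and strsplit does not accept factors. Below is a minimal sketch of a wrapper for those cases, assuming ";" is the only separator. The names count_items and answers are hypothetical and not part of the answer above.

```r
# Hypothetical helper: count ";"-separated items per element.
count_items <- function(x, sep = ";") {
  x <- trimws(as.character(x))          # accept factors, drop stray whitespace
  n <- lengths(strsplit(x, split = sep, fixed = TRUE))
  n[is.na(x)] <- NA_integer_            # strsplit would report 1 for NA
  n
}

# Made-up sample input: a factor with an empty answer and a missing answer.
answers <- factor(c("Radio;Television;Word of mouth", "Television", "", NA))
count_items(answers)
```

With the data frame above, this could be used as dfr$count <- count_items(dfr$NewsMethods). An empty answer yields 0 and a missing one stays NA, rather than both being counted as 1.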