I have a variable in my dataset that looks like this:
IBM, Oracle, Ping
IBM, Ping
HP, IBM, Nagios
Solarwinds, HP, Nagios
BMC, Solarwinds, HP, IBM, Nagios, SCOM
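I want to split these companies apart and create a new variable for each one. For example, I would like a separate variable for IBM, Nagios, SCOM, and so on. How can I do this?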
Answer 0 (score: 0)
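My reading of the question is that you have a column (say, named "companies") in a data.frame, and each value in it is a comma-separated string of companies. If that reading is correct, try cSplit_e from the "splitstackshape" package: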
library(splitstackshape)
cSplit_e(mydf, "companies", ",", type = "character", mode = "binary", fill = 0)
#                                companies companies_BMC companies_HP companies_IBM
# 1                      IBM, Oracle, Ping             0            0             1
# 2                              IBM, Ping             0            0             1
# 3                        HP, IBM, Nagios             0            1             1
# 4                 Solarwinds, HP, Nagios             0            1             0
# 5 BMC, Solarwinds, HP, IBM, Nagios, SCOM             1            1             1
#   companies_Nagios companies_Oracle companies_Ping companies_SCOM companies_Solarwinds
# 1                0                1              1              0                    0
# 2                0                0              1              0                    0
# 3                1                0              0              0                    0
# 4                1                0              0              0                    1
# 5                1                0              0              1                    1
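To make it clearer what the 0/1 columns in this output represent, here is a minimal base R sketch that builds the same kind of indicator columns by hand. It is not how cSplit_e is implemented; the names `df`, `parts`, `lev`, `ind`, and `result` are made up for illustration, and the data simply mirrors the example in the question.

```r
# Hypothetical example data, mirroring the values from the question.
df <- data.frame(
  companies = c("IBM, Oracle, Ping",
                "IBM, Ping",
                "HP, IBM, Nagios",
                "Solarwinds, HP, Nagios",
                "BMC, Solarwinds, HP, IBM, Nagios, SCOM"),
  stringsAsFactors = FALSE
)

# Split each string on commas and trim the surrounding whitespace.
parts <- lapply(strsplit(df$companies, ",", fixed = TRUE), trimws)

# Every distinct company name becomes one indicator column.
lev <- sort(unique(unlist(parts)))

# Build a matrix with one row per input string and one 0/1 column per company.
ind <- t(vapply(parts,
                function(p) as.integer(lev %in% p),
                integer(length(lev))))
colnames(ind) <- paste0("companies_", lev)

# Attach the indicator columns to the original data.
result <- cbind(df, ind)
```

Each row of `result` has a 1 in the column of every company that appears in that row's string and a 0 elsewhere.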
The cSplit_e call assumes that the starting data is:
mydf <- data.frame(
companies = c("IBM, Oracle, Ping",
"IBM, Ping",
"HP, IBM, Nagios",
"Solarwinds, HP, Nagios",
"BMC, Solarwinds, HP, IBM, Nagios, SCOM"))
cSplit_e also has a drop argument, which you can set to TRUE if you want to remove the original column from the result.
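As a short sketch of the drop argument, assuming splitstackshape is installed and using the same example data, the call would look like this. The variable name `out` is only for illustration.

```r
library(splitstackshape)

# Same example data as in the answer.
mydf <- data.frame(
  companies = c("IBM, Oracle, Ping",
                "IBM, Ping",
                "HP, IBM, Nagios",
                "Solarwinds, HP, Nagios",
                "BMC, Solarwinds, HP, IBM, Nagios, SCOM"),
  stringsAsFactors = FALSE
)

# drop = TRUE leaves out the original "companies" column,
# so only the companies_* indicator columns remain.
out <- cSplit_e(mydf, "companies", ",", type = "character",
                mode = "binary", fill = 0, drop = TRUE)
names(out)
```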