Replacing values in R

Time: 2013-12-19 23:45:10

Tags: r

I'm new to R and trying to figure out how to change existing values in a data frame. I've put a quick overview below. Thanks for your help!

Here is what I currently have:

company      top_ten_advertiser
'Company A'         'Is Top 10'
'Company B'         'Is Top 10'
'Company C'         'Is Top 10'
'Company D'         'Is Top 10'
…              …
'Company X'        'Not Top 10'
'Company Y'        'Not Top 10'
'Company Z'        'Not Top 10'

I'd like to change it to the following:

   company      top_ten_advertiser
'Company A'       'Top 10 Company'
'Company B'       'Top 10 Company'
'Company C'       'Top 10 Company'
'Company D'       'Top 10 Company'
…              …
'Company X'   'Not Top 10 Company'
'Company Y'   'Not Top 10 Company'
'Company Z'   'Not Top 10 Company'

2 answers:

Answer 0: (score: 1)

Suppose your data frame is called df. If top_ten_advertiser is a character variable, do this:

# Add the word "Company" to all values of top_ten_advertiser
df$top_ten_advertiser = paste(df$top_ten_advertiser, "Company", sep=" ")
# Remove the "Is " from "Is Top 10"
df$top_ten_advertiser = gsub("Is ", "", df$top_ten_advertiser)
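As a minimal sketch of the two steps above, here is a self-contained run on a small made-up data frame (the three rows are illustrative, not the asker's real data). It uses `sub()` with a `^` anchor instead of `gsub("Is ", ...)`, which is my own variation so that only a leading "Is " is removed.

```r
# Hypothetical sample data mirroring the question's layout.
# stringsAsFactors = FALSE keeps the column as character.
df <- data.frame(
  company = c("Company A", "Company B", "Company X"),
  top_ten_advertiser = c("Is Top 10", "Is Top 10", "Not Top 10"),
  stringsAsFactors = FALSE
)

# Step 1: append "Company" to every value
df$top_ten_advertiser <- paste(df$top_ten_advertiser, "Company", sep = " ")

# Step 2: drop a leading "Is " (anchored, so text elsewhere in the value is untouched)
df$top_ten_advertiser <- sub("^Is ", "", df$top_ten_advertiser)

print(df)
```

The anchor only matters if some value could contain "Is " somewhere other than the start. For the values shown in the question, both forms should behave the same.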

Or, if it is a factor variable:

# Install the plyr package if you haven't already done so using 
# install.packages("plyr")
library(plyr)
# revalue() returns a new factor, so assign the result back to the column
df$top_ten_advertiser = revalue(df$top_ten_advertiser, 
        c("Is Top 10"="Top 10 Company", "Not Top 10"="Not Top 10 Company"))
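If you would rather not install plyr, the same relabelling can be sketched in base R by assigning to `levels()`. This is an alternative to `revalue()`, not what the answer above uses, and the factor `f` below is a made-up stand-in for `df$top_ten_advertiser`.

```r
# Hypothetical factor standing in for df$top_ten_advertiser
f <- factor(c("Is Top 10", "Not Top 10", "Is Top 10"))

# Rename each level in place; the underlying integer codes are untouched,
# so every element carrying the old label picks up the new one
levels(f)[levels(f) == "Is Top 10"]  <- "Top 10 Company"
levels(f)[levels(f) == "Not Top 10"] <- "Not Top 10 Company"

print(f)
```

Because only the level labels change, the result is still a factor and no conversion to character is needed.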

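If you find changing the levels of a factor painful, you can first convert the factor variable to character, change the values, and then convert it back to a factor, like this: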

df$top_ten_advertiser = as.character(df$top_ten_advertiser)
df$top_ten_advertiser = paste(df$top_ten_advertiser, "Company", sep=" ")
df$top_ten_advertiser = gsub("Is ", "", df$top_ten_advertiser)
df$top_ten_advertiser = factor(df$top_ten_advertiser)

And, for completeness, Matthew Lundberg mentioned doing this with a regular expression:

df$top_ten_advertiser = gsub("(Is )?(.*)", "\\2 Company", df$top_ten_advertiser)

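It is concise, but cryptic if you are new to regular expressions. It works on both character and factor variables. However, applying it to a factor variable converts it to character.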
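To unpack the regular expression: `(Is )?` is an optional first group that matches a leading "Is " when present, `(.*)` is a second group that captures the rest of the string, and `\\2` in the replacement refers to that second group. The sketch below runs the same idea on two made-up values. It uses `sub()` with `^` and `$` anchors, which is my own variant of the pattern rather than the exact call above.

```r
# Hypothetical values taken from the question's column
x <- c("Is Top 10", "Not Top 10")

# Group 1 "(Is )?" optionally eats a leading "Is "; group 2 "(.*)" keeps the rest.
# "\\2 Company" rebuilds the string from group 2 plus the new suffix.
res_chr <- sub("^(Is )?(.*)$", "\\2 Company", x)
print(res_chr)

# The same call on a factor: the input is coerced to character first,
# so the result is a character vector, not a factor
res_fct <- sub("^(Is )?(.*)$", "\\2 Company", factor(x))
print(class(res_fct))

# Wrap in factor() if a factor is needed afterwards
res_fct <- factor(res_fct)
```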

Answer 1: (score: 0)

df$top_ten_advertiser <- gsub("Is Top 10", "Top 10 Company", df$top_ten_advertiser)
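A quick check on made-up values suggests that this single call rewrites only the "Is Top 10" entries. To reach the output requested in the question, the "Not Top 10" entries would need a second substitution. The second `gsub()` below is my own addition, not part of this answer.

```r
# Hypothetical values taken from the question's column
x <- c("Is Top 10", "Not Top 10")

# The answer's call: only strings containing "Is Top 10" are changed
step1 <- gsub("Is Top 10", "Top 10 Company", x)
print(step1)

# Assumed follow-up: rename the remaining "Not Top 10" values.
# The ^...$ anchors require the whole value to match.
step2 <- gsub("^Not Top 10$", "Not Top 10 Company", step1)
print(step2)
```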