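How do I combine four factor levels into two within the same column in R?

Time: 2014-02-04 18:04:15

Tags: r command statistics

I have a data set that includes gender. Instead of just "female" and "male", it has four categories: "female", "male", "f", and "m". I am trying to replace every "f" with "female" and every "m" with "male".

What command should I use here?

The data looks like this: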

dt <- data.frame(...)

     Gender     Age    
1    female     24          
2         m     38      
3    female     29      
4         m     33      
5         m     49      
6         f     29      
7         f     26      
8         f     36      
9    female     58      
10        f     31      
11   female     31      
12        f     29      
13   female     19      
14     male     38      
15   female     63      

I tried this code:

dt$Gender <- dt$Gender(c("female","female","male","male"))  

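But it gives an error.

2 answers:

Answer 0: (score: 1)

Since you mention factors in the title, have you looked at the factor function? Assigning a named list to levels() lets you merge several old levels into one new level: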

x <- c("female", "f", "male", "m", "f", "undeclared")
y <- factor(x)
y
# [1] female     f          male       m          f          undeclared
# Levels: f female m male undeclared
levels(y) <- list("female" = c("female", "f"),
                  "male" = c("male", "m"),
                  "undeclared" = "undeclared")
y
# [1] female     female     male       male       female     undeclared
# Levels: female male undeclared
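
Applied to the data frame from the question, the same idea might look like the sketch below. The data frame is rebuilt here from the rows printed in the question (the original `data.frame(...)` call was elided, so this reconstruction is an assumption), and the column is converted with factor() explicitly rather than relying on how it was read in.

```r
# Rebuild the question's data from the printed table (assumed reconstruction)
dt <- data.frame(
  Gender = c("female", "m", "female", "m", "m", "f", "f", "f",
             "female", "f", "female", "f", "female", "male", "female"),
  Age = c(24, 38, 29, 33, 49, 29, 26, 36, 58, 31, 31, 29, 19, 38, 63)
)

# Make sure the column is a factor before touching its levels
dt$Gender <- factor(dt$Gender)

# Each list element names a new level and lists the old levels it absorbs
levels(dt$Gender) <- list(female = c("female", "f"),
                          male   = c("male", "m"))

table(dt$Gender)
```

Any old level that is not listed would become NA, so it is worth checking table(dt$Gender, useNA = "ifany") afterwards.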

Answer 1: (score: 0)

> #Use first 2 and last 4 cases in your data to demonstrate
> tt <- data.frame(Gender = as.factor(c("female", "m", "f", "female", "male", "female")), Age = as.numeric(c(24,38,29,19,38,63)))
> tt
  Gender Age
1 female  24
2      m  38
3      f  29
4 female  19
5   male  38
6 female  63

> str(tt)   # Check the structure of the data, make sure Gender is a factor
'data.frame':   6 obs. of  2 variables:
 $ Gender: Factor w/ 4 levels "f","female","m",..: 2 3 1 2 4 2
 $ Age   : num  24 38 29 19 38 63

> levels(tt$Gender)   # Show the levels of factor "Gender"
[1] "f"      "female" "m"      "male"  

> levels(tt$Gender) <- c("female","female","male","male")  
  # Assign new levels you want -- make sure they are in the same order as the old ones

> levels(tt$Gender)   # Now the identical levels are combined 
[1] "female" "male" 

**If Gender is not a factor, you can use as.factor() to change the variable's class.
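
If the column is a plain character vector and you would rather not convert it to a factor, one possible alternative is a named lookup vector. This is only a sketch; the names `gender`, `lookup`, and `recoded` are made up for illustration and do not come from the question.

```r
# Character data with the four spellings from the question
gender <- c("female", "m", "f", "female", "male", "female")

# Named vector: names are the old values, elements are the new values
lookup <- c(f = "female", female = "female", m = "male", male = "male")

# Index the lookup by the old values; unname() drops the leftover names
recoded <- unname(lookup[gender])
recoded
```

Because the mapping is keyed by name, it does not depend on the order in which the levels happen to be sorted. Values missing from `lookup` come back as NA.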