Converting a numeric scale to categories

Time: 2017-10-13 16:55:39

Tags: r

I have a data frame that looks like this:

> example
                            Country Urban
1                       Afghanistan     0
2                           Albania    40
3                           Algeria    50
4                           Andorra    50
5                            Angola    60
6               Antigua and Barbuda    32
7                         Argentina    60
8                           Armenia    90
9                         Australia    50
10                          Austria    50
11                       Azerbaijan    60
12                          Bahrain    60
13                       Bangladesh     0
14                         Barbados    80
15                          Belarus    60
16                          Belgium    50
17                           Belize    40  
18                            Benin      
19                           Bhutan    30
20 Bolivia (Plurinational State of)    40

I want to recode every value on the numeric scale from 0 to 49 as 2. So, after removing the blank rows, I tried:

example <- as.data.frame(sapply(example, gsub, pattern = c(0:49), replacement = 2))

It does not work.

Here is a reproducible sample generated with dput:

structure(list(Country = structure(1:20, .Label = c("Afghanistan", 
"Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda", 
"Argentina", "Armenia", "Australia", "Austria", "Azerbaijan", 
"Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize", 
"Benin", "Bhutan", "Bolivia (Plurinational State of)"), class = "factor"), 
    Urban = structure(c(2L, 7L, 12L, 12L, 14L, 5L, 14L, 19L, 
    12L, 12L, 14L, 14L, 2L, 18L, 14L, 12L, 8L, 1L, 4L, 7L), .Label = c("", 
    "0", "100", "30", "32", "35", "40", "40  ", "45", "48", "48  ", 
    "50", "56  ", "60", "64  ", "65", "70", "80", "90"), class = "factor")), .Names = c("Country", 
"Urban"), row.names = c(NA, 20L), class = "data.frame")

1 answer:

Answer 0 (score: 0)

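Use ifelse, once Urban has been converted from a factor to numeric (the as.numeric(as.character(...)) line at the end of this answer):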

df$Urban = with(df, ifelse(Urban > 0 & Urban < 49, 2, Urban))

Result:

> df
                            Country Urban
1                       Afghanistan     0
2                           Albania     2
3                           Algeria    50
4                           Andorra    50
5                            Angola    60
6               Antigua and Barbuda     2
7                         Argentina    60
8                           Armenia    90
9                         Australia    50
10                          Austria    50
11                       Azerbaijan    60
12                          Bahrain    60
13                       Bangladesh     0
14                         Barbados    80
15                          Belarus    60
16                          Belgium    50
17                           Belize     2
18                            Benin    NA
19                           Bhutan     2
20 Bolivia (Plurinational State of)     2
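
Note that the condition above, as written, leaves 0 and 49 unchanged, whereas the question asks for the whole 0-49 range. If the boundaries are meant to be included, a minimal sketch could look like the following. The vector urban and the name recoded are illustrative only and do not come from the original post:

```r
# Illustrative values, assumed to be numeric already
urban <- c(0, 40, 50, 32, 90, NA, 49)

# Inclusive bounds: every value from 0 to 49 becomes 2, other values are kept.
# NA stays NA because the comparison itself evaluates to NA.
recoded <- ifelse(urban >= 0 & urban <= 49, 2, urban)

print(recoded)
```

Whether 0 should be recoded depends on what 0 means in the data, so treat the choice of >= versus > as an assumption to check against the intended categories.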

Data:

df = structure(list(Country = structure(1:20, .Label = c("Afghanistan",
    "Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda",
    "Argentina", "Armenia", "Australia", "Austria", "Azerbaijan",
    "Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize",
    "Benin", "Bhutan", "Bolivia (Plurinational State of)"), class = "factor"),
    Urban = structure(c(2L, 7L, 12L, 12L, 14L, 5L, 14L, 19L,
    12L, 12L, 14L, 14L, 2L, 18L, 14L, 12L, 8L, 1L, 4L, 7L), .Label = c("",
    "0", "100", "30", "32", "35", "40", "40  ", "45", "48", "48  ",
    "50", "56  ", "60", "64  ", "65", "70", "80", "90"), class = "factor")), .Names = c("Country",
    "Urban"), row.names = c(NA, 20L), class = "data.frame")

df$Urban = as.numeric(as.character(df$Urban))
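
The conversion goes through as.character because calling as.numeric directly on a factor returns the internal integer level codes rather than the labels. A small sketch of the difference, using a hypothetical factor f built from a few of the labels above (f, codes and values are illustrative names, not part of the original post):

```r
# Hypothetical factor with some labels from the sample,
# including a padded value and an empty string
f <- factor(c("0", "40  ", "50", ""))

# Direct conversion returns the level codes (positions in levels(f)),
# not the numbers shown when the factor is printed
codes <- as.numeric(f)

# Converting to character first recovers the values;
# trailing spaces are tolerated and the empty string becomes NA
values <- as.numeric(as.character(f))

print(levels(f))
print(codes)
print(values)
```

This is also why the blank Urban entry for Benin shows up as NA in the result.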