I have a data frame that looks like this:
> example
Country Urban
1 Afghanistan 0
2 Albania 40
3 Algeria 50
4 Andorra 50
5 Angola 60
6 Antigua and Barbuda 32
7 Argentina 60
8 Armenia 90
9 Australia 50
10 Austria 50
11 Azerbaijan 60
12 Bahrain 60
13 Bangladesh 0
14 Barbados 80
15 Belarus 60
16 Belgium 50
17 Belize 40
18 Benin
19 Bhutan 30
20 Bolivia (Plurinational State of) 40
I want to recode the numeric values in the range 0-49 as 2. So, after removing the blank rows, I tried:
example <- as.data.frame(sapply(example, gsub, pattern = c(0:49), replacement = 2))
It did not work.
Here is a reproducible sample generated with dput:
structure(list(Country = structure(1:20, .Label = c("Afghanistan",
"Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda",
"Argentina", "Armenia", "Australia", "Austria", "Azerbaijan",
"Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize",
"Benin", "Bhutan", "Bolivia (Plurinational State of)"), class = "factor"),
Urban = structure(c(2L, 7L, 12L, 12L, 14L, 5L, 14L, 19L,
12L, 12L, 14L, 14L, 2L, 18L, 14L, 12L, 8L, 1L, 4L, 7L), .Label = c("",
"0", "100", "30", "32", "35", "40", "40 ", "45", "48", "48 ",
"50", "56 ", "60", "64 ", "65", "70", "80", "90"), class = "factor")), .Names = c("Country",
"Urban"), row.names = c(NA, 20L), class = "data.frame")
Answer 0 (score: 0)
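Use ifelse. Urban is stored as a factor, so convert it to numeric first (the last line under Data below does this). Note that the condition as written leaves 0 and 49 themselves unchanged; use Urban >= 0 & Urban <= 49 if the endpoints should also become 2:
df$Urban = with(df, ifelse(Urban > 0 & Urban < 49, 2, Urban))
Result: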
> df
Country Urban
1 Afghanistan 0
2 Albania 2
3 Algeria 50
4 Andorra 50
5 Angola 60
6 Antigua and Barbuda 2
7 Argentina 60
8 Armenia 90
9 Australia 50
10 Austria 50
11 Azerbaijan 60
12 Bahrain 60
13 Bangladesh 0
14 Barbados 80
15 Belarus 60
16 Belgium 50
17 Belize 2
18 Benin NA
19 Bhutan 2
20 Bolivia (Plurinational State of) 2
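As for why the gsub attempt in the question does not do what was intended: gsub treats its pattern as a single regular expression, and, as far as I know, when it is given a vector such as 0:49 it uses only the first element (with a warning). The call therefore behaves like replacing every "0" character with "2". The sketch below shows that effect on a small made-up vector (x is a hypothetical example, not the original data), passing "0" directly as the pattern:

```r
# Hypothetical toy input, not the original data
x <- c("0", "40", "50", "100")

# With pattern = 0:49, gsub() would effectively use only "0",
# so every "0" character is replaced, not every number below 50
out <- gsub(pattern = "0", replacement = 2, x)

print(out)
# "2" "42" "52" "122"
```

Because the replacement works on characters rather than on numeric values, a numeric comparison such as the one in ifelse is the more suitable tool here.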
Data:
df = structure(list(Country = structure(1:20, .Label = c("Afghanistan",
"Albania", "Algeria", "Andorra", "Angola", "Antigua and Barbuda",
"Argentina", "Armenia", "Australia", "Austria", "Azerbaijan",
"Bahrain", "Bangladesh", "Barbados", "Belarus", "Belgium", "Belize",
"Benin", "Bhutan", "Bolivia (Plurinational State of)"), class = "factor"),
Urban = structure(c(2L, 7L, 12L, 12L, 14L, 5L, 14L, 19L,
12L, 12L, 14L, 14L, 2L, 18L, 14L, 12L, 8L, 1L, 4L, 7L), .Label = c("",
"0", "100", "30", "32", "35", "40", "40 ", "45", "48", "48 ",
"50", "56 ", "60", "64 ", "65", "70", "80", "90"), class = "factor")), .Names = c("Country",
"Urban"), row.names = c(NA, 20L), class = "data.frame")
df$Urban = as.numeric(as.character(df$Urban))
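The as.character step in the last line matters because calling as.numeric directly on a factor returns the internal level codes rather than the labels. Below is a minimal sketch on a made-up factor (urban, values and recoded are hypothetical names, not part of the original data), which also shows an inclusive variant of the recode that keeps empty entries as NA:

```r
# Hypothetical toy factor with a blank entry and a label with a trailing space
urban <- factor(c("0", "40", "50", "40 ", "", "90"))

# as.numeric() on the factor itself gives level codes, not the labels
codes <- as.numeric(urban)

# Going through as.character() recovers the labels; "" becomes NA
values <- as.numeric(as.character(urban))

# Inclusive recode: 0 through 49 become 2, everything else (including NA) is kept
recoded <- ifelse(!is.na(values) & values >= 0 & values <= 49, 2, values)

print(recoded)
# 2 2 50 2 NA 90
```

Guarding the condition with !is.na(values) is a design choice for this sketch; without it the NA entries still come out as NA, since ifelse propagates a missing test value.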