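I have been searching for a solution and trying various approaches to get the result I want, but to no avail! I would really appreciate some help.
I have several tables containing data on different countries. I need to merge these tables by country, but the same country is often referred to differently in each table, so I need to standardize the names first.
Example table 1: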
birth_country mean_age
China 37
Germany 42
Example table 2:
birth_country proportion_male
Federal Republic of Germany 54
China, People's Republic of 43
So I want to do something like this (which works when I apply it to a single table as follows):
table1$birth_country[table1$birth_country == "China"] <- "China, People\'s Republic of"
table1$birth_country[table1$birth_country == "Federal Republic of Germany"] <- "Germany"
But no matter what I try, I can't seem to apply this process to all of my tables. I have tried lapply and for loops, in at least ten variations of the following:
standardizeCountryNames<-function(x){
x[x == "China"] <- "China, People\'s Republic of"
x[x == "Federal Republic of Germany"] <- "Germany"
}
tables<-list(table1, table2, table3)
lapply(tables, function(i) {standardizeCountryNames(i$birth_country)})
and
for (k in 1:length(tables)){
tables[[k]]$birth_country[tables[[k]]$birth_country == "China"] <- "China, People\'s Republic of" }
I have also tried referring to the birth_country variable in different ways, for example using with(table) and attach(table).
Any help would be much appreciated! (:
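Answer 0 (score: 1)
You were almost there. The function needs to modify the birth_country column inside the data frame it receives and then return the whole data frame as its last expression: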
table1 <- read.table(
text = "birth_country mean_age
China 37
Germany 42",
header = TRUE, stringsAsFactors = FALSE)
table2 <- read.table(
text = 'birth_country proportion_male
"Federal Republic of Germany" 54
"China, People\'s Republic of" 43',
header = TRUE, stringsAsFactors = FALSE)
# Takes a whole data frame, not just the column
standardizeCountryNames <- function(x) {
  x$birth_country[x$birth_country == "China"] <- "China, People\'s Republic of"
  x$birth_country[x$birth_country == "Federal Republic of Germany"] <- "Germany"
  x  # return the modified data frame
}
tables<-list(table1, table2)
lapply(tables, function(i) {standardizeCountryNames(i)})
# [[1]]
# birth_country mean_age
# 1 China, People's Republic of 37
# 2 Germany 42
#
# [[2]]
# birth_country proportion_male
# 1 Germany 54
# 2 China, People's Republic of 43
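Note that lapply returns a new list and leaves the original tables untouched, so the result has to be assigned to something before it can be merged. Below is a minimal sketch of that last step, assuming the two example tables from the question. The names country_lookup and standardize_with_lookup are hypothetical and only for illustration; the named-vector lookup is one possible way to avoid writing one line per country, not the only one.

```r
# Self-contained sketch; the data mirrors the example tables in the question.
table1 <- data.frame(
  birth_country = c("China", "Germany"),
  mean_age = c(37, 42),
  stringsAsFactors = FALSE
)
table2 <- data.frame(
  birth_country = c("Federal Republic of Germany", "China, People's Republic of"),
  proportion_male = c(54, 43),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: names are the variants, values are the standard names.
country_lookup <- c(
  "China" = "China, People's Republic of",
  "Federal Republic of Germany" = "Germany"
)

# Replace every variant found in the lookup and return the whole data frame.
standardize_with_lookup <- function(x, lookup = country_lookup) {
  hit <- x$birth_country %in% names(lookup)
  x$birth_country[hit] <- unname(lookup[x$birth_country[hit]])
  x
}

tables <- list(table1, table2)

# Assign the result; lapply does not modify tables in place.
tables <- lapply(tables, standardize_with_lookup)

# Merge all standardized tables pairwise on the country column.
merged <- Reduce(function(a, b) merge(a, b, by = "birth_country"), tables)
merged
```

With the standardized names in place, merged should contain one row per country with both mean_age and proportion_male. By default merge keeps only countries present in every table; passing all = TRUE to merge would keep unmatched rows as well.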