I have a data frame of the world's countries:
structure(list(long = c(290.100891113281, 290.104309082031, 290.057800292969,
289.995849609375, 289.933868408203, 289.949127197266, 289.964874267578,
290.02685546875, 290.088195800781, 290.100891113281, 74.8913116455078,
74.8402328491211, 74.7673797607422, 74.7389602661133, 74.7266616821289,
74.6689453125, 74.5589904785156, 74.3721694946289, 74.3761672973633,
74.4979553222656), lat = c(12.4520015716553, 12.4229984283447,
12.4385251998901, 12.50048828125, 12.5469722747803, 12.5970697402954,
12.6141109466553, 12.567626953125, 12.48046875, 12.4520015716553,
37.2316398620605, 37.2250518798828, 37.2491722106934, 37.28564453125,
37.2907218933105, 37.2667007446289, 37.2366218566895, 37.15771484375,
37.1373519897461, 37.0572242736816), group = c(1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), order = c(1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 19L, 20L, 21L), region = c("Aruba", "Aruba", "Aruba", "Aruba",
"Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "Aruba", "Afghanistan",
"Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan",
"Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan"),
subregion = c(NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_, NA_character_, NA_character_,
NA_character_, NA_character_, NA_character_, NA_character_,
NA_character_), population = c(71891L, 71891L, 71891L, 71891L,
71891L, 71891L, 71891L, 71891L, 71891L, 71891L, 31056997L,
31056997L, 31056997L, 31056997L, 31056997L, 31056997L, 31056997L,
31056997L, 31056997L, 31056997L)), .Names = c("long", "lat",
"group", "order", "region", "subregion", "population"), row.names = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 12L, 13L, 14L, 15L, 16L,
17L, 18L, 19L, 20L, 21L), class = "data.frame")
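In the rest of the data frame, some countries have NA population values. I would like a way to fill in population values for the countries where they are NA.
For example, I want to set an integer population value for the USA: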
world %>%
filter(region == "USA") %>%
mutate(population = 298444215)
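This produces a separate data frame containing only the USA rows. Ideally, however, I want to change the USA population value in place, within the whole data frame shown above.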
Answer 0 (score: 0)
Since you are already using dplyr, here is a simple solution:
# `world` is the data frame from the question
world <- world %>%
  mutate(population = case_when(region == "USA" ~ 298444215,
                                # all other rows keep their current value
                                TRUE ~ as.double(population)))
You can put every country whose population you want to assign by hand into a single call to case_when.
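As a rough sketch of what several countries in one case_when call might look like: `world_toy` below is a made-up stand-in for the real `world` data frame, with one row per region, and the "Canada" row is only there to show that unmatched rows are left alone. The population figures are the ones that appear in the question.

```r
library(dplyr)

# Hypothetical stand-in for `world`, with some populations missing
world_toy <- data.frame(
  region = c("Aruba", "Afghanistan", "USA", "Canada"),
  population = c(71891L, NA, NA, NA),
  stringsAsFactors = FALSE
)

world_toy <- world_toy %>%
  mutate(population = case_when(
    region == "USA" ~ 298444215,
    region == "Afghanistan" ~ 31056997,
    TRUE ~ as.double(population)  # every other row keeps its current value
  ))
```

Every right-hand side is a double here, which is why the fallback branch converts the original integer column with as.double.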
Answer 1 (score: 0)
Without using any packages, we can try:
## find the rows where region is USA and population is NA
index_usa <- which(world$region == "USA" & is.na(world$population))
## fill in the population column at those rows
world$population[index_usa] <- 298444215
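If the same fill has to be repeated for several countries, the two steps above could be wrapped in a small helper. This is only a sketch: `fill_population` and `world_toy` are names invented for this example, not part of the question or of any package.

```r
# Hypothetical stand-in for `world`
world_toy <- data.frame(
  region = c("USA", "USA", "Aruba"),
  population = c(NA, NA, 71891),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fill `value` into rows of `region_name`
# whose population is currently NA; existing values are not touched
fill_population <- function(data, region_name, value) {
  idx <- which(data$region == region_name & is.na(data$population))
  data$population[idx] <- value
  data
}

world_toy <- fill_population(world_toy, "USA", 298444215)
# Aruba already has a value, so this call changes nothing
world_toy <- fill_population(world_toy, "Aruba", 0)
```

Because of the is.na() condition, the helper only fills gaps and never overwrites a population that is already present.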
Answer 2 (score: 0)
Simple indexing does this in one line of base R:
world[, 'population'][world[, 'region'] == 'USA'] <- 298444215
Roughly translated into English, this reads:
world[, 'population']
in the population column,
[world[, 'region'] == 'USA']
at the rows where the region column is USA,
<- 298444215
insert this value.
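To make the three pieces easier to see, the one-liner can be split so that the row selection is stored first. This is a minimal sketch on a made-up data frame; `world_toy` and `is_usa` are names chosen for this example.

```r
# Hypothetical stand-in for `world`
world_toy <- data.frame(
  region = c("Aruba", "USA", "USA"),
  population = c(71891, NA, NA),
  stringsAsFactors = FALSE
)

# Logical vector: TRUE for rows whose region is USA
is_usa <- world_toy[, "region"] == "USA"

# Assign the value only at the TRUE positions of the population column
world_toy[, "population"][is_usa] <- 298444215
```

Unlike the previous answer, this version does not check is.na(), so it overwrites whatever population the USA rows already hold.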