Suppose we have the following data frame:
ID <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6)
age <- c(25, 25, 25, 22, 22, 56, 56, 56, 80, 33, 33, 90, 90, 90, 5, 5, 5)
gender <- c("m", "m", NA, "f", "f", "m", NA, "m", "m", "m", NA, NA, NA, "m", NA, NA, NA)
company <- c("c1", "c2", "c2", "c3", "c3", "c1", "c1", "c1", "c1", "c5", "c5", "c3", "c4", "c5", "c3", "c1", "c1")
income <- c(1000, 1000, 1000, 500, 1700, 200, 200, 250, 500, 700, 700, 300, 350, 300, 500, 1700, 200)
df <- data.frame(ID, age, gender, company, income)
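This data contains 6 unique IDs. If you look at the gender variable, it is sometimes NA. I want to replace those NAs with the correct gender for that ID. If every gender value for an ID is NA, it should be left as is.
In the expected result, gender is filled in for IDs 1 to 5 and stays NA for ID 6; the full table appears further down.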
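Answer 0 (score: 1)
With the tidyverse library, you can do this:
library(tidyverse)
# Build a lookup table with one row per ID that has a known gender
df_gender_ref <- df %>% filter(!is.na(gender)) %>% select(ID, gender) %>% distinct()
# Drop the incomplete gender column and join the lookup back in by ID;
# IDs without any known gender (ID 6) get NA from the left join
df %>% select(-gender) %>% left_join(df_gender_ref, by = "ID")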
Answer 1 (score: 1)
Using ave from base R:
df$gender <- with(df, ave(gender, ID, FUN = function(x) na.omit(x)[1]))
This gives the expected result:
   ID age gender company income
1   1  25      m      c1   1000
2   1  25      m      c2   1000
3   1  25      m      c2   1000
4   2  22      f      c3    500
5   2  22      f      c3   1700
6   3  56      m      c1    200
7   3  56      m      c1    200
8   3  56      m      c1    250
9   3  80      m      c1    500
10  4  33      m      c5    700
11  4  33      m      c5    700
12  5  90      m      c3    300
13  5  90      m      c4    350
14  5  90      m      c5    300
15  6   5   <NA>      c3    500
16  6   5   <NA>      c1   1700
17  6   5   <NA>      c1    200
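A small base R sketch of why ID 6 stays NA with the ave() call above: na.omit() on an all-NA vector returns an empty vector, and taking element [1] of an empty vector yields NA. The helper name first_non_na and the two test vectors are introduced here only for illustration.

```r
# Hypothetical helper wrapping the expression used in the ave() call
first_non_na <- function(x) na.omit(x)[1]

all_missing  <- c(NA_character_, NA_character_, NA_character_)  # like ID 6
some_missing <- c(NA, NA, "m")                                  # like ID 5

first_non_na(all_missing)   # NA, so the group is left unchanged
first_non_na(some_missing)  # "m", which ave() then repeats for the whole group
```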
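Some approaches using dplyr and tidyr:
library(dplyr)
library(tidyr)
# Option 1: within each ID, take the first non-NA gender (NA if there is none)
df %>%
  group_by(ID) %>%
  mutate(gender = na.omit(gender)[1])
# Option 2: within each ID, fill gender upward and then downward
df %>%
  group_by(ID) %>%
  fill(gender, .direction = "up") %>%
  fill(gender, .direction = "down")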
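The approaches above all assume that each ID has at most one distinct non-NA gender. That holds for the example data, but on real data an ID with two different values would produce duplicated rows in the join approach, and one value would silently win in the others. A quick check, sketched here in base R with the hypothetical variable names n_gender and conflicting_ids, may be worth running first.

```r
# Example data from the question (only the columns needed for the check)
ID <- c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6, 6)
gender <- c("m", "m", NA, "f", "f", "m", NA, "m", "m", "m", NA, NA, NA, "m", NA, NA, NA)
df <- data.frame(ID, gender)

# Count the distinct non-NA gender values recorded for each ID
n_gender <- tapply(df$gender, df$ID, function(x) length(unique(na.omit(x))))

# IDs with more than one distinct gender would need manual review
conflicting_ids <- names(n_gender)[n_gender > 1]
length(conflicting_ids)  # 0 for the example data
```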