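I have a data set containing all sales for 2018 and am trying to run a Pareto analysis on it. Every row should have a product category; most do, but about one fifth do not. I now want to fill these NA values with the product category from another data frame, but I have not managed to.
A simplified example follows: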
df1 <- data.frame(ID = c("1000", "1000", "1000", "1000", "1010", "1020", "1030", "1030", "1000"),
name = c("A", "B", "C", "D", "A", "A", "B", "F", "G"),
group_ID = c(NA, NA, NA, NA, NA, NA, NA, NA, NA), stringsAsFactors = FALSE)
df2 <- data.frame(IDx = c("1000", "1000", "1000", "1000", "1000", "1000", "1000", "1000", "1000"),
group_ID = c("blankets", "blankets", "blankets", "blankets", "blankets", "blankets", "blankets", "blankets", "blankets"),
stringsAsFactors = FALSE)
library(dplyr)
df1[is.na(df1)] <- "None"
df1 %>%
left_join(df2, by = c("ID" = "IDx")) %>%
mutate(group_ID = coalesce(group_ID.y, group_ID.x)) %>%
select(-group_ID.x, -group_ID.y)
This code produces the following data frame:
ID name group_ID
1 1000 A blankets
2 1000 A blankets
3 1000 A blankets
4 1000 A blankets
5 1000 A blankets
6 1000 A blankets
7 1000 A blankets
8 1000 A blankets
9 1000 A blankets
10 1000 B blankets
11 1000 B blankets
12 1000 B blankets
13 1000 B blankets
14 1000 B blankets
15 1000 B blankets
16 1000 B blankets
17 1000 B blankets
18 1000 B blankets
19 1000 C blankets
20 1000 C blankets
21 1000 C blankets
22 1000 C blankets
23 1000 C blankets
24 1000 C blankets
25 1000 C blankets
26 1000 C blankets
27 1000 C blankets
28 1000 D blankets
29 1000 D blankets
30 1000 D blankets
31 1000 D blankets
32 1000 D blankets
33 1000 D blankets
34 1000 D blankets
35 1000 D blankets
36 1000 D blankets
37 1010 A None
38 1020 A None
39 1030 B None
40 1030 F None
41 1000 G blankets
42 1000 G blankets
43 1000 G blankets
44 1000 G blankets
45 1000 G blankets
46 1000 G blankets
47 1000 G blankets
48 1000 G blankets
49 1000 G blankets
That is not what I want. I want something like this:
ID name group_ID
1 1000 A blankets
2 1000 B blankets
3 1000 C blankets
4 1000 D blankets
5 1010 A None
6 1020 A None
7 1030 B None
8 1030 F None
9 1000 G blankets
I have tried several joins and searched around on the Internet, but I could not solve the problem.
I hope you can help!
Answer 0 (score: 0)
I think unique(df1) might help.
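A minimal base R sketch of the de-duplication idea, offered as one possible reading of this suggestion rather than the answerer's exact method. In the sample data the repeated rows sit in df2, not df1, so unique() is applied to df2 here; the `lookup` name is only illustrative, and the final step that writes "None" is an assumption taken from the desired output in the question.

```r
# Sample data from the question (rep() is used only to shorten the literal vectors)
df1 <- data.frame(ID = c("1000", "1000", "1000", "1000", "1010", "1020", "1030", "1030", "1000"),
                  name = c("A", "B", "C", "D", "A", "A", "B", "F", "G"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(IDx = rep("1000", 9),
                  group_ID = rep("blankets", 9),
                  stringsAsFactors = FALSE)

# Collapse df2 to one row per distinct (IDx, group_ID) pair
lookup <- unique(df2)

# match() returns the position of each df1$ID in lookup$IDx, or NA when there is no match
df1$group_ID <- lookup$group_ID[match(df1$ID, lookup$IDx)]

# IDs without a category get the placeholder used in the question
df1$group_ID[is.na(df1$group_ID)] <- "None"

print(df1)
```

Because the lookup is done by position rather than by a join, df1 keeps its nine rows in their original order.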
Answer 1 (score: 0)
A data.table solution
Sample data
df1 <- data.frame(ID = c("1000", "1000", "1000", "1000", "1010", "1020", "1030", "1030", "1000"),
name = c("A", "B", "C", "D", "A", "A", "B", "F", "G"), stringsAsFactors = FALSE)
I left out the group_ID column; you will create that column with the join.
df2 <- data.frame(IDx = c("1000", "1000", "1000", "1000", "1000", "1000", "1000", "1000", "1000"),
group_ID = c("blankets", "blankets", "blankets", "blankets", "blankets", "blankets", "blankets", "blankets", "blankets"),
stringsAsFactors = FALSE)
Code
library(data.table)
setDT(df1)[setDT(df2), group_ID := i.group_ID, on = .(ID = IDx)][]
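I used setDT() to convert the data.frames df1 and df2 into data.tables. The rest is a "simple" left join by reference: it adds group_ID to df1 in place, so the number of rows in df1 does not change even though df2 contains repeated IDs.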
Output
# ID name group_ID
# 1: 1000 A blankets
# 2: 1000 B blankets
# 3: 1000 C blankets
# 4: 1000 D blankets
# 5: 1010 A <NA>
# 6: 1020 A <NA>
# 7: 1030 B <NA>
# 8: 1030 F <NA>
# 9: 1000 G blankets
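The output above leaves unmatched IDs as <NA>, while the question asks for "None". A short sketch of one way to finish the job, assuming the data.table package is installed; `dt1` and `dt2` are illustrative names for the same sample data, built directly as data.tables.

```r
library(data.table)

# Same sample data as above (rep() is used only to shorten the literal vectors)
dt1 <- data.table(ID = c("1000", "1000", "1000", "1000", "1010", "1020", "1030", "1030", "1000"),
                  name = c("A", "B", "C", "D", "A", "A", "B", "F", "G"))
dt2 <- data.table(IDx = rep("1000", 9),
                  group_ID = rep("blankets", 9))

# Update join: add group_ID to dt1 by reference; dt1 keeps its 9 rows
dt1[dt2, group_ID := i.group_ID, on = .(ID = IDx)]

# Replace the remaining NA values with the placeholder from the question
dt1[is.na(group_ID), group_ID := "None"]

print(dt1)
```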
Answer 2 (score: 0)
You can use distinct() to remove the duplicate rows.
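The answer's original code did not survive, so the following is only a hedged sketch of how distinct() could fit into the pipeline from the question, assuming dplyr is installed. Applying distinct() to df2 before the join, using NA_character_ for the empty column, and passing "None" as the last coalesce() fallback are choices made for this illustration; `result` is an illustrative name.

```r
library(dplyr)

# Sample data from the question; group_ID starts as a character NA so coalesce() sees one type
df1 <- data.frame(ID = c("1000", "1000", "1000", "1000", "1010", "1020", "1030", "1030", "1000"),
                  name = c("A", "B", "C", "D", "A", "A", "B", "F", "G"),
                  group_ID = NA_character_,
                  stringsAsFactors = FALSE)
df2 <- data.frame(IDx = rep("1000", 9),
                  group_ID = rep("blankets", 9),
                  stringsAsFactors = FALSE)

result <- df1 %>%
  # distinct() reduces df2 to one row per ID, so each df1 row matches at most once
  left_join(distinct(df2), by = c("ID" = "IDx")) %>%
  # Prefer the looked-up category, then the existing one, then the placeholder
  mutate(group_ID = coalesce(group_ID.y, group_ID.x, "None")) %>%
  select(-group_ID.x, -group_ID.y)

print(result)
```

The row explosion in the question comes from each df1 row with ID "1000" matching all nine identical rows of df2; once df2 is reduced to its distinct rows, the join returns one row per df1 row.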