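I have this data frame. I want to summarise the data so that one column shows the total number of launches and the next column shows the number of failed launches.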
state_name launch_year category
1 United States 1958 Success
2 United States 1958 Success
3 United States 1958 Success
4 United States 1958 Failure
5 United States 1958 Failure
6 United States 1958 Failure
7 Soviet Union 1957 Success
8 Soviet Union 1957 Success
9 Soviet Union 1958 Success
10 Soviet Union 1959 Success
11 Soviet Union 1959 Success
12 Soviet Union 1959 Success
13 Soviet Union 1958 Failure
14 Soviet Union 1958 Failure
15 Soviet Union 1958 Failure
16 Soviet Union 1958 Failure
17 Soviet Union 1959 Failure
18 United States 1959 Success
19 United States 1959 Failure
20 United States 1958 Success
21 United States 1959 Success
22 United States 1959 Failure
23 United States 1958 Success
24 United States 1958 Success
25 United States 1959 Success
26 United States 1959 Success
27 United States 1959 Success
28 United States 1959 Success
29 United States 1959 Success
30 United States 1959 Success
31 United States 1959 Success
32 United States 1958 Failure
33 United States 1958 Failure
34 United States 1959 Failure
35 United States 1959 Failure
36 United States 1959 Failure
37 United States 1958 Success
38 United States 1959 Success
39 United States 1959 Success
40 United States 1957 Failure
41 United States 1958 Failure
42 United States 1958 Failure
43 United States 1958 Failure
44 United States 1958 Failure
45 United States 1958 Failure
46 United States 1958 Failure
47 United States 1958 Failure
48 United States 1958 Failure
49 United States 1958 Failure
50 United States 1958 Failure
51 United States 1959 Failure
52 United States 1959 Failure
Each row represents one launch. category is the outcome of that launch.
I want to turn it into this.
state_name launch_year launches failed_launches
1 United States 1957 1 1
2 Soviet Union 1957 2 0
3 United States 1958 22 15
4 Soviet Union 1958 5 4
5 United States 1959 4 3
6 Soviet Union 1959 18 1
I tried filtering for failed launches only and then adding a failed_launches
column, but I don't know how to get the rest of the data back from there.
launches %>%
  filter(category == "Failure") %>%
  count(state_name, launch_year) %>%
  mutate(failed_launches = n)
Answer 0 (score: 4)
You can do:
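df %>%
  group_by(state_name, launch_year) %>%
  summarise(
    # n() counts every row in the group, i.e. all launches
    launches = n(),
    # the comparison yields TRUE/FALSE; sum() counts the TRUEs
    failed_launches = sum(category == "Failure")
  )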
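
The reason this works is that `category == "Failure"` produces a logical vector, and `sum()` treats TRUE as 1, so both counts come out of a single grouped pass without filtering any rows away. As a rough sketch, the code below runs that approach on a small made-up sample (not the question's real data) and also shows one way the filter-then-count attempt from the question could be completed by joining the failure counts back. The object names `summary_one_pass`, `failed`, and `summary_joined` are illustrative only, and `.groups = "drop"` assumes dplyr 1.0 or later.

```r
library(dplyr)

# Small made-up sample in the same shape as the question's data
launches <- tibble(
  state_name  = c("United States", "United States", "Soviet Union",
                  "Soviet Union", "Soviet Union"),
  launch_year = c(1958, 1958, 1957, 1957, 1958),
  category    = c("Success", "Failure", "Success", "Success", "Failure")
)

# One pass: count all rows and count the rows where the comparison is TRUE
summary_one_pass <- launches %>%
  group_by(state_name, launch_year) %>%
  summarise(
    launches = n(),
    failed_launches = sum(category == "Failure"),
    .groups = "drop"
  )

# Filter-then-count route: count failures separately...
failed <- launches %>%
  filter(category == "Failure") %>%
  count(state_name, launch_year, name = "failed_launches")

# ...then join them back onto the totals; groups with no failures
# come back as NA after the join, so replace those with 0
summary_joined <- launches %>%
  count(state_name, launch_year, name = "launches") %>%
  left_join(failed, by = c("state_name", "launch_year")) %>%
  mutate(failed_launches = coalesce(failed_launches, 0L))

print(summary_one_pass)
```

Both routes should give the same table here. The single `summarise()` is shorter and avoids the NA handling that the join needs for groups without any failures.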