Question

 df <- read.csv('https://raw.githubusercontent.com/ulklc/covid19-timeseries/master/countryReport/raw/rawReport.csv',
                stringsAsFactors = FALSE)

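I have processed the dataset.

Can we find the day with the fewest deaths in the Asia region?

The important point is that the figure must be the total number of deaths across all countries in the Asia region. So the task is to group by date, sum the deaths for each date, and then find the date with the smallest total.

The output should look like this: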

date region death

2020/02/17 asia 6300 (asia region sum)

The values in this output are only an example that I made up. They are not real data.

Answer 1

You can do this with the dplyr package:

df <- read.csv('https://raw.githubusercontent.com/ulklc/covid19-timeseries/master/countryReport/raw/rawReport.csv',
               stringsAsFactors = FALSE)
library(dplyr)

df_sum <- df %>% group_by(region,day) %>% # grouping by region and day
  summarise(death=sum(death)) %>% # summing following the groups
  filter(region=="Asia",death==min(death)) # keeping only minimum of Asia

This gives you:

> df_sum
# A tibble: 1 x 3
# Groups:   region [1]
  region day        death
  <fct>  <fct>      <int>
1 Asia   2020/01/22    17
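
If dplyr is not available, the same idea (keep the Asia rows, sum deaths per day, keep the day with the smallest total) can be sketched in base R with aggregate(). This is only an illustration: the toy data frame and its values are hypothetical, and only the column names day, region and death are taken from the code above.

```r
# Hypothetical toy data; only the column names (day, region, death)
# mirror the ones used in the dplyr code above.
toy <- data.frame(
  day     = c("2020/01/22", "2020/01/22", "2020/01/22", "2020/01/23", "2020/01/23"),
  country = c("A", "B", "C", "A", "B"),
  region  = c("Asia", "Asia", "Europe", "Asia", "Asia"),
  death   = c(3L, 4L, 0L, 5L, 6L),
  stringsAsFactors = FALSE
)

# Keep only the Asia rows before summing, so other regions cannot
# influence the minimum.
asia <- toy[toy$region == "Asia", ]

# One row per (region, day) with the summed deaths of all countries.
daily <- aggregate(death ~ region + day, data = asia, FUN = sum)

# Keep the day(s) with the smallest regional total; ties are all kept.
asia_min <- daily[daily$death == min(daily$death), ]
print(asia_min)
```

With this toy input, the Asia totals are 7 for 2020/01/22 and 11 for 2020/01/23, so asia_min holds the single row for 2020/01/22. Filtering to Asia first also avoids summarising regions that are discarded afterwards.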
