Replace NA values based on count

Time: 2020-10-08 10:37:48

Tags: r dplyr

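Sometimes I have NA data in a field, in this case Easting (in the real data, both Easting and Northing will be missing). What I would like to do is look at my data frame and say: for those rows that have NA in Easting, create a variable (called "eh" here) that holds the names of the polling districts that are missing geographic coordinates, and use this variable to filter the data frame. Then group the data frame by polling district, Easting and Northing, count the unique rows, and replace the NA values with the most common Easting/Northing combination for that district.

As you can see, I have got as far as the group count stage, but how do I replace the NA Easting with the Easting value that has the maximum n?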

    # Load the packages
    library(easypackages)
    packages("tidyverse", "readxl", "sf", "tmaptools", "tmap", "lubridate",
             "lwgeom", "Cairo", "nngeo", "purrr", "scales", "ggthemes", "janitor")

    # Get some data and sample 147 random dates, one per row
    polls <- read.csv(url("https://www.caerphilly.gov.uk/CaerphillyDocs/FOI/Datasets_polling_stations_csv.aspx")) %>%
      mutate(date = sample(seq(as.Date("2020/01/01"), as.Date("2020/05/31"), by = "day"), 147))

    # Inspect the structure
    str(polls)

    # Which polling district name has a high count? (largest n first)
    polls %>% count(Polling.District.Name, Easting) %>% arrange(desc(n))

    # Add some NAs: replace the Easting for that district with NA
    polls$Easting <- na_if(polls$Easting, 314057)
    polls$Easting

    # Names of the polling districts that have a missing Easting
    eh <- polls %>% filter(is.na(Easting)) %>% pull(Polling.District.Name)

    # Count each Easting/Northing combination within those districts
    polls %>%
      filter(Polling.District.Name %in% eh) %>%
      count(Polling.District.Name, Easting, Northing)
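
One possible way to take the group count through to the replacement step is sketched below. It is not an accepted answer, only a minimal illustration on made-up data: `polls_toy`, `modal_coords`, `polls_filled` and the `_mode` column names are hypothetical, and the coordinate values are invented. It assumes dplyr 1.0.0 or later (for `slice_max()`). The idea is to find the most common non-missing Easting/Northing pair per polling district, join it back on, and use `coalesce()` so that only missing values are overwritten.

```r
library(dplyr)

# Hypothetical toy data standing in for `polls`; the values are made up.
polls_toy <- tibble(
  Polling.District.Name = c("A", "A", "A", "A", "B", "B", "B"),
  Easting  = c(100, 100, 200, NA, 300, NA, 300),
  Northing = c(10, 10, 20, NA, 30, NA, 30)
)

# Most common non-missing Easting/Northing pair in each polling district.
modal_coords <- polls_toy %>%
  filter(!is.na(Easting), !is.na(Northing)) %>%
  count(Polling.District.Name, Easting, Northing) %>%
  group_by(Polling.District.Name) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%  # keep one row even if counts tie
  ungroup() %>%
  select(Polling.District.Name,
         Easting_mode = Easting,
         Northing_mode = Northing)

# Join the modal pair back on and fill only the missing coordinates.
polls_filled <- polls_toy %>%
  left_join(modal_coords, by = "Polling.District.Name") %>%
  mutate(
    Easting  = coalesce(Easting, Easting_mode),
    Northing = coalesce(Northing, Northing_mode)
  ) %>%
  select(-Easting_mode, -Northing_mode)

polls_filled
```

With `with_ties = FALSE`, a district whose top combinations are tied gets an arbitrary one of them, which may or may not be what is wanted. A district with no non-missing coordinates at all has no modal pair, so its NA values stay NA after the join.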

  

0 answers:

No answers