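I have a list of data frames, and I want to create a new variable, "County", based on the values of the "State" and "Zip" columns. I know this is a case that calls for lapply(df, transform).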
State Zip
OH 44141
OH 44056
OH 44131
NY 13035
NY 13035
NY 13056
The following works on a single data frame, but I'm not sure how it translates to applying it over a list:
df$County[df$State == "OH" & df$Zip >= 44056 & df$Zip <= 44356]<- "Summit"
df$County[df$State == "NY" & df$Zip >= 1300 & df$Zip <= 13035]<- "Madison"
df$County[df$State == "NY" & df$Zip < 1300 | df$Zip > 13036] <- "Miscoded"
Answer 0 (score: 2)
Suppose you have the following list.
df_list <- structure(list(NY = structure(list(State = structure(c(1L, 1L,
1L), .Label = c("NY", "OH"), class = "factor"), Zip = c(13035L,
13035L, 13056L)), .Names = c("State", "Zip"), row.names = 4:6, class = "data.frame"),
OH = structure(list(State = structure(c(2L, 2L, 2L), .Label = c("NY",
"OH"), class = "factor"), Zip = c(44141L, 44056L, 44131L)), .Names = c("State",
"Zip"), row.names = c(NA, 3L), class = "data.frame")), .Names = c("NY",
"OH"))
Using dplyr::mutate and purrr::map, you can then add the County column to every data frame in the list.
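A minimal sketch of that approach, assuming a list shaped like df_list above and the Zip ranges from the question. This is an illustration, not necessarily the answerer's exact code: the helper name add_county is hypothetical, and labelling every row that matches neither range as "Miscoded" is an assumption.

```r
library(dplyr)
library(purrr)

# Same shape as df_list above, rebuilt here so the sketch runs on its own
df_list <- list(
  NY = data.frame(State = "NY", Zip = c(13035L, 13035L, 13056L)),
  OH = data.frame(State = "OH", Zip = c(44141L, 44056L, 44131L))
)

# Hypothetical helper: recode one data frame using the question's Zip ranges
add_county <- function(d) {
  mutate(d, County = case_when(
    State == "OH" & Zip >= 44056 & Zip <= 44356 ~ "Summit",
    State == "NY" & Zip >= 1300 & Zip <= 13035 ~ "Madison",
    TRUE ~ "Miscoded"  # assumption: anything outside both ranges is miscoded
  ))
}

# map() applies the helper to each element and keeps the list names
result <- map(df_list, add_county)
```

Because map() returns a list with the same names, result$NY and result$OH are the original data frames with the extra County column.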
Answer 1 (score: 1)
You seem to have a simple data.frame, so you can operate on it directly with transform; there is no need for lapply here.
For code readability, I'd recommend a tidyverse solution using case_when:
library(tidyverse)
df %>%
    mutate(County = case_when(
        State == "OH" & (Zip >= 44056 & Zip <= 44356) ~ "Summit",
        State == "NY" & (Zip >= 1300 & Zip <= 13035) ~ "Madison",
        State == "NY" & (Zip < 1300 | Zip > 13036) ~ "Miscoded",
        TRUE ~ "Undefined"))
#  State   Zip   County
#1    OH 44141   Summit
#2    OH 44056   Summit
#3    OH 44131   Summit
#4    NY 13035  Madison
#5    NY 13035  Madison
#6    NY 13056 Miscoded
In base R you could do transform(df, County = ifelse(...)) with nested ifelse conditions, which is not very tidy (in my opinion).
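For comparison, a sketch of what the nested ifelse version might look like, assuming the same rules as the case_when call and the sample data from the question (rebuilt inline so the snippet runs on its own):

```r
# Sample data from the question
df <- data.frame(
  State = c("OH", "OH", "OH", "NY", "NY", "NY"),
  Zip   = c(44141, 44056, 44131, 13035, 13035, 13056)
)

# Each ifelse() handles one rule; the innermost "no" branch is the fallback
df <- transform(df, County = ifelse(
  State == "OH" & Zip >= 44056 & Zip <= 44356, "Summit",
  ifelse(State == "NY" & Zip >= 1300 & Zip <= 13035, "Madison",
    ifelse(State == "NY" & (Zip < 1300 | Zip > 13036), "Miscoded",
      "Undefined"))))
```

Every extra rule adds another level of nesting, which is why the case_when form tends to be easier to read.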
Note that the "Miscoded" condition in your code is incorrect; you need a logical OR: Zip < 1300 | Zip > 13036.
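One related detail worth checking: in R, & is evaluated before |, so the OR needs parentheses if the State test is meant to cover both Zip comparisons. A small sketch with made-up vectors (state and zip are illustrative names, not from the post):

```r
state <- c("OH", "NY", "NY")
zip   <- c(44141, 13056, 13035)

# Without parentheses this reads as (state == "NY" & zip < 1300) | zip > 13036
no_parens <- state == "NY" & zip < 1300 | zip > 13036

# With parentheses the State test guards both Zip comparisons
with_parens <- state == "NY" & (zip < 1300 | zip > 13036)

no_parens    # TRUE TRUE FALSE: the OH row is flagged too
with_parens  # FALSE TRUE FALSE
```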
# Sample data; header = TRUE is needed so the first row is read as column names
df <- read.table(text =
"State Zip
OH 44141
OH 44056
OH 44131
NY 13035
NY 13035
NY 13056", header = TRUE)
Answer 2 (score: 0)
You can use base R:
Looking at your data, it seems that a Zip such as 44056 cannot occur for NY. Under that assumption, you can do:
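> # findInterval() uses left-closed intervals [a[i], a[i+1]), so each break is the first Zip of a range
> a=c(1300,13036,44056,44357)
> b=c("Miscoded","Madison","Miscoded","Summit","Miscoded")
> transform(df,county=b[findInterval(Zip,a)+1])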
State Zip county
1 OH 44141 Summit
2 OH 44056 Summit
3 OH 44131 Summit
4 NY 13035 Madison
5 NY 13035 Madison
6 NY 13056 Miscoded
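To see how findInterval turns a Zip into a label, here is a small sketch on boundary values. The names breaks and county_labels are hypothetical, and the breaks are chosen as the first Zip of each range in the question (an assumption about the intended boundaries):

```r
# Hypothetical names; each break is the first Zip of a range from the question
breaks <- c(1300, 13036, 44056, 44357)
county_labels <- c("Miscoded", "Madison", "Miscoded", "Summit", "Miscoded")

zips <- c(1299, 1300, 13035, 13036, 44055, 44056, 44356, 44357)

# findInterval() counts how many breaks are <= each value (0 to 4 here),
# so adding 1 turns that count into an index into county_labels
idx <- findInterval(zips, breaks) + 1
data.frame(Zip = zips, county = county_labels[idx])
```

With four breaks there are five possible intervals, so the label vector needs five entries; with only four, any Zip at or above the last break would come back as NA.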
If you do not want to rely on that assumption, you can do:
df1
State Zip
1 OH 44141
2 OH 44056
3 OH 44131
4 NY 13035
5 NY 13035
6 NY 13056
7 NY 44141
df1$county<-b[findInterval(df1$Zip,a)+1]
transform(df1,
county=ifelse(paste(State,county)%in%c("OH Summit","NY Madison"),county,"Miscoded"))
State Zip county
1 OH 44141 Summit
2 OH 44056 Summit
3 OH 44131 Summit
4 NY 13035 Madison
5 NY 13035 Madison
6 NY 13056 Miscoded
7 NY 44141 Miscoded
If you have a list of data frames, do the following:
m=function(df1){
df1$county<-b[findInterval(df1$Zip,a)+1]
transform(df1,
county=ifelse(paste(State,county)%in%c("OH Summit","NY Madison")
,county,"Miscoded"))
}
lapply(df_list,m)