How to apply if statements when the condition is taken from a different dataframe, but with a common variable

Time: 2019-03-19 14:59:54

Tags: r

I have a blank encounter history dataframe with all zeros. I want to fill it with the value '1' where there is an encounter in a specific year.

My data file (datafile) looks somewhat like this:

Date               Name
2007-04-28          a
2007-05-19          a
2007-05-21          b                
2008-04-28          a
2009-05-06          c  

And the 'empty' data-frame (encounter) that has to be recoded

Name  2007   2008   2009   2010
a      0      0      0      0
b      0      0      0      0
c      0      0      0      0
d      0      0      0      0
e      0      0      0      0

I tried using an if statement:

datafile$Date%>%if(datafile$Date==between(01-01-07&31-12-07)) {encounter$2007=="1"}

But got an error

Error in between(1 - 1 - 7 & 31 - 12 - 7) : 
  between has been x of type logical
In addition: Warning message:
In if (.) datafile$Date == between(1 - 1 - 7 & 31 - 12 - 7) else { :
  the condition has length > 1 and only the first element will be used

1 answer:

Answer 0 (score: 1)

There are many ways to do what you need. (The data is always at the bottom.)

library(dplyr)
datafile %>%
  transmute(Year = format(Date, "%Y"), Name) %>%
  xtabs(data = ., ~ Name + Year)
#     Year
# Name 2007 2008 2009
#    a    2    1    0
#    b    1    0    0
#    c    0    0    1

That produces an object of class "xtabs" "table", though, not a data frame. To get a frame, you can use:

library(tidyr)
encounters <- datafile %>%
  transmute(Year = format(Date, "%Y"), Name) %>%
  group_by(Year, Name) %>%
  tally() %>%
  tidyr::spread(Year, n) %>%
  mutate_at(vars(-Name), ~ replace(., is.na(.), 0))
encounters
# # A tibble: 3 x 4
#   Name  `2007` `2008` `2009`
#   <chr>  <dbl>  <dbl>  <dbl>
# 1 a          2      1      0
# 2 b          1      0      0
# 3 c          0      0      1
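Both results above hold counts and only cover the names and years that occur in datafile, whereas the question asks for a 0/1 flag for every name and year in encounter. A minimal base R sketch of one way to get that shape follows. It assumes the full sets of names (a to e) and years (2007 to 2010) are the ones shown in the question's table; all_names, all_years, and encounter_filled are illustrative names, not from the original post.

```r
# Same toy data as in the "Data" section of this answer.
datafile <- data.frame(
  Date = as.Date(c("2007-04-28", "2007-05-19", "2007-05-21",
                   "2008-04-28", "2009-05-06")),
  Name = c("a", "a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Assumption: the complete sets of names and years, taken from the question.
all_names <- c("a", "b", "c", "d", "e")
all_years <- as.character(2007:2010)

# Factors with explicit levels keep rows and columns that have no encounters.
counts <- table(
  Name = factor(datafile$Name, levels = all_names),
  Year = factor(format(datafile$Date, "%Y"), levels = all_years)
)

# Collapse counts to presence/absence (1 = at least one encounter).
presence <- (unclass(counts) > 0) + 0L

# Turn the matrix into a data frame shaped like `encounter`.
encounter_filled <- data.frame(Name = rownames(presence), presence,
                               check.names = FALSE, row.names = NULL,
                               stringsAsFactors = FALSE)
encounter_filled
```

The explicit factor levels are what keep d, e, and 2010 in the result even though they never appear in datafile.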

There are a few problems with your code.

I think you meant to pass the Date column to between, so something like this may be closer to what you were trying to do:

datafile$Date %>%
  between(as.Date("2007-01-01"), as.Date("2007-12-31"))
# [1]  TRUE  TRUE  TRUE FALSE FALSE

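This gives one TRUE/FALSE per row of datafile. It does not by itself let you assign new values back into a frame, but it at least shows how between is meant to be called.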
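One way to put such a logical vector to use for a single year is to translate it into names first, and only then into rows of the other frame. The sketch below is a guess at what the question is after, not part of the original answer. The encounter frame is reconstructed from the question's table, and the date test is written with base R comparisons, which on this data give the same vector as the between call above.

```r
# Same toy data as in the "Data" section of this answer.
datafile <- data.frame(
  Date = as.Date(c("2007-04-28", "2007-05-19", "2007-05-21",
                   "2008-04-28", "2009-05-06")),
  Name = c("a", "a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Assumption: the question's empty frame, rebuilt by hand.
encounter <- data.frame(
  Name = c("a", "b", "c", "d", "e"),
  `2007` = 0, `2008` = 0, `2009` = 0, `2010` = 0,
  check.names = FALSE, stringsAsFactors = FALSE
)

# One element per row of `datafile`; same result as
# dplyr::between(datafile$Date, as.Date("2007-01-01"), as.Date("2007-12-31")).
in_2007 <- datafile$Date >= as.Date("2007-01-01") &
  datafile$Date <= as.Date("2007-12-31")

# Names encountered at least once in 2007.
seen_2007 <- unique(datafile$Name[in_2007])

# One element per row of `encounter`, so the lengths now match.
encounter$`2007` <- as.integer(encounter$Name %in% seen_2007)
encounter
```

The key step is %in%: it re-expresses a condition computed on the rows of datafile as a condition on the rows of encounter, using the shared Name column.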

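Also, the %>% operator only passes data forward; it does not let you assign into some other object along the way. You can fake that, but I don't think it is how the operator is meant to be used. And because this condition vector is built from datafile (one "shape") while you want to assign into encounters (a completely different "shape"), you would run into logic problems that are really best avoided.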
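If the goal really is to fill an existing frame in place, the two shapes can be bridged by looking up, for each record in datafile, which row and which column of the frame it belongs to. This is a minimal sketch under the assumption that the frame looks like the one in the question; row_idx, col_idx, year_cols, and flags are illustrative names only.

```r
# Same toy data as in the "Data" section of this answer.
datafile <- data.frame(
  Date = as.Date(c("2007-04-28", "2007-05-19", "2007-05-21",
                   "2008-04-28", "2009-05-06")),
  Name = c("a", "a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Assumption: the question's empty frame, rebuilt by hand.
encounter <- data.frame(
  Name = c("a", "b", "c", "d", "e"),
  `2007` = 0, `2008` = 0, `2009` = 0, `2010` = 0,
  check.names = FALSE, stringsAsFactors = FALSE
)

# Work on the year columns as a plain numeric matrix.
year_cols <- setdiff(names(encounter), "Name")
flags <- as.matrix(encounter[year_cols])

# For each record: its row (by name) and its column (by year).
row_idx <- match(datafile$Name, encounter$Name)
col_idx <- match(format(datafile$Date, "%Y"), year_cols)

# Skip records whose name or year has no slot in `encounter`.
ok <- !is.na(row_idx) & !is.na(col_idx)

# A two-column numeric matrix addresses (row, column) cells directly.
flags[cbind(row_idx[ok], col_idx[ok])] <- 1

# Write the filled matrix back into the frame.
encounter[year_cols] <- as.data.frame(flags)
encounter
```

The (row, column) assignment is done on a matrix copy of the year columns and then written back, rather than directly on the data frame.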


Data:

datafile <- read.table(header=TRUE, stringsAsFactors=FALSE, text='
Date               Name
2007-04-28          a
2007-05-19          a
2007-05-21          b                
2008-04-28          a
2009-05-06          c')
datafile$Date <- as.Date(datafile$Date)