Evaluating function arguments to pass to data.table

Time: 2017-08-30 20:35:30

Tags: r data.table

I have this piece of code that I'd like to wrap in a function:

indata <- data.frame(id = c(1L, 2L, 3L, 4L, 12L, 13L, 14L, 15L),
                     fid = c(NA, 9L, 1L, 1L, 7L, 5L, 5L, 5L),
                     mid = c(0L, NA, 2L, 2L, 6L, 6L, 6L, 8L))
library(data.table)
DT <- as.data.table(indata)

DT[, msib := .(list(id)), by = mid][
   , msibs := mapply(setdiff, msib, id)][
   , fsib  := .(list(id)), by = fid][
   , fsibs := mapply(setdiff, fsib, id)][
   , siblist := mapply(union, msibs, fsibs)][
   , c("msib", "msibs", "fsib", "fsibs") := NULL]

So far so good. It works as desired. Now it should be wrapped in a function to which I can pass alternative variable names (without quoting, if possible). Here's my first try:

f <- function(DT, id, fid, mid) {

    DT[, msib := .(list(id)), by = mid][
       , msibs := mapply(setdiff, msib, id)][
       , fsib  := .(list(id)), by = fid][
       , fsibs := mapply(setdiff, fsib, id)][
       , siblist := mapply(union, msibs, fsibs)][
       , c("msib", "msibs", "fsib", "fsibs") := NULL]
}

I know this isn't working, but let's look at the error it throws:

indata2 <- indata
names(indata2) <- c("A", "B", "C") # Give new names
DT2 <- as.data.table(indata2)
f(DT2, A, B, C)

Error in as.vector(x, "list") : cannot coerce type 'closure' to vector of type 'list'

That makes sense. Now, to make sure that the promises are evaluated correctly, I tried this:

f <- function(DT, id, fid, mid) {
    mid <- deparse(substitute(mid))
    id <- deparse(substitute(id))
    fid <- deparse(substitute(fid))

    DT[, msib := .(list(id)), by = mid][
       , msibs := mapply(setdiff, msib, id)][
       , fsib  := .(list(id)), by = fid][
       , fsibs := mapply(setdiff, fsib, id)][
       , siblist := mapply(union, msibs, fsibs)][
       , c("msib", "msibs", "fsib", "fsibs") := NULL]
}

That doesn't throw an error, but it does not work either. The output looks like this:

f(DT2, A, B, C)
    A  B  C siblist
1:  1 NA  0
2:  2  9 NA
3:  3  1  2
4:  4  1  2
5: 12  7  6
6: 13  5  6
7: 14  5  6
8: 15  5  8

and the siblist column is empty, which it shouldn't be and isn't when I run the code manually. I also tried a version that converts the arguments to character strings, but that doesn't work either - same output as above. I think it may be because the promises are evaluated in the wrong environment, but I am not sure. How can I fix my function?

2 answers:

Answer 0 (score: 1)

If you expect an object to have a certain structure or to hold certain data, defining a class can help. With S3, that is simple.

as.relationship <- function(DT, id, fid, mid) {
  out <- DT[, c(id, fid, mid), with = FALSE]
  setnames(out, c("id", "fid", "mid"))
  setattr(out, "class", c("relationship", class(out)))
  out
}

Then you can write a function that works on that class, knowing where everything is.

f <- function(DT, id, fid, mid) {
  relatives <- as.relationship(DT, id, fid, mid)
  relatives[
    relatives,
    on = "fid",
    allow.cartesian = TRUE
  ][
    relatives,
    on = "mid",
    allow.cartesian = TRUE
  ][
    ,
    {
      siblings    <- union(i.id, i.id.1)
      except_self <- setdiff(siblings, .BY[["id"]])
      list(siblist = list(except_self))
    },
    by = "id"
  ]
}

This function takes the column names as strings, so you would call it like this:

f(DT, "id", "fid", "mid")
#    id  siblist
# 1:  1         
# 2:  2         
# 3:  3        4
# 4:  4        3
# 5: 12    13,14
# 6: 13 14,15,12
# 7: 14 13,15,12
# 8: 15    13,14

setnames(DT, c("A", "B", "C"))
f(DT, "A", "B", "C")
#    id  siblist
# 1:  1         
# 2:  2         
# 3:  3        4
# 4:  4        3
# 5: 12    13,14
# 6: 13 14,15,12
# 7: 14 13,15,12
# 8: 15    13,14

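If you're worried about performance, don't be. If you create a data.table from whole columns of another data.table, they're smart enough not to actually copy the data. They share it. So there is no real performance penalty for making another object.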
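If you would still prefer to call it with unquoted names, as the question asks, one option is a thin wrapper that captures the names with deparse(substitute()) and forwards them as strings. This is only a minimal sketch: `select_renamed()` is a hypothetical string-based helper standing in for the function above, and `select_unquoted()` and the toy table are likewise illustrative names.

```r
library(data.table)

# Hypothetical string-based helper (stands in for the string-based function above)
select_renamed <- function(DT, id, fid, mid) {
  out <- DT[, c(id, fid, mid), with = FALSE]  # select columns by name
  setnames(out, c("id", "fid", "mid"))        # standardise the names
  out
}

# Wrapper: capture the unquoted arguments and forward them as strings
select_unquoted <- function(DT, id, fid, mid) {
  select_renamed(
    DT,
    deparse(substitute(id)),
    deparse(substitute(fid)),
    deparse(substitute(mid))
  )
}

# Toy data with non-standard column names (illustrative only)
toy <- data.table(A = 1:3, B = c(NA, 1L, 1L), C = c(0L, 2L, 2L))
res <- select_unquoted(toy, A, B, C)
names(res)
```

Keeping the string-based function as the real implementation and quoting only at the outer edge means the data.table code itself never has to deal with non-standard evaluation.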

Answer 1 (score: 0)

This gets ugly, but it seems to work. There are a lot of get()s:

f <- function(DT, id, fid, mid) {
  # Turn the unquoted arguments into column-name strings
  mid <- deparse(substitute(mid))
  id <- deparse(substitute(id))
  fid <- deparse(substitute(fid))

  # get() looks each string up as a column inside DT
  DT[, msib := .(list(get(id))), by = get(mid)][
     , msibs := mapply(setdiff, msib, get(id))][
     , fsib  := .(list(get(id))), by = get(fid)][
     , fsibs := mapply(setdiff, fsib, get(id))][
     , siblist := mapply(union, msibs, fsibs)][
     , c("msib", "msibs", "fsib", "fsibs") := NULL]
}

DT2 <- as.data.table(indata2)
f(DT2, A, B, C)

all.equal(DT, DT2)
# [1] "Different column names"
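To see why the get() calls matter, here is a minimal sketch on a toy table (`toy` and `col` are illustrative names, not from the question). Once an argument has been turned into a string, using it directly inside `[` gives you the string itself, whereas get() looks it up as a column. That is consistent with the empty siblist in the question: there, setdiff() was comparing column-name strings rather than ids.

```r
library(data.table)

# Toy table; `toy` and `col` are illustrative names
toy <- data.table(A = 1:3, B = c("x", "y", "y"))
col <- "A"

# Without get(): `col` is not a column, so j sees the string "A"
as_string <- toy[, .(value = col)]

# With get(): the string is resolved to the column A
as_column <- toy[, .(value = get(col))]

print(as_string)
print(as_column)
```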