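Apologies in advance for not being able to provide a fully reproducible example, but please bear with me.
I have a data.table/data.frame (according to 'class') that looks like this (it is called variable.nuts1_MALE.counts):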
   country region   N freq.1 result level delete_to.few.observations
1:      DE    DE2 187     15   8.41     1                          1
2:      DE    DE1 142      9   7.30     1                          1
3:      DE    DEA 231     19   8.75     1                          1
4:      DE    DED 136      5   5.32     1                          1
5:      DE    DE9 114     13  11.40     1                          1
6:      UK    UKJ 147     14   6.35     1                          1
7:      UK    UKD 108     12   7.36     1                          1
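I then want to run the following lines of code, which add an extra column based on how many regions each country has (1 for countries with 5 or more regions, i.e. DE, and 0 for countries with fewer than 5, i.e. UK), and then keep only the rows flagged 1: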
setDT(variable.nuts1_MALE.counts)[, delete_too.few.regions:= if(.N < 5) "0" else "1", by = unlist(country)]
variable.nuts1_MALE.regions <- subset(variable.nuts1_MALE.counts, delete_too.few.regions == 1)
This has worked on all the other data I have run it on, but this time I get the error message:
Error in `[.data.table`(setDT(variable.nuts1_MALE.counts), , `:=`(delete_too.few.regions, :
'by' appears to evaluate to column names but isn't c() or key(). Use by=list(...) if you can. Otherwise, by=eval(unlist(country)) should work. This is for efficiency so data.table can detect which columns are needed.
Can anyone tell me what is going wrong?
When I try the suggestions from the error message (possibly badly), I get further error messages:
setDT(variable.nuts1_MALE.counts)[, delete_too.few.regions:= if(.N < 5) "0" else "1", by=eval(unlist(country))]
Error in unlist(country) : object 'country' not found
setDT(variable.nuts1_MALE.counts)[, delete_too.few.regions:= if(.N < 5) "0" else "1", by = list(country)]
Error in `[.data.table`(setDT(variable.nuts1_MALE.counts), , `:=`(delete_too.few.regions, :
column or expression 1 of 'by' or 'keyby' is type list. Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]
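I can't seem to reproduce the error when I type the table in by hand, but in case anyone has alternative suggestions, here is the data (from dput):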
variable.nuts1_MALE.counts <- structure(list(country = list("DE", "DE", "DE", "DE", "DE", "UK",
"UK"), region = c("DE2", "DE1", "DEA", "DED", "DE9", "UKJ",
"UKD"), N = c(187L, 142L, 231L, 136L, 114L, 147L, 108L), freq.1 = c(15L,
9L, 19L, 5L, 13L, 14L, 12L), result = c(8.41, 7.3, 8.75, 5.32,
11.4, 6.35, 7.36), level = c(1, 1, 1, 1, 1, 1, 1), delete_to.few.observations = c(1L,
1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("country", "region", "N",
"freq.1", "result", "level", "delete_to.few.observations"), class = c("data.table",
"data.frame"), row.names = c(NA, -7L))
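
One thing visible in the dput output above is that country is stored as a list (list("DE", "DE", ...)) rather than a character vector, which would match the "is type list" complaint in the last error. The following is only a minimal sketch under the assumption that this is the cause: it rebuilds a reduced version of the table, flattens the list column, and then groups by the plain column name. The names dt and kept are hypothetical, and the flag is stored as an integer here instead of the strings "0"/"1" used above.

```r
library(data.table)

# Reduced copy of the data; country is deliberately a list column,
# as in the dput output (dt is a hypothetical name).
dt <- data.table(
  country = list("DE", "DE", "DE", "DE", "DE", "UK", "UK"),
  region  = c("DE2", "DE1", "DEA", "DED", "DE9", "UKJ", "UKD")
)

# Inspect the column types; a list column cannot be used as a grouping key.
print(sapply(dt, class))

# Assumption: every element of country is a single string, so unlist()
# returns a character vector with one entry per row.
dt[, country := unlist(country)]

# Flag countries with at least 5 regions (integer flag, my choice).
dt[, delete_too.few.regions := if (.N < 5) 0L else 1L, by = country]

# Keep only the rows from countries that passed the threshold.
kept <- dt[delete_too.few.regions == 1L]
print(kept)
```

If some element of country held more than one value, unlist() would return more entries than there are rows and the assignment would fail, so it may be worth checking lengths(country) first.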