Question

I have a file ("intersect.text.table") in the following format:

  textid  rate1  rate2  rate3
     foo    200    201    200
     bar   none    203    203
 crowbar    201    200   none
kung-foo    200    201    202

I also have a file ("intersect.text.list") that contains a list of some textid values:

textid
foor
bar
foo

So I want to get the rows of the table whose textid appears in the list file. For this example, the result would be:

  textid  rate1  rate2  rate3
     foo    200    201    200
     bar   none    203    203

How can I do this in R?

I tried the suggestion from the comments, but I got the wrong answer. Here is the code:

> x = read.table("intersect.text.table", head=TRUE)
> y = read.table("intersect.text.list", head=TRUE)

> x[y$textid %in% x$textid, ]
   textid rate1 rate2 rate3
2     bar  none   203   203
3 crowbar   201   200  none

Answer 1

Using data.table:

library(data.table)#v1.9.5+
setDT(df1, key='textid')[df2, nomatch=0]
#    textid rate1 rate2 rate3
#1:    bar  none   203   203
#2:    foo   200   201   200

Data

df1 <- structure(list(textid = c("bar", "crowbar", "foo", "kung-foo"
), rate1 = c("none", "201", "200", "200"), rate2 = c(203L, 200L, 
201L, 201L), rate3 = c("203", "none", "200", "202")), .Names = c("textid", 
"rate1", "rate2", "rate3"), class = "data.frame", row.names = c(NA, 
-4L))


df2 <- structure(list(textid = c("foor", "bar", "foo")),
.Names = "textid", class = "data.frame", row.names = c(NA, 
-3L))
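
The data.table call above keeps only the rows of df1 whose textid also occurs in df2, which is an inner join. As a rough base R sketch of the same idea, assuming data.table is not available, merge() can be used. The inputs are rebuilt inline here so the snippet runs on its own, and the name joined is only illustrative:

```r
# Rebuild the two inputs from the "Data" section as plain data frames.
df1 <- data.frame(
  textid = c("bar", "crowbar", "foo", "kung-foo"),
  rate1  = c("none", "201", "200", "200"),
  rate2  = c(203L, 200L, 201L, 201L),
  rate3  = c("203", "none", "200", "202"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(textid = c("foor", "bar", "foo"), stringsAsFactors = FALSE)

# merge() performs an inner join by default: only textid values present
# in both data frames survive, so "foor", "crowbar" and "kung-foo" drop out.
joined <- merge(df1, df2, by = "textid")
print(joined)
```

By default merge() sorts the result by the join column, so the rows come back ordered by textid rather than in the order of the original table.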

Answer 2

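The suggestion @RomanLuštrik made in the comments is slightly off. Since you are subsetting the rows of data frame x, you need to test which values of x$textid are found in y$textid, not the other way around: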

x[x$textid %in% y$textid, ]
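
To see why the direction matters, here is a small sketch that rebuilds the question's two inputs inline (instead of reading them with read.table) and compares both index vectors. The names wrong_idx, right_idx, wrong and right are only illustrative:

```r
# The table and the list from the question, built inline.
x <- data.frame(
  textid = c("foo", "bar", "crowbar", "kung-foo"),
  rate1  = c("200", "none", "201", "200"),
  rate2  = c(201L, 203L, 200L, 201L),
  rate3  = c("200", "203", "none", "202"),
  stringsAsFactors = FALSE
)
y <- data.frame(textid = c("foor", "bar", "foo"), stringsAsFactors = FALSE)

# One logical value per row of y (length 3): it describes y, not x.
wrong_idx <- y$textid %in% x$textid

# One logical value per row of x (length 4): this is a valid row filter for x.
right_idx <- x$textid %in% y$textid

wrong <- x[wrong_idx, ]  # picks rows of x by position, unrelated to their textid
right <- x[right_idx, ]  # keeps exactly the rows whose textid is listed in y

print(wrong$textid)
print(right$textid)
```

The first index has the length of y, so applying it to x selects rows 2 and 3 purely by position, which reproduces the bar and crowbar rows from the question. The second index has one entry per row of x and returns foo and bar.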
