I am using the sqldf package in R. I have two data sets.
Full list:
Student1
Student2
Student3
Student4
Student5
Submitted list:
Student1
Student2
Student5
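I want to add a column to the full list containing 1 or 0, depending on whether the student has submitted the assignment. The final full list should look like this: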
Student1 1
Student2 1
Student3 0
Student4 0
Student5 1
What are the R code and the SQL (SQLite?) code to do this? (I would like both, for clarity.)
Answer 0: (score: 0)
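You can add a column with the value 1 to the submitted list. You can then join the two tables using either dplyr or sqldf. Finally, replace the missing values in that column with 0, so that it indicates whether each student in the final table submitted the assignment.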
library(data.table)
library(dplyr)
library(sqldf)
full_list <- data.frame(x = c("Student1", "Student2", "Student3", "Student4", "Student5"))
submitted_list <- data.frame(x = c("Student1", "Student2", "Student5"))
setDT(submitted_list)  # convert to a data.table by reference
submitted_list[, assin_completed := 1L]  # flag every submitted student with 1
# using dplyr
dt <- left_join(full_list, submitted_list, by = "x")
# or using sqldf
dt <- sqldf("select full_list.x, submitted_list.assin_completed from full_list left outer join submitted_list on full_list.x = submitted_list.x")
setDT(dt)
# students without a match in submitted_list have NA after the join; set those to 0
dt[is.na(assin_completed), assin_completed := 0L]
The final table dt gives the output you want.
x assin_completed
1: Student1 1
2: Student2 1
3: Student3 0
4: Student4 0
5: Student5 1
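As a possible variation on the sqldf route, the NA replacement can be done inside the query itself with SQLite's coalesce, so no data.table step is needed afterwards. This is only a minimal sketch: it rebuilds the same full_list and submitted_list as plain data frames, and the result name res and the table aliases f and s are illustrative choices, not part of the answer above.

```r
library(sqldf)

full_list <- data.frame(x = c("Student1", "Student2", "Student3", "Student4", "Student5"))
# here the flag column is added with plain data.frame syntax instead of data.table
submitted_list <- data.frame(x = c("Student1", "Student2", "Student5"),
                             assin_completed = 1L)

# coalesce() returns its first non-NULL argument, so rows without a match get 0
res <- sqldf("select f.x, coalesce(s.assin_completed, 0) as assin_completed
              from full_list f
              left join submitted_list s on f.x = s.x")
res
```
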
Answer 1: (score: 0)
1) in  Using the input defined in the Note at the end:
library(sqldf)
sqldf("select Student, (Student in SubmittedDF) Submitted from FullDF")
giving:
Student Submitted
1 Student1 1
2 Student2 1
3 Student3 0
4 Student4 0
5 Student5 1
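The query in 1) relies on SQLite accepting a bare table name on the right-hand side of in. A sketch of the same idea with an explicit subquery, which names the lookup column and should be easier to carry over to other databases, might look like this. The data frames are rebuilt inline so the snippet runs on its own, and res_in is a hypothetical result name.

```r
library(sqldf)

FullDF <- data.frame(Student = paste0("Student", 1:5))
SubmittedDF <- data.frame(Student = c("Student1", "Student2", "Student5"))

# the subquery spells out which column of SubmittedDF is searched;
# in SQLite the in operator evaluates to 1 (found) or 0 (not found)
res_in <- sqldf("select Student,
                        (Student in (select Student from SubmittedDF)) as Submitted
                 from FullDF")
res_in
```
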
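2) left join  Another approach is to define Submitted1 as SubmittedDF but with a second column of 1s (named Submitted), and then left join the FullDF data frame to it, replacing the NULL values generated by the join with 0.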
library(sqldf)
sqldf("with Submitted1 as (select *, 1 Submitted from SubmittedDF)
select Student, coalesce(Submitted, 0) Submitted
from FullDF left join Submitted1 using(Student)")
giving:
Student Submitted
1 Student1 1
2 Student2 1
3 Student3 0
4 Student4 0
5 Student5 1
3) %in%  In plain R code (no packages) we can use %in% like this:
transform(FullDF, Submitted = (Student %in% SubmittedDF$Student) + 0)
giving:
Student Submitted
1 Student1 1
2 Student2 1
3 Student3 0
4 Student4 0
5 Student5 1
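To make the steps in 3) explicit: %in% produces a logical vector and the + 0 turns it into numbers. A small sketch of the same computation split into two steps, using as.integer instead of + 0 if an integer column is preferred, could look like the following. The names is_submitted and out are illustrative, and the input is rebuilt inline so the snippet is self-contained.

```r
FullDF <- data.frame(Student = paste0("Student", 1:5))
SubmittedDF <- data.frame(Student = c("Student1", "Student2", "Student5"))

# %in% yields a logical vector with one element per row of FullDF
is_submitted <- FullDF$Student %in% SubmittedDF$Student

# as.integer() maps TRUE/FALSE to 1L/0L, whereas "+ 0" produces a double
out <- FullDF
out$Submitted <- as.integer(is_submitted)
out
```
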
4) merge/replace  Another approach is to perform a left join using merge and then use replace to change the NAs generated by the join to 0.
Submitted1 <- cbind(SubmittedDF, Submitted = 1)
transform(merge(FullDF, Submitted1, all.x = TRUE),
Submitted = replace(Submitted, is.na(Submitted), 0))
Note: The input in reproducible form:
Lines1 <- "Student
Student1
Student2
Student3
Student4
Student5"
Lines2 <- "Student
Student1
Student2
Student5"
FullDF <- read.table(text = Lines1, header = TRUE, strip.white = TRUE)
SubmittedDF <- read.table(text = Lines2, header = TRUE, strip.white = TRUE)