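I would like to know how to combine data.tables in R. I need this for a machine learning project; the example below is a simplified version of it. We predicted occupation codes with machine learning and now want to combine the predicted codes with occupation codes that are similar to them, not in wording but in the occupational activity they describe, in order to improve the accuracy of the codes used later in interviews.
I have the following code as an example: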
library(data.table)

id <- rep(1:4, each = 25)
name <- rep(c("Hans", "Peter", "Klaus", "Florian"), each = 25)
table <- data.table(ID = id, NAME = name)
id1 <- c(1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6)
name1 <- c("Hans", "Hans", "Hans", "Peter", "Klaus", "Klaus", "Florian", "Florian", "Florian", "Helmut", "Helmut", "Helmut", "Helmut", "Karl", "Karl")
refid <- 6 + 1:15
refname <- c("Claudia", "Julia", "Sophie", "Lara", "Lea", "Sarah", "Marie", "Lena", "Leonie", "Anna", "Jana", "Maria", "Susanne", "Merle", "Simone")
reftable <- data.table(ID = id1, NAME = name1, REFID = refid, REFNAME = refname)
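So I want to create a new table that lists the 4 male names from table as unique values, and then add multiple columns, each containing one of the female names from reftable (excluding the names from reftable that don't appear in table).
Thanks for your help!!!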
Answer 0 (score: 0)
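# one row per distinct male name in table
uniquetable <- data.table(NAME = unique(table$NAME))
# join: for every unique name, pull all matching rows from reftable
mergetable <- reftable[uniquetable, on = "NAME"]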
   ID    NAME REFID REFNAME
1:  1    Hans     7 Claudia
2:  1    Hans     8   Julia
3:  1    Hans     9  Sophie
4:  2   Peter    10    Lara
5:  3   Klaus    11     Lea
6:  3   Klaus    12   Sarah
7:  4 Florian    13   Marie
8:  4 Florian    14    Lena
9:  4 Florian    15  Leonie
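To check which rows the join leaves out, an anti-join can help. The following is only a minimal sketch on trimmed toy data (the shortened `reftable` and the name list are assumptions for illustration, not the full data from the question). It also shows `nomatch = NULL`, which should turn the join into an inner join if names without any reference row are unwanted.

```r
library(data.table)

# Toy data loosely mirroring the question (trimmed, for illustration only)
reftable <- data.table(
  NAME    = c("Hans", "Peter", "Helmut", "Karl"),
  REFNAME = c("Claudia", "Lara", "Anna", "Merle")
)
uniquetable <- data.table(NAME = c("Hans", "Peter", "Klaus", "Florian"))

# Anti-join: rows of reftable whose NAME does not occur in uniquetable
dropped <- reftable[!uniquetable, on = "NAME"]

# Inner join: keep only names that have at least one reference row
kept <- reftable[uniquetable, on = "NAME", nomatch = NULL]

print(dropped)
print(kept)
```

With the data from the question every male name has a match, so the default join and the inner join give the same rows there.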
The idea is to merge first, as done above. The next step is a dcast:
dcast(mergetable, NAME ~ REFNAME, value.var = "REFNAME")
      NAME Claudia Julia Lara Lea Lena Leonie Marie Sarah Sophie
1: Florian      NA    NA   NA  NA Lena Leonie Marie    NA     NA
2:    Hans Claudia Julia   NA  NA   NA     NA    NA    NA Sophie
3:   Klaus      NA    NA   NA Lea   NA     NA    NA Sarah     NA
4:   Peter      NA    NA Lara  NA   NA     NA    NA    NA     NA
This casts the data to wide format, with one column for each girl's name.
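If one column per girl's name turns out too sparse, a possible variation is to number the matches within each male name and cast on that counter instead. This is only a sketch under assumptions: the small `mergetable` below is made up to stand in for the merged result, and the `REF` column prefix is a hypothetical naming choice.

```r
library(data.table)

# Stand-in for the merged table (shortened, for illustration only)
mergetable <- data.table(
  NAME    = c("Hans", "Hans", "Hans", "Peter", "Klaus", "Klaus"),
  REFNAME = c("Claudia", "Julia", "Sophie", "Lara", "Lea", "Sarah")
)

# rowid(NAME) counts rows within each NAME (1, 2, 3, ...);
# prefix = "REF" makes the new columns REF1, REF2, ...
compact <- dcast(mergetable, NAME ~ rowid(NAME, prefix = "REF"),
                 value.var = "REFNAME")

print(compact)
```

Each male name then gets as many filled columns as it has matches, and the remaining cells are NA.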