Combining data.tables in R

Date: 2018-01-18 12:46:26

Tags: r data.table

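I would like to know how to combine data.tables in R. I need this for a machine learning project; I have simplified the problem a little here. We predicted occupation codes using machine learning and now want to combine the predicted codes with occupation codes that are similar to them, not in wording but in the similarity of the occupational activities, in order to improve the accuracy of the codes used later on.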

I have the following code as an example:

library(data.table)

# Main table: four IDs and names, each repeated 25 times
id <- rep(1:4, each = 25)
name <- rep(c("Hans", "Peter", "Klaus", "Florian"), each = 25)
table <- data.table(ID = id, NAME = name)

# Reference table: each ID/NAME is paired with one or more REFID/REFNAME entries
id1 <- c(1, 1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6)
name1 <- c("Hans", "Hans", "Hans", "Peter", "Klaus", "Klaus", "Florian", "Florian", "Florian", "Helmut", "Helmut", "Helmut", "Helmut", "Karl", "Karl")
refid <- 6 + seq_len(15)
refname <- c("Claudia", "Julia", "Sophie", "Lara", "Lea", "Sarah", "Marie", "Lena", "Leonie", "Anna", "Jana", "Maria", "Susanne", "Merle", "Simone")
reftable <- data.table(ID = id1, NAME = name1, REFID = refid, REFNAME = refname)

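So, I want to create a new table in which the 4 male names from table are listed as unique values, and then I want to create multiple columns, each containing one of the female names from reftable (excluding the reftable names that belong to people who don't appear in table).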

Thanks for your help!!!

1 Answer:

Answer 0 (score: 0)

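# Unique names from the main table
uniquetable <- data.table(NAME = unique(table$NAME))
# Join: for each unique name, pull the matching rows from reftable
mergetable <- reftable[uniquetable, on = "NAME"]
mergetable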
   ID    NAME REFID REFNAME
1:  1    Hans     7 Claudia
2:  1    Hans     8   Julia
3:  1    Hans     9  Sophie
4:  2   Peter    10    Lara
5:  3   Klaus    11     Lea
6:  3   Klaus    12   Sarah
7:  4 Florian    13   Marie
8:  4 Florian    14    Lena
9:  4 Florian    15  Leonie

The idea is to merge the two tables first, which gives the long table above.
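To make the join step easier to follow, here is a minimal, self-contained sketch on much smaller stand-in data. The tables `people` and `refs` and their contents are hypothetical and not from the question. It assumes a data.table version that accepts `nomatch = NULL` (older versions use `nomatch = 0` instead).

```r
library(data.table)

# Hypothetical stand-in data, much smaller than the tables in the question
people <- data.table(NAME = c("Hans", "Peter", "Otto"))
refs <- data.table(
  NAME    = c("Hans", "Hans", "Peter", "Karl"),
  REFNAME = c("Claudia", "Julia", "Lara", "Merle")
)

# For every row of `people`, look up the matching rows of `refs`.
# Names that exist only in `refs` (here "Karl") are not returned;
# names that exist only in `people` (here "Otto") get NA in REFNAME.
joined <- refs[people, on = "NAME"]

# Same join, but names without any match in `refs` are dropped
matched <- refs[people, on = "NAME", nomatch = NULL]
```

In the question's data every name in table has at least one match in reftable, so both variants should return the same rows there.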

dcast(mergetable, NAME ~ REFNAME, value.var = "REFNAME")
      NAME Claudia Julia Lara Lea Lena Leonie Marie Sarah Sophie
1: Florian      NA    NA   NA  NA Lena Leonie Marie    NA     NA
2:    Hans Claudia Julia   NA  NA   NA     NA    NA    NA Sophie
3:   Klaus      NA    NA   NA Lea   NA     NA    NA Sarah     NA
4:   Peter      NA    NA Lara  NA   NA     NA    NA    NA     NA

The merged result is then cast to wide format with dcast, giving one column per female name.
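If one column per female name turns out to be too sparse, a possible alternative (not part of the original answer) is to number the matches within each name and cast on that number instead. The sketch below assumes that layout is acceptable; `long`, `compact` and the `REF_` prefix are hypothetical names, and the data is copied from the printed merge result above.

```r
library(data.table)

# Long table shaped like `mergetable` (values copied from its printed output)
long <- data.table(
  NAME = c("Hans", "Hans", "Hans", "Peter", "Klaus", "Klaus",
           "Florian", "Florian", "Florian"),
  REFNAME = c("Claudia", "Julia", "Sophie", "Lara", "Lea", "Sarah",
              "Marie", "Lena", "Leonie")
)

# rowid() numbers the rows within each NAME (1, 2, 3, ...);
# casting on that number yields columns REF_1, REF_2, REF_3
compact <- dcast(long, NAME ~ rowid(NAME, prefix = "REF_"), value.var = "REFNAME")
```

This keeps one row per male name, with as many REF_ columns as the largest number of matches; names with fewer matches get NA in the remaining columns.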