r - Preserving the index when merging data frames

Time: 2018-03-06 06:58:46

Tags: r merge

I am working with the College data frame in R. I have the following code:

input.df = College
# Making df.train similar to input.df but with zero rows. 
df.train = input.df[0,]
temp.split = input.df[input.df[,1] == "Yes",]
sample.size = floor(0.75 * nrow(temp.split))
train.ind = sample(seq_len(nrow(temp.split)),size = sample.size)
temp.train = temp.split[train.ind, ]
df.train = merge(x = df.train, y = temp.train, all = TRUE)

After the merge, I lose the index (the row names).

head(input.df)
                             Private Apps Accept Enroll Top10perc Top25perc 
Abilene Christian University     Yes 1660   1232    721        23        52   
Adelphi University               Yes 2186   1924    512        16        29            
Adrian College                   Yes 1428   1097    336        22        50            
Agnes Scott College              Yes  417    349    137        60        89             
Alaska Pacific University        Yes  193    146     55        16        44             
Albertson College                Yes  587    479    158        38        62         

head(temp.train, n = 4)
                            Private  Apps Accept Enroll Top10perc Top25perc 
University of South Florida   No  7589   4676   1876        29        63    
Virginia Tech                 No 15712  11719   4277        29        53     
Valley City State University  No   368    344    212         5        27    
Winthrop University           No  2320   1805    769        24        61    

head(df.train)    
      Private Apps Accept Enroll Top10perc Top25perc F.Undergrad P.Undergrad 
1         No  233    233    153         5        12         658          58     
2         No  285    280    208        21        43        1140         473     
3         No  368    344    212         5        27         863         189     
4         No  434    412    319        10        30        1376         237     
5         No  441    369    172        17        45         633         317     
6         No  480    405    380        19        46        1673        1014     

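The output above is truncated to fit the window.

As you can see, "University of South Florida", "Virginia Tech", and so on are lost after the merge. Is there a way to preserve them?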

1 answer:

Answer 0 (score: 2)

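Unfortunately, you cannot do that with merge (that is, keep the row.names): the result always gets fresh, automatically numbered row names.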

Note that dplyr::left_join(), for example, has the same problem.
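
To make the behavior concrete, here is a minimal sketch on two toy data frames. The names a, b, key, v and w are invented for illustration and are not part of the College data. It only shows that base merge replaces named rows with automatic row numbers.

```r
# Toy data: `a` has meaningful row names, `b` does not (all names are hypothetical).
a <- data.frame(key = c("k2", "k1"), v = c(2, 1),
                row.names = c("second", "first"),
                stringsAsFactors = FALSE)
b <- data.frame(key = c("k1", "k2"), w = c(10, 20),
                stringsAsFactors = FALSE)

# Merge on the shared column; the result is sorted by `key`
# and its row names are reset to automatic numbering.
m <- merge(a, b, by = "key")

rownames(a)  # the named rows of the input
rownames(m)  # the names "second" and "first" are gone
```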

I'm afraid you have to temporarily add the row.names as a regular column for the duration of the merge, e.g.:

df.train = transform(merge(
  x = df.train,
  y = cbind(rownames = rownames(temp.train), temp.train),
  all = TRUE
), row.names = rownames, rownames = NULL)
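
The same idea can also be written as separate steps, which may be easier to follow than the transform() one-liner. This is only a sketch under stated assumptions: the small temp.train below is a made-up stand-in for the College subset, and the helper column name rn and the result name merged are arbitrary choices for illustration.

```r
# Hypothetical stand-in for the College subset (only two columns kept).
temp.train <- data.frame(
  Private = c("No", "No", "No"),
  Apps    = c(7589, 15712, 368),
  row.names = c("University of South Florida", "Virginia Tech",
                "Valley City State University"),
  stringsAsFactors = FALSE
)
df.train <- temp.train[0, ]  # same columns, zero rows, as in the question

# Step 1: copy the row names into an ordinary column (`rn` is an arbitrary name).
y <- data.frame(rn = rownames(temp.train), temp.train,
                stringsAsFactors = FALSE)

# Step 2: merge on the shared columns; `rn` comes along as a regular column.
merged <- merge(x = df.train, y = y, all = TRUE)

# Step 3: restore the row names from the helper column, then drop it.
rownames(merged) <- as.character(merged$rn)
merged$rn <- NULL

merged
```

Note that merge still sorts the rows by the shared columns, so the row order may differ from temp.train even though the row names survive.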