Combining several tables

Time: 2019-06-06 22:45:10

Tags: r

I have three tables read into R as data frames.

Table 1:

Student ID   School_Name             Campus Area
   4356791          BCCS    Northwest Springdale
    03127.           BZS            South Vernon
    12437.          BCCS.           South Vernon

Table 2:

ProctorID.           Date.    Score.   Student ID      Form#
      0211        10/05/16     75.57        55612   25432178
      0211        10/17/16     83.04        55612   47135671
      5134        10/17/16     63.28        02613    2371245

Table 3:

ProctorID         First.       Last.              Campus Area
    O211.         Simone      Lewis.     Northwest Springdale 
    5134.          Mona.   Yashamito     Northwest Springdale 
    0712.        Steven.      Lewis.             South Vernon

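I want to combine the data frames and create a table in which the scores for each campus area sit side by side, grouped by school name. I would like output similar to the following: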

School_Name      Form#   Northwest Springdale   South Vernon
      BCCS.   2543127.                 83.04.          63.25       
      BCCS.     35674.                 75.14.              *
       BZS.   5321567.                  65.2.           62.3

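A given form for a given school may have no score for a given area. Any ideas? I have been playing with the sqldf package. Can this also be done in R without using any SQL?

1 answer:

Answer 0: (score: 0)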

To cast, something like this:

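library(reshape2)
# df stands for the merged data frame. Use the bare column name in the formula, not a quoted string.
# Campus.Area assumes the import used the default check.names = TRUE, which renames "Campus Area".
casted_df <- dcast(df, ... ~ Campus.Area, value.var = "Score.")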

An example that seems to work for me:

df1 <- data.frame("StudentID" = 1:3, "SchoolName" = c("School1", "School2", "School3"), "Area" = c("Area1", "Area2", "Area3"))
df2 <- data.frame("StudentID" = 1:3, "Score" = 100:102, "Proctor" = 4:6)
df3 <- data.frame("Proctor" = 4:6, "Area" = c("Area1", "Area2", "Area3"), "Name" = c("John", "Jane", "Jim"))

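# Join students to scores, then attach the proctor details.
# Both inputs of the second merge have an Area column, so merge() suffixes them as Area.x and Area.y.
combined <- merge(df1, df2, by = "StudentID")
combined2 <- merge(combined, df3, by = "Proctor")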

library(reshape2)
final <- dcast(combined2, ... ~ Area.x, value.var="Score")
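
To get closer to the layout asked for in the question (one row per school and form, one column per campus area), the same idea can be sketched as below. This is only a sketch under assumptions: the data frames `students` and `scores`, the column names `StudentID`, `Campus_Area` and `Form`, and all the values are hypothetical stand-ins for Table 1 and Table 2, and the area is taken from the student table rather than from the proctor table.

```r
library(reshape2)

# Hypothetical stand-ins for Table 1 and Table 2; all values are invented.
students <- data.frame(
  StudentID   = c("S1", "S2", "S3"),
  School_Name = c("BCCS", "BCCS", "BZS"),
  Campus_Area = c("Northwest Springdale", "South Vernon", "South Vernon"),
  stringsAsFactors = FALSE
)
scores <- data.frame(
  StudentID = c("S1", "S2", "S3"),
  Form      = c("F1", "F1", "F2"),
  Score     = c(83.04, 63.28, 62.30),
  stringsAsFactors = FALSE
)

# Attach each score to the student's school and campus area.
merged <- merge(students, scores, by = "StudentID")

# One row per school and form, one column per campus area.
# School/form/area combinations with no score are left as NA.
wide <- dcast(merged, School_Name + Form ~ Campus_Area, value.var = "Score")
print(wide)
```

Naming the row variables explicitly (`School_Name + Form`) instead of using `...` keeps identifier columns such as the student ID out of the result, so scores for the same school and form land on the same row. If a school, form and area combination can hold more than one score, `dcast` needs an aggregation function, for example `fun.aggregate = mean`; without one it falls back to counting rows.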
