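Merging columns into a data frame by two different keys

Time: 2018-10-08 17:23:59

Tags: r dataframe merge

The data frame df.freq below holds words and their properties (e.g. frequency, length, etc.).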

 df.freq
 'data.frame':  221324 obs. of  7 variables:
 $ Word         : Factor w/ 221324 levels "a","aa-class",..: 195399 6167 198867 90289 1 131901 91600 95885 195346 95685 ...
 $ BlogFreqPm   : num  48737 28649 27965 23737 23630 ...
 $ TwitterFreqPm: num  30241 14145 25420 29598 19788 ...
 $ NewsFreqPm   : num  56009 25139 25590 5516 25291 ...
 $ CumFreqPm    : num  134987 67932 78975 58851 68709 ...
 $ LogCumFreq   : num  11.8 11.1 11.3 11 11.1 ...
 $ Length       : int  3 3 2 1 1 2 2 2 4 2 ...

I need to merge the columns LogCumFreq and Length from the data frame above into the data frame df.words below.

 df.words
 Classes ‘grouped_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':    
 $ target                : chr  "HAT" "DEPART" "MUD" "LUST" ...
 $ prime                 : chr  "hat" "department" "muddy" "luster" ...
 ...

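What I need is to apply merge so that the variables LogCumFreq and Length from df.freq are added to each row twice, in separate columns: once for the value of prime and once for the value of target.

I tried using merge first on prime and then on target, but because the two values always sit in the same row, they overwrite each other. Does anyone know how to do this?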

Edit: the dput output of the sample data frames and of an example result follows.

df.words <-

structure(list(prime = structure(c(2L, 1L, 5L, 4L, 3L), .Label = c("department",
"hat", "hunter", "luster", "muddy"), class = "factor"), target = structure(c(2L,
1L, 4L, 3L, 5L), .Label = c("DEPART", "HAT", "LUST", "MUD", "SPY"
), class = "factor")), class = "data.frame", row.names = c(NA,
-5L))

df.freq <-

structure(list(word = structure(c(3L, 2L, 8L, 6L, 4L, 1L, 7L,
5L, 9L), .Label = c("depart", "department", "hat", "hunter",
"lust", "luster", "mud", "muddy", "spy"), class = "factor"),
freq = c(4.3, 5.323, 9.9, 2, 0.56, 4.5, 6.99, 10.88, 7),
length = c(3L, 10L, 5L, 6L, 6L, 6L, 3L, 4L, 3L)), row.names = c(NA,
-9L), class = "data.frame")

df.words.freq <- 

structure(list(prime = structure(c(2L, 1L, 5L, 4L, 3L), .Label = c("department", 
"hat", "hunter", "luster", "muddy"), class = "factor"), target = structure(c(2L, 
1L, 4L, 3L, 5L), .Label = c("DEPART", "HAT", "LUST", "MUD", 
"SPY"), class = "factor"), freq.prime = c(4.3, 5.323, 9.9, 2, 
0.56), freq.target = c(4.3, 4.5, 6.99, 10.88, 7), length.prime = c(3, 
10, 5, 6, 6), length.target = c(3, 6, 3, 4, 3)), row.names = c(NA, 
-5L), class = "data.frame")

The data frame df.words.freq above is an example of the desired output.


2 answers:

Answer 0: (score: 0)

This is just two merges. Most of the work here is getting the column names you want:

result = merge(df.words,
               setNames(df.freq, nm = paste(names(df.freq), "prime", sep = ".")),
               by.x = "prime", by.y = "word.prime")
result$target = tolower(result$target)
result = merge(result,
               setNames(df.freq, nm = paste(names(df.freq), "target", sep = ".")),
               by.x = "target", by.y = "word.target")
#   target      prime freq.prime length.prime freq.target length.target
# 1 depart department      5.323           10        4.50             6
# 2    hat        hat      4.300            3        4.30             3
# 3   lust     luster      2.000            6       10.88             4
# 4    mud      muddy      9.900            5        6.99             3
# 5    spy     hunter      0.560            6        7.00             3

If needed, you can use toupper to convert target back to upper case afterwards.
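As a possible alternative to two merges, the same lookup can be sketched with match(), which keeps df.words in its original row order. This is a different technique from the one in this answer, not a restatement of it. The helper names i.prime and i.target are made up for illustration, and the sample data is rebuilt here with character columns instead of factors so the block runs on its own.

```r
# Sample data from the question, rebuilt with character columns
df.words <- data.frame(
  prime  = c("hat", "department", "muddy", "luster", "hunter"),
  target = c("HAT", "DEPART", "MUD", "LUST", "SPY"),
  stringsAsFactors = FALSE
)
df.freq <- data.frame(
  word   = c("hat", "department", "muddy", "luster", "hunter",
             "depart", "mud", "lust", "spy"),
  freq   = c(4.3, 5.323, 9.9, 2, 0.56, 4.5, 6.99, 10.88, 7),
  length = c(3L, 10L, 5L, 6L, 6L, 6L, 3L, 4L, 3L),
  stringsAsFactors = FALSE
)

# For each key, match() gives the row of df.freq holding that word
# (NA if the word is missing). i.prime / i.target are hypothetical names.
i.prime  <- match(df.words$prime, df.freq$word)
i.target <- match(tolower(df.words$target), df.freq$word)

# Index df.freq once per key and bind the results as separate columns
result <- data.frame(
  df.words,
  freq.prime    = df.freq$freq[i.prime],
  length.prime  = df.freq$length[i.prime],
  freq.target   = df.freq$freq[i.target],
  length.target = df.freq$length[i.target]
)
```

Because nothing is merged, target keeps its original upper-case spelling and no toupper step is needed. Words absent from df.freq end up with NA in the added columns.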

Answer 1: (score: 0)

You will have to do the merge in two steps, and then rename the columns as needed using names() or colnames().
df1 <- merge(df.words, df.freq, by.x = "prime", by.y = "word", all.x = TRUE)
df1$targetword <- tolower(df1$target)   #to match the keywords

df2 <- merge(df1, df.freq, by.x = "targetword", by.y = "word", all.x = TRUE)
df2$targetword <- NULL
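One way to get the renaming done during the merge itself, sketched here as an assumption rather than as part of the answer: the second merge sees freq and length on both sides, so the suffixes argument of merge() can label them instead of the default .x and .y. The sample data from the question is rebuilt with character columns so the block is self-contained.

```r
# Sample data from the question, rebuilt with character columns
df.words <- data.frame(
  prime  = c("hat", "department", "muddy", "luster", "hunter"),
  target = c("HAT", "DEPART", "MUD", "LUST", "SPY"),
  stringsAsFactors = FALSE
)
df.freq <- data.frame(
  word   = c("hat", "department", "muddy", "luster", "hunter",
             "depart", "mud", "lust", "spy"),
  freq   = c(4.3, 5.323, 9.9, 2, 0.56, 4.5, 6.99, 10.88, 7),
  length = c(3L, 10L, 5L, 6L, 6L, 6L, 3L, 4L, 3L),
  stringsAsFactors = FALSE
)

# First merge: attach the frequency and length of the prime
df1 <- merge(df.words, df.freq, by.x = "prime", by.y = "word", all.x = TRUE)
df1$targetword <- tolower(df1$target)   # lower-case key to match df.freq$word

# Second merge: freq and length now clash, so merge() appends the suffixes
df2 <- merge(df1, df.freq, by.x = "targetword", by.y = "word",
             all.x = TRUE, suffixes = c(".prime", ".target"))
df2$targetword <- NULL
```

The resulting columns are prime, target, freq.prime, length.prime, freq.target and length.target, with rows ordered by the lower-cased target because merge() sorts on the key by default.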