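I'm not quite sure how to phrase my question. I think what I want is a loop that takes each value in a row of one data frame, matches it against a key in another data frame, sums the values stored under those keys across the row's columns, and stores the result in a new data frame with one row per word.
It should be much easier to explain with an example. I'm a complete beginner with R and with programming, and I'm still learning the vocabulary.
I have a data frame of words in which each column corresponds to a phoneme (a distinct speech sound).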
Words_DF <- data.frame( word = c("CAT", "BAT", "APPLE"), Phoneme1 = c("K", "B", "AE"), Phoneme2 = c("AE", "AE", "P"), Phoneme3 = c("T", "T", "AH"), Phoneme4 = c("Null", "Null", "L"))
word Phoneme1 Phoneme2 Phoneme3 Phoneme4
1 CAT K AE T Null
2 BAT B AE T Null
3 APPLE AE P AH L
I have another data frame in which each phoneme corresponds to a series of binary values.
Phoneme_DF <- data.frame( phoneme = c("AE", "AH", "B", "K", "T", "P", "L"), is_consonant = c(0, 0, 1, 1, 1, 1, 1), is_labial = c(0, 0, 1, 0, 0, 1, 0))
phoneme is_consonant is_labial
1 AE 0 0
2 AH 0 0
3 B 1 1
4 K 1 0
5 T 1 0
6 P 1 1
7 L 1 0
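I'm trying to find a way to go through each row of Words_DF, look up the value of each of its phoneme columns in Phoneme_DF, and add those values up in a new data frame, like this: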
New_DF <- data.frame( word = c("CAT", "BAT", "APPLE"), consonants_in_word = c(2, 2, 3), labials_in_word = c(0, 1, 1))
word consonants_in_word labials_in_word
1 CAT 2 0
2 BAT 2 1
3 APPLE 2 1
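I tried writing some loops that go through each row of Words_DF, go through each column within the row, look that value up in Phoneme_DF, and then sum the results: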
New_DF <- data.frame( word = c("CAT", "BAT", "APPLE"), consonants_in_word = c(0, 0 , 0 ), labials_in_word = c(0, 0, 0))
for(i in 1:length(SAMPLE_Words)){
for(j in 1:length(where(SAMPE_Words[[j]]) %in% SAMPLE_Phoneme_DF[i])) {
rbind(New_DF, sum(Phoneme_DF[i, ]))
}
}
I hope my question makes sense. Thanks for your help! :)
Answer 0 (score: 3)
I think your expected output is off: APPLE should have only 2 consonants. Try this:
library(tidyverse)
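# Reshape to long format: one row per (word, phoneme slot)
Words_DF %>%
  gather(key = "position", value = "phoneme", -word) %>%
  # Look up each phoneme's features; "Null" has no match and yields NA
  left_join(Phoneme_DF, by = "phoneme") %>%
  group_by(word) %>%
  mutate(consonants_in_word = sum(is_consonant, na.rm = TRUE),
         labials_in_word = sum(is_labial, na.rm = TRUE)) %>%
  # Keep one row per word
  distinct(word, .keep_all = TRUE) %>%
  select(word, consonants_in_word, labials_in_word)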
This returns:
# A tibble: 3 x 3
# Groups: word [3]
word consonants_in_word labials_in_word
<chr> <int> <int>
1 CAT 2 0
2 BAT 2 1
3 APPLE 2 1
Here is the data I used:
Words_DF <- read.table(text = "word Phoneme1 Phoneme2 Phoneme3 Phoneme4
1 CAT K AE T Null
2 BAT B AE T Null
3 APPLE AE P AH L",
stringsAsFactors = FALSE, header = TRUE)
Phoneme_DF <- read.table(text = "phoneme is_consonant is_labial
1 AE 0 0
2 AH 0 0
3 B 1 1
4 K 1 0
5 T 1 0
6 P 1 1
7 L 1 0",
stringsAsFactors = FALSE, header = TRUE)
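The same reshape, join, and sum steps can also be written with `pivot_longer()` and `summarise()`. This is only a sketch, assuming tidyr 1.0.0 or later (where `pivot_longer()` is available) and using the data from the question, with B marked as labial as in the printed table. The names `position` and `counts` are illustrative and not from the original post.

```r
library(dplyr)
library(tidyr)

Words_DF <- data.frame(
  word = c("CAT", "BAT", "APPLE"),
  Phoneme1 = c("K", "B", "AE"),
  Phoneme2 = c("AE", "AE", "P"),
  Phoneme3 = c("T", "T", "AH"),
  Phoneme4 = c("Null", "Null", "L"),
  stringsAsFactors = FALSE
)
Phoneme_DF <- data.frame(
  phoneme = c("AE", "AH", "B", "K", "T", "P", "L"),
  is_consonant = c(0, 0, 1, 1, 1, 1, 1),
  is_labial = c(0, 0, 1, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

counts <- Words_DF %>%
  # One row per (word, phoneme slot)
  pivot_longer(-word, names_to = "position", values_to = "phoneme") %>%
  # "Null" is not in Phoneme_DF, so its feature columns become NA
  left_join(Phoneme_DF, by = "phoneme") %>%
  group_by(word) %>%
  # Collapse each word to a single row of sums, ignoring the NA rows
  summarise(
    consonants_in_word = sum(is_consonant, na.rm = TRUE),
    labials_in_word = sum(is_labial, na.rm = TRUE)
  )
```

Because `summarise()` returns one row per group, the `distinct()` and `select()` steps are not needed here. Note that the rows come back sorted by `word` rather than in the original order.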
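Answer 1 (score: 1)
Here is a data.table equivalent, for anyone interested:
library(data.table)
setDT(Words_DF); setDT(Phoneme_DF)  # convert to data.tables by reference
# Melt to long format, join on phoneme, then sum each feature column by word
Phoneme_DF[melt(Words_DF, id.vars = "word", value.name = "phoneme"), on = "phoneme"][
  , lapply(.SD, sum, na.rm = TRUE),
  .SDcols = c("is_consonant", "is_labial"), by = word]
which gives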
word is_consonant is_labial
1: CAT 2 0
2: BAT 2 1
3: APPLE 2 1
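The procedure is similar to the one tyluRp proposed: you reshape the Words_DF table into long format, join it with the other table, and then sum the consonant and labial values by word.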
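For readers who would rather stay closer to the row-by-row idea in the question without extra packages, the lookup can also be sketched in base R with `match()` and `rowSums()`. This is a minimal sketch, not code from either answer. It assumes the question's data with B marked as labial, and the helper `count_feature` is a hypothetical name introduced here.

```r
Words_DF <- data.frame(
  word = c("CAT", "BAT", "APPLE"),
  Phoneme1 = c("K", "B", "AE"),
  Phoneme2 = c("AE", "AE", "P"),
  Phoneme3 = c("T", "T", "AH"),
  Phoneme4 = c("Null", "Null", "L"),
  stringsAsFactors = FALSE
)
Phoneme_DF <- data.frame(
  phoneme = c("AE", "AH", "B", "K", "T", "P", "L"),
  is_consonant = c(0, 0, 1, 1, 1, 1, 1),
  is_labial = c(0, 0, 1, 0, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Character matrix of phonemes: one row per word, one column per slot
phoneme_matrix <- as.matrix(Words_DF[, -1])

# Row index in Phoneme_DF for every cell (column-major); NA for "Null"
idx <- match(phoneme_matrix, Phoneme_DF$phoneme)

# Hypothetical helper: sum one feature column across each word's phonemes
count_feature <- function(feature) {
  values <- matrix(Phoneme_DF[[feature]][idx], nrow = nrow(phoneme_matrix))
  rowSums(values, na.rm = TRUE)
}

New_DF <- data.frame(
  word = Words_DF$word,
  consonants_in_word = count_feature("is_consonant"),
  labials_in_word = count_feature("is_labial")
)
```

The design choice here is to replace the nested loops with one vectorized lookup: `match()` finds every phoneme's row at once, and unmatched entries such as "Null" become NA and are dropped by `na.rm = TRUE`.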