I have a data frame:
df<-data.frame(palabra=c('ani', 'anib', 'alop', 'alope','ber', 'beren'))
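I need to create a distance matrix for each group of words, grouping them by their first character.
To do that, I add a column holding the first letter: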
df$letra<-substring(df$palabra,1,1)
Now I need to apply the adist function to each group. An example of adist on the whole column:
adist(df$palabra, costs=list(insertions=1, deletions=1, substitutions=2))
How can I create a distance table for each group?
Answer 0 (score: 3)
A simple combination of lapply and split will give you what you want:
# split creates a list of two data frames: one for group a and one
# for group b
# lapply then applies adist to each of those groups
lapply(split(df, df$letra), function(x) {
  adist(x$palabra, costs=list(insertions=1, deletions=1, substitutions=2))
})
Output:
$a
     [,1] [,2] [,3] [,4]
[1,]    0    1    5    6
[2,]    1    0    6    7
[3,]    5    6    0    1
[4,]    6    7    1    0

$b
     [,1] [,2]
[1,]    0    2
[2,]    2    0
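
The matrices above are indexed by position, so it can be hard to tell which words each cell compares. As a minimal sketch, the same split/lapply approach can label the rows and columns with the words themselves. The name dist_by_letter and the stringsAsFactors = FALSE argument are my own additions for illustration, not part of the original answer:

```r
# Rebuild the example data; stringsAsFactors = FALSE keeps palabra as
# character on R versions before 4.0 (assumption: harmless on newer ones).
df <- data.frame(
  palabra = c('ani', 'anib', 'alop', 'alope', 'ber', 'beren'),
  stringsAsFactors = FALSE
)
df$letra <- substring(df$palabra, 1, 1)

# Split the words (not the whole data frame) by first letter, then
# compute one distance matrix per group and label it with the words.
dist_by_letter <- lapply(split(df$palabra, df$letra), function(words) {
  d <- adist(words,
             costs = list(insertions = 1, deletions = 1, substitutions = 2))
  dimnames(d) <- list(words, words)
  d
})

dist_by_letter
```

With this, dist_by_letter$a["ani", "alope"] returns 6 and dist_by_letter$b["ber", "beren"] returns 2, matching the positional output above.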