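I am looking for the simplest way to take a matrix and convert it to a data frame in which each row represents one of the matrix's unique combinations.
This comes in handy when I create something like a distance matrix, but the end user wants a table-like layout (for example in Excel) so they can filter and review the individual scenarios and how they differ.
1) What the initial matrix looks like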
        Honda Dodge Ferrari
Honda       0     4      10
Dodge       4     0      10
Ferrari    10    10       0
2) The output I would like to produce (acceptable)
  vehicle1 vehicle2 distance
1    Honda    Honda        0
2    Honda    Dodge        4
3    Honda  Ferrari       10
4    Dodge    Honda        4
5    Dodge    Dodge        0
6    Dodge  Ferrari       10
7  Ferrari    Honda       10
8  Ferrari    Dodge       10
9  Ferrari  Ferrari        0
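3) The output I would like to produce (best case)
This version ignores order, so each unordered pair appears only once, and it excludes rows where vehicle1 and vehicle2 are the same (e.g. Honda, Honda, 0).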
  vehicle1 vehicle2 distance
1    Honda    Dodge        4
2    Honda  Ferrari       10
3    Dodge  Ferrari       10
Code to reproduce:
#This is just to set-up outputs for display
matrix_input = matrix(c(0,4,10,4,0,10,10,10,0), nrow=3)
colnames(matrix_input) = c('Honda','Dodge','Ferrari')
rownames(matrix_input) = c('Honda','Dodge','Ferrari')
dataframe_output = data.frame(vehicle1=c("Honda","Honda", "Honda",
                                         "Dodge","Dodge", "Dodge",
                                         "Ferrari","Ferrari", "Ferrari"),
                              vehicle2=c("Honda","Dodge", "Ferrari",
                                         "Honda","Dodge", "Ferrari",
                                         "Honda","Dodge", "Ferrari"),
                              distance=c(0,4,10,
                                         4,0,10,
                                         10,10,0))
dataframe_output.best_case = data.frame(vehicle1=c("Honda","Honda","Dodge"),
                                        vehicle2=c("Dodge","Ferrari","Ferrari"),
                                        distance=c(4,10,10))
#(1) initial matrix format
print(matrix_input)
#(2) desired output1 (acceptable)
print(dataframe_output)
#(3) desired output2 (best case)
#Ideally, I would like the operation to only pull unique
# combinations (where order does not matter) AND exclude same values (e.g. Honda,Honda)
print(dataframe_output.best_case)
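Answer 0 (score: 0)
This may not be the best solution (I'm sure it goes the long way around the barn). I was hoping there would be a neat one- or two-liner, or an existing package I could leverage, but I ended up getting it done with the code below. If anyone comes along with a simpler way, I'm all ears.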
#Summary:
#This function takes a square matrix as input
# and returns a dataframe with all 'true' combinations.
#Notes:
#(1) Loops through each row, excluding the last row.
# Once we get to last row, all combinations will have been covered.
#(2) For each row, we start at the column +1 to right of the matrix diagonal.
# Everything to left of diagonal (per row) will have already been covered.
# Everything ON the diagonal will be comparing to itself, which we don't need.
m_comb_to_df = function(m, cat1, cat2, val_type) #matrix combinations to dataframe
{
  #Calc total combinations (this will be total values right of diagonal in matrix)
  #For example, if 4x4 matrix, then total combinations will be 3+2+1
  #Formula for this is ((n-1)^2+(n-1))/2, which equals n*(n-1)/2
  comb = (((nrow(m)-1)^2)+(nrow(m)-1))/2
  #create new dataframe for storing matrix combinations
  df = data.frame(rep(NA, comb), rep(NA, comb), rep(NA, comb))
  colnames(df)=c(cat1, cat2, val_type)
  dfr = 1 #dataframe row counter (start at first row)
  #loop through each row (except last); seq_len() yields an empty loop for a
  #1x1 matrix, whereas 1:(nrow(m)-1) would count down through 1 and 0
  for(r in seq_len(nrow(m)-1))
  {
    for(c in (r+1):ncol(m)) #loop through columns, starting at right of diagonal (r+1)
    {
      #print(paste(r,c,r+1)) #debug
      #store a single combination in current row (dfr) of dataframe
      df[[cat1]][dfr] = rownames(m)[r] #store 'current' matrix row name
      df[[cat2]][dfr] = colnames(m)[c] #store 'current' matrix column name
      df[[val_type]][dfr] = m[r,c] #store 'current' matrix value
      dfr = dfr + 1
    }
  }
  return(df)
}
#matrix for testing
matrix_input = matrix(c(0,4,10,4,0,10,10,10,0), nrow=3)
colnames(matrix_input) = c('Honda','Dodge','Ferrari')
rownames(matrix_input) = c('Honda','Dodge','Ferrari')
#test function
m_comb_to_df(matrix_input, "car1", "car2", "distance")
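Since this answer asks for a simpler route, here is a minimal base R sketch of the same upper-triangle idea without explicit loops. The helper name m_upper_to_df and its default column names are hypothetical (they do not come from the question or the answers), and the sketch assumes a square matrix that has both row and column names.

```r
# Test matrix from the question
matrix_input <- matrix(c(0, 4, 10, 4, 0, 10, 10, 10, 0), nrow = 3,
                       dimnames = list(c("Honda", "Dodge", "Ferrari"),
                                       c("Honda", "Dodge", "Ferrari")))

# Hypothetical helper: vectorised extraction of the cells above the diagonal.
m_upper_to_df <- function(m, cat1 = "vehicle1", cat2 = "vehicle2",
                          val_type = "distance") {
  # Assumption: square matrix with row and column names
  stopifnot(is.matrix(m), nrow(m) == ncol(m),
            !is.null(rownames(m)), !is.null(colnames(m)))
  # (row, column) positions strictly above the diagonal
  idx <- which(upper.tri(m), arr.ind = TRUE)
  # which() walks column by column; sort by row, then column,
  # so the order matches the nested loops above
  idx <- idx[order(idx[, 1], idx[, 2]), , drop = FALSE]
  out <- data.frame(rownames(m)[idx[, 1]],
                    colnames(m)[idx[, 2]],
                    m[idx],
                    stringsAsFactors = FALSE)
  names(out) <- c(cat1, cat2, val_type)
  out
}

best_case <- m_upper_to_df(matrix_input)
print(best_case)
```

Indexing the matrix with the two-column position matrix (m[idx]) picks out one value per pair, so no row counter is needed, and the number of rows is n*(n-1)/2 by construction.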
Answer 1 (score: 0)
Here is a shorter version that solves this using melt from the reshape2 package:
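# load magrittr to use the pipe operator.
library(magrittr)
# work on a copy so the original matrix_input is left untouched
m <- matrix_input
# blank out the lower triangle and the diagonal, so that every pair appears once,
# oriented as in the question (e.g. Honda, Dodge), and self-pairs are dropped
# without also discarding a genuine zero distance between two different vehicles
m[lower.tri(m, diag = TRUE)] <- NA
# melt the matrix into a long data.frame, dropping the NA cells
df <- reshape2::melt(m, na.rm = TRUE)
# rename variables
df %>%
  dplyr::rename(vehicle1 = Var1,
                vehicle2 = Var2,
                distance = value)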
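For the "acceptable" output from the question (every ordered pair, including the diagonal), no masking is needed at all. The following is a rough base R sketch; the helper name m_all_to_df is hypothetical, and it assumes the matrix has both row and column names.

```r
# Test matrix from the question
matrix_input <- matrix(c(0, 4, 10, 4, 0, 10, 10, 10, 0), nrow = 3,
                       dimnames = list(c("Honda", "Dodge", "Ferrari"),
                                       c("Honda", "Dodge", "Ferrari")))

# Hypothetical helper: one row per (row, column) cell, listed row by row.
m_all_to_df <- function(m) {
  # Assumption: the matrix carries row and column names
  stopifnot(is.matrix(m), !is.null(rownames(m)), !is.null(colnames(m)))
  data.frame(
    vehicle1 = rep(rownames(m), each = ncol(m)),   # row name repeated per column
    vehicle2 = rep(colnames(m), times = nrow(m)),  # column names cycled per row
    distance = as.vector(t(m)),                    # t() turns column-major into row-major
    stringsAsFactors = FALSE
  )
}

all_pairs <- m_all_to_df(matrix_input)
print(all_pairs)
```

The transpose matters because R stores matrices column by column; flattening t(m) walks the original matrix row by row, which is the order used in the question's example table.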