Scala Spark IndexedRowMatrix returns the wrong number of rows

Time: 2017-05-23 13:35:21

Tags: scala apache-spark

I am using Spark 2.1.0 and Scala 2.11.2. I created an IndexedRowMatrix with 4 rows and 6 columns from an RDD of IndexedRow. When I print the rows of the matrix, I get this output:

IndexedRow(4,[1.0,0.0,0.0,1.0,0.0,0.0])
IndexedRow(2,[1.0,0.0,0.0,0.0,1.0,0.0])
IndexedRow(3,[0.0,0.0,1.0,0.0,0.0,0.0])
IndexedRow(1,[0.0,1.0,0.0,1.0,0.0,1.0])

When I print the number of rows, the result is 5. Why does this happen?

1 answer:

Answer 0 (score: 2)

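This happens because matrix row indices are 0-based. Spark assumes the input is correct and, when the row count is not given explicitly, computes it as max(index) + 1. Your rows are indexed 1 through 4, so the largest index is 4 and the reported row count is 4 + 1 = 5; Spark treats index 0 as a row that is simply absent from the RDD.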
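
As a minimal sketch of this behavior, the snippet below rebuilds the four rows from the question and compares the row count before and after shifting the indices down to start at 0. The object name, the local master setting, and the variable names are assumptions made for illustration; the re-indexing by subtracting 1 is one possible fix that only works because the original indices are exactly 1 through 4.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.{IndexedRow, IndexedRowMatrix}
import org.apache.spark.sql.SparkSession

// Hypothetical object name, used only for this illustration.
object IndexedRowMatrixRowCount {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough to reproduce the issue.
    val spark = SparkSession.builder()
      .appName("indexed-row-matrix-row-count")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // The rows from the question, indexed 1..4 (1-based).
    val oneBased = sc.parallelize(Seq(
      IndexedRow(4L, Vectors.dense(1.0, 0.0, 0.0, 1.0, 0.0, 0.0)),
      IndexedRow(2L, Vectors.dense(1.0, 0.0, 0.0, 0.0, 1.0, 0.0)),
      IndexedRow(3L, Vectors.dense(0.0, 0.0, 1.0, 0.0, 0.0, 0.0)),
      IndexedRow(1L, Vectors.dense(0.0, 1.0, 0.0, 1.0, 0.0, 1.0))
    ))

    // No explicit row count is passed, so numRows() is derived as max(index) + 1.
    println(new IndexedRowMatrix(oneBased).numRows()) // 5, as in the question

    // Shift every index down by one so the rows are indexed 0..3.
    val zeroBased = oneBased.map(row => IndexedRow(row.index - 1, row.vector))
    println(new IndexedRowMatrix(zeroBased).numRows()) // 4

    spark.stop()
  }
}
```

If the indices cannot be changed, counting the underlying RDD with `matrix.rows.count()` gives the number of rows actually present (4 here) rather than the dimension Spark infers.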