I ran into some strange behavior while trying to rename the columns of a DataFrame.
I have two vals, fail and ok, each holding exactly the same list.
scala> fail
res136: Array[String] = Array(en, vi, ro, ur, lv, pl, pt, tl, in, ko, uk, cs, sr, tr, de, is, es, eu, el, it, ar, nl, bn, hu, iw, th, lt, no, fa, bg, cy, hi, et, zh, fr)
scala> ok
res137: Array[String] = Array(en, vi, ro, ur, lv, pl, pt, tl, in, ko, uk, cs, sr, tr, de, is, es, eu, el, it, ar, nl, bn, hu, iw, th, lt, no, fa, bg, cy, hi, et, zh, fr)
scala> fail.deep == ok.deep && fail.getClass == ok.getClass
res145: Boolean = true
The thing is, when I try the following, I start to question my sanity:
scala> cross.toDF("y"+:fail:_*).count
res142: Long = 1
scala> cross.toDF("y"+:ok:_*).count
res143: Long = 41
There is also this:
scala> val exactlyTheSameDF = cross
exactlyTheSameDF: org.apache.spark.sql.DataFrame = [y_prediction: string, 0.0: bigint ... 34 more fields]
scala> exactlyTheSameDF.toDF("y"+:ok:_*).count
res150: Long = 41
scala> exactlyTheSameDF.toDF("y"+:fail:_*).count
res151: Long = 41
There must be something I'm missing. Any ideas?
I'd really appreciate any help.