Removing the List formatting in Spark

Asked: 2015-08-22 12:38:21

Tags: scala apache-spark

I have some Spark code like this:

val combPrdGrp = custPrdGrp3.join(cmpgnPrdGrp3)

val combPrdGrp2 = combPrdGrp.groupByKey


val combPrdGrp3 = combPrdGrp2.map { case (k3, vals3) =>
  val valsString3 = vals3.map { case (id3, m3) => s"$id3 $m3" }
  s"$k3 $valsString3"
}

When I run combPrdGrp3.first, I get the following result:

res1: String = 110| List( {'CNSMR_DIRCT_SAVG': {PRVCY_CALL: 1, PRVCY_SWP: 1, PRVCY_MAIL: 1, PRVCY_AFIL: 1, PRVCY_FCRA: 1, PRVCY_PIPE: 1, PRVCY_GLBA: 4}}|  {'CARDXSL1503L': {contacted: '3/25/2015', channel: 'CARD-XSL', hit_home_date: 'ASPEN - Reminder', campaign: 'XSELL TO 360', creative: 'EM', refcode: 'Y'}})

I want to remove the List( prefix and its closing parenthesis, but I can't figure out how. I tried using .pipe, but it doesn't seem to work:

val combPrdGrp4 = combPrdGrp3.pipe("sed s/List((//g").pipe("sed s/)//g")

For some reason this crashes sc. When I try to work with the result, I get an sc shutdown error.

Running combPrdGrp2.first produces the following:

res2: (String, Iterable[(String, String)]) = (110|,CompactBuffer(( {'CNSMR_DIRCT_SAVG': {PRVCY_CALL: 1, PRVCY_SWP: 1, PRVCY_MAIL: 1, PRVCY_AFIL: 1, PRVCY_FCRA: 1, PRVCY_PIPE: 1, PRVCY_GLBA: 4}}|, {'CARDXSL1503L': {contacted: '3/25/2015', channel: 'CARD-XSL', hit_home_date: 'ASPEN - Reminder', campaign: 'XSELL TO 360', creative: 'EM', refcode: 'Y'}})))

1 Answer:

Answer 0: (score: 2)

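You can format the CompactBuffer / List yourself with mkString. Interpolating a collection into a string calls its toString, which is where the List(...) wrapper comes from: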

List(1, 2, 3).toString
// String = List(1, 2, 3)

List(1, 2, 3).mkString
// String = 123

List(1, 2, 3).mkString(", ")
// String = 1, 2, 3
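
If a custom prefix and suffix are wanted instead of List( and ), the three-argument overload of mkString takes a start string, a separator, and an end string. A minimal sketch for the Scala REPL (the name bracketed is only for illustration):

```scala
// mkString(start, sep, end) wraps the joined elements in custom delimiters.
val bracketed = List(1, 2, 3).mkString("[", ", ", "]")
// String = [1, 2, 3]
```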

Suppose your combPrdGrp2 looks something like this:

val elem = (
  "110|", 
  Iterable((
    "{'CNSMR_DIRCT_SAVG': {PRVCY_CALL: 1, PRVCY_SWP: 1, PRVCY_MAIL: 1, PRVCY_AFIL: 1, PRVCY_FCRA: 1, PRVCY_PIPE: 1, PRVCY_GLBA: 4}}|",
    "{'CARDXSL1503L': {contacted: '3/25/2015', channel: 'CARD-XSL', hit_home_date: 'ASPEN - Reminder', campaign: 'XSELL TO 360', creative: 'EM', refcode: 'Y'}}"
  ))
)
val combPrdGrp2 = List(elem)

combPrdGrp2.map { case (n, list) => 
  val formattedPairs = list.map { case (a, b) => s"$a $b" }
  s"$n ${formattedPairs.mkString}"
}
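
When a key has more than one pair, mkString with no argument concatenates them with nothing in between, so passing an explicit separator is probably what you want. Below is a small sketch using plain Scala collections in place of the RDD. The names grouped and formatted and the shortened values are placeholders for illustration, not the real data. The same map body should carry over to combPrdGrp2.map on the RDD.

```scala
// Stand-in for combPrdGrp2: each element is (key, Iterable of (id, message) pairs).
// The values are shortened placeholders.
val grouped: List[(String, Iterable[(String, String)])] = List(
  ("110|", Iterable(("a|", "b"), ("c|", "d")))
)

val formatted = grouped.map { case (key, pairs) =>
  // Format each pair, then join with an explicit separator rather than
  // relying on the collection's toString, which adds the List( ... ) wrapper.
  val joined = pairs.map { case (id, msg) => s"$id $msg" }.mkString(" ")
  s"$key $joined"
}

println(formatted.head)
// 110| a| b c| d
```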