Suppose I have an org.apache.spark.sql.DataFrame with the following schema:
root
 |-- origin: string (nullable = true)
 |-- destination: string (nullable = true)
A typical instance of this DataFrame looks like this:
+------------------------------+--------------------------------------+
|origin                        |destination                           |
+------------------------------+--------------------------------------+
|JEBEL ALI                     |KUWAIT                                |
|CHITTAGONG                    |KEARNY POINT                          |
|FELIXSTOWE                    |KEARNY POINT                          |
|LOS ANGELES                   |EUROPOORT - E.C.T. DELTA TERMINAL     |
|LOS ANGELES                   |KAOHSIUNG                             |
|GREATER NEW YORK TERMINAL     |ANTWERP                               |
|SHANGHAI                      |LOS ANGELES                           |
|SAN PEDRO                     |BRANI TERMINAL - PULAU BRANI          |
|KAMPONG SAOM                  |HOWLAND HOOK CONTAINER TERMINAL       |
|SHANGHAI                      |LONG BEACH                            |
|BARCELONA                     |MONTREAL                              |
|HAIFA                         |GREATER NEW YORK TERMINAL             |
|BRANI TERMINAL - PULAU BRANI  |BUSAN                                 |
|MUMBAI                        |KEARNY POINT                          |
|LAEM CHABANG                  |CAT LAI OIL TERMINAL - HO CHI MIN CITY|
|BARCELONA                     |JAWAHARLAL NEHRU PORT                 |
|HUANG DAO - OIL TERMINAL NO. 2|VANCOUVER, B.C.                       |
|HAIFA                         |HALIFAX                               |
|BRANI TERMINAL - PULAU BRANI  |LOS ANGELES                           |
|MANILA                        |VANCOUVER, B.C.                       |
+------------------------------+--------------------------------------+
I want to convert it into a DataFrame like this:
+------------------------------+---------------------------------------------------+
|origin                        |destinations                                       |
+------------------------------+---------------------------------------------------+
|JEBEL ALI                     |[KUWAIT]                                           |
|CHITTAGONG                    |[KEARNY POINT]                                     |
|FELIXSTOWE                    |[KEARNY POINT]                                     |
|LOS ANGELES                   |[EUROPOORT - E.C.T. DELTA TERMINAL, KAOHSIUNG]     |
|GREATER NEW YORK TERMINAL     |[ANTWERP]                                          |
|SHANGHAI                      |[LOS ANGELES, LONG BEACH]                          |
|SAN PEDRO                     |[BRANI TERMINAL - PULAU BRANI]                     |
|KAMPONG SAOM                  |[HOWLAND HOOK CONTAINER TERMINAL]                  |
|BARCELONA                     |[MONTREAL, JAWAHARLAL NEHRU PORT]                  |
|HAIFA                         |[GREATER NEW YORK TERMINAL, HALIFAX]               |
|BRANI TERMINAL - PULAU BRANI  |[BUSAN, LOS ANGELES]                               |
|MUMBAI                        |[KEARNY POINT]                                     |
|LAEM CHABANG                  |[CAT LAI OIL TERMINAL - HO CHI MIN CITY]           |
|HUANG DAO - OIL TERMINAL NO. 2|[VANCOUVER, B.C.]                                  |
|MANILA                        |[VANCOUVER, B.C.]                                  |
+------------------------------+---------------------------------------------------+
Note that each value of origin is unique and is shown with all destinations associated with that origin. The destinations column is of type Seq[String].
How can I do this?
Answer 0 (score: 1)
import org.apache.spark.sql.functions.collect_set
val originToDestinations = originDestinationDf.groupBy("origin").agg(collect_set("destination").as("destinations"))
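To try the grouping end to end, here is a minimal sketch that rebuilds a few rows of the sample data and applies the same groupBy/collect_set aggregation. The object name OriginDestinations, the helper groupDestinations, and the local[*] session are assumptions made for illustration and are not part of the original answer. collect_set drops duplicate destinations and does not guarantee element order, so the sketch wraps it in sort_array to get stable output; collect_list could be used instead if duplicates should be kept.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{collect_set, sort_array}

// Hypothetical wrapper object, used only for this sketch.
object OriginDestinations {

  // Groups rows by origin and gathers the distinct destinations of each
  // origin into one sorted array column named "destinations".
  def groupDestinations(df: DataFrame): DataFrame =
    df.groupBy("origin")
      .agg(sort_array(collect_set("destination")).as("destinations"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying the sketch out.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("origin-destinations-sketch")
      .getOrCreate()
    import spark.implicits._

    // A few rows taken from the sample data in the question.
    val originDestinationDf = Seq(
      ("LOS ANGELES", "EUROPOORT - E.C.T. DELTA TERMINAL"),
      ("LOS ANGELES", "KAOHSIUNG"),
      ("SHANGHAI", "LOS ANGELES"),
      ("SHANGHAI", "LONG BEACH"),
      ("JEBEL ALI", "KUWAIT")
    ).toDF("origin", "destination")

    val originToDestinations = groupDestinations(originDestinationDf)

    // The destinations column has type array<string>, which can be read
    // back in Scala as Seq[String].
    originToDestinations.printSchema()
    originToDestinations.show(false)

    spark.stop()
  }
}
```

Each origin appears once in the result. The order of the rows themselves is not guaranteed after a groupBy, so add an orderBy("origin") if a stable row order matters.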