I have two DataFrames, df1 and df2, with the schemas shown below.
df1 has the schema:
root
|-- slotSize2: struct (nullable = true)
| |-- 120x600: struct (nullable = true)
| | |-- pView: string (nullable = true)
| |-- 160x600: struct (nullable = true)
| | |-- level: string (nullable = true)
| | |-- pATF: string (nullable = true)
| | |-- pView: string (nullable = true)
| | |-- pViewV1: string (nullable = true)
| | |-- sPos: string (nullable = true)
| |-- 250x250: struct (nullable = true)
| | |-- pView: string (nullable = true)
| |-- 300x250: struct (nullable = true)
| | |-- level: string (nullable = true)
| | |-- pATF: string (nullable = true)
| | |-- pView: string (nullable = true)
| | |-- pViewV1: string (nullable = true)
| | |-- sPos: string (nullable = true)
df2 has the schema:
root
|-- bidId: array (nullable = true)
| |-- element: string (containsNull = true)
|-- slotSize1: array (nullable = true)
| |-- element: string (containsNull = true)
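In df2 the slot sizes are given as strings (in the column named slotSize1), whereas in df1 they appear in nested form: each slot size is a field of the slotSize2 struct and has its own nested mapping.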
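I want to join df1 and df2 into a new DataFrame df3 with the schema (bidId, slotSize, viewMap), where bidId comes from df2, slotSize is a string of the form 120x600 that is present in both DataFrames, and viewMap is the nested mapping that df1 holds for that slotSize.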
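
To make the desired result concrete, here is a minimal plain-Python sketch of the transformation on ordinary dicts and lists. It is not Spark code, only an illustration of the output I am after. All sample values are invented, the helper name `build_df3_rows` is hypothetical, and it assumes that bidId and slotSize1 are parallel arrays (bidId[i] belongs to slotSize1[i]), which the schema alone does not guarantee.

```python
# Hypothetical sample rows; every value here is invented for illustration.

# One df1 row: slotSize2 is a struct whose fields are keyed by slot size.
df1_row = {
    "slotSize2": {
        "120x600": {"pView": "0.5"},
        "160x600": {
            "level": "1",
            "pATF": "0.3",
            "pView": "0.6",
            "pViewV1": "0.7",
            "sPos": "top",
        },
    }
}

# One df2 row: ASSUMED to hold parallel arrays, so bidId[i] pairs with slotSize1[i].
df2_row = {
    "bidId": ["b1", "b2", "b3"],
    "slotSize1": ["120x600", "160x600", "728x90"],
}


def build_df3_rows(df1_row, df2_row):
    """Return (bidId, slotSize, viewMap) tuples for slot sizes found in both rows."""
    slot_maps = df1_row["slotSize2"]
    rows = []
    for bid_id, slot_size in zip(df2_row["bidId"], df2_row["slotSize1"]):
        # Inner-join behaviour: slot sizes missing from df1 are dropped.
        if slot_size in slot_maps:
            rows.append((bid_id, slot_size, slot_maps[slot_size]))
    return rows


df3_rows = build_df3_rows(df1_row, df2_row)
for row in df3_rows:
    print(row)
```

In this sketch the slot size 728x90 is dropped because df1 has no entry for it. In Spark terms I would expect this to mean turning the slotSize2 struct into one row per slot size and pairing each bidId with its slot size before joining on it, but I am not sure how best to express that.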