I have 2 flat datasets with a relationship between them that I want to nest (i.e. convert relational tables into nested JSON).
Setup with some sample data:
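from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.transforms import Join

sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
# the SparkSession used by createDataFrame below
spark = glueContext.spark_session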
# Create some sample data
TableA = spark.createDataFrame(
schema = ['name', 'a_id'],
data = [('Pirate',1),('Monkey',2)]
)
TableB = spark.createDataFrame(
schema = ['name', 'b_id', 'a_id'],
data = [('banana', 1, 1),('ball', 2, 1),('coffee', 3, 2),('plant', 4, 2)]
)
# wrap in Glue DynamicFrame
# note: pretend we started with DynamicFrames, since we're working with Glue ETL Jobs
dfA = DynamicFrame.fromDF(TableA, glueContext, "TableA")
dfA.toDF().show()
dfB = DynamicFrame.fromDF(TableB, glueContext, "TableB")
dfB.toDF().show()
Which prints:
+------+----+
|  name|a_id|
+------+----+
|Pirate|   1|
|Monkey|   2|
+------+----+

+------+----+----+
|  name|b_id|a_id|
+------+----+----+
|banana|   1|   1|
|  ball|   2|   1|
|coffee|   3|   2|
| plant|   4|   2|
+------+----+----+
What I tried, a Join as described in the documentation:
joined = Join.apply(dfA, dfB, 'a_id', 'a_id')
joined.toDF().show()
Which prints:
+----+------+----+------+-----+
|b_id|  name|a_id| .name|.a_id|
+----+------+----+------+-----+
|   1|banana|   1|Pirate|    1|
|   2|  ball|   1|Pirate|    1|
|   3|coffee|   2|Monkey|    2|
|   4| plant|   2|Monkey|    2|
+----+------+----+------+-----+
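What I would like to see instead is each TableA row with its matching TableB rows nested under it. I guess this is a left join with the results grouped... but I don't know how to do it.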
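To make the target shape concrete, here is a minimal plain-Python sketch of the "left join, then group" idea, using the same sample rows. It is not Glue or Spark code and only illustrates the nesting I am after. The helper name `nest_children` and the nested field name `items` are hypothetical, chosen just for this illustration.

```python
import json
from collections import defaultdict

# Same sample rows as TableA and TableB, written as plain dicts.
table_a = [
    {"name": "Pirate", "a_id": 1},
    {"name": "Monkey", "a_id": 2},
]
table_b = [
    {"name": "banana", "b_id": 1, "a_id": 1},
    {"name": "ball", "b_id": 2, "a_id": 1},
    {"name": "coffee", "b_id": 3, "a_id": 2},
    {"name": "plant", "b_id": 4, "a_id": 2},
]


def nest_children(parents, children, key, field):
    """Left join `children` onto `parents` by `key`, grouping matches into a list."""
    grouped = defaultdict(list)
    for child in children:
        # Drop the join key from the nested record; the parent already carries it.
        grouped[child[key]].append({k: v for k, v in child.items() if k != key})
    # Left-join semantics: a parent without matches still appears, with an empty list.
    return [{**parent, field: grouped.get(parent[key], [])} for parent in parents]


nested = nest_children(table_a, table_b, key="a_id", field="items")
print(json.dumps(nested, indent=2))
```

In Spark terms I assume the equivalent would be a left join on `a_id` followed by grouping on the TableA columns and collecting the TableB columns into an array of structs, but I have not gotten that to work with DynamicFrames.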