Ambiguous schema in Spark Scala

Time: 2018-08-29 21:37:29

Tags: scala apache-spark

Schema:

|-- c0: string (nullable = true)
|-- c1: struct (nullable = true)
|    |-- c2: array (nullable = true)
|    |    |-- element: struct (containsNull = true)
|    |    |    |-- orangeID: string (nullable = true)
|    |    |    |-- orangeId: string (nullable = true)

I am trying to flatten the schema above in Spark.

Code:

var df = data.select($"c0",$"c1.*").select($"c0",explode($"c2")).select($"c0",$"col.orangeID", $"col.orangeId")

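The flattening itself works fine. The problem is in the last part, where two column names differ only in the case of one letter (orangeID and orangeId). As a result, I get this error:

Error: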

org.apache.spark.sql.AnalysisException: Ambiguous reference to fields StructField(orangeID,StringType,true), StructField(orangeId,StringType,true);

Any suggestions for avoiding this ambiguity would be great.

1 answer:

Answer 0: (score: 3)

Turn on Spark SQL's case-sensitivity configuration (spark.sql.caseSensitive) and then try the query again.
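As a minimal sketch of that suggestion: Spark SQL resolves field names case-insensitively by default, which is why orangeID and orangeId collide. The example below sets spark.sql.caseSensitive to true on the session and then runs the same select chain as in the question. The object name CaseSensitiveFlatten, the local master, and the sample row are assumptions made for illustration. The schema is rebuilt by hand from the printed schema in the question.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Hypothetical wrapper object; not part of the original question or answer.
object CaseSensitiveFlatten {
  // Element struct whose two fields differ only in the case of the last letter.
  val element: StructType = StructType(Seq(
    StructField("orangeID", StringType),
    StructField("orangeId", StringType)))

  // Schema rebuilt from the one printed in the question.
  val schema: StructType = StructType(Seq(
    StructField("c0", StringType),
    StructField("c1", StructType(Seq(
      StructField("c2", ArrayType(element)))))))

  // Same select chain as in the question, written with col(...) instead of $"...".
  def flatten(data: DataFrame): DataFrame =
    data.select(col("c0"), col("c1.*"))
      .select(col("c0"), explode(col("c2")))
      .select(col("c0"), col("col.orangeID"), col("col.orangeId"))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("case-sensitive-flatten")
      .getOrCreate()

    // Make the analyzer treat orangeID and orangeId as distinct names.
    spark.conf.set("spark.sql.caseSensitive", "true")

    // One made-up row that matches the schema.
    val rows = Seq(Row("a", Row(Seq(Row("1", "2")))))
    val data = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)

    flatten(data).show()
    spark.stop()
  }
}
```

The same setting can also be applied with spark.sql("set spark.sql.caseSensitive=true"). Either way it affects name resolution for the whole session, so other queries that relied on case-insensitive column names may need their casing corrected.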
