Question

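As you know, a DataFrame can contain fields of complex types, such as structs (StructType) or arrays (ArrayType). As in my case, you may need to map all the DataFrame data to a Hive table whose fields have simple types (String, Integer ...). I struggled with this problem for a long time and finally found a solution that I want to share. I am sure it can be improved, so feel free to reply with your own suggestions.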

It is based on this thread, but it also works for ArrayType elements, not just StructType ones. It is a tail-recursive function that takes a DataFrame and returns it flattened.
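A minimal sketch of what such a function could look like is given here for illustration. It is not necessarily the author's code. The object name `FlattenSketch`, the `_` separator used to build column names, and the choice of `explode_outer` (which keeps rows whose array is null or empty) are assumptions made for this example.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, explode_outer}
import org.apache.spark.sql.types.{ArrayType, StructType}
import scala.annotation.tailrec

object FlattenSketch {
  // Repeatedly rewrites the first struct or array column until none are left.
  @tailrec
  def flatten(df: DataFrame): DataFrame = {
    val complex = df.schema.fields.find { f =>
      f.dataType.isInstanceOf[StructType] || f.dataType.isInstanceOf[ArrayType]
    }
    complex match {
      case None => df // only simple (or unsupported, e.g. map) types remain
      case Some(field) =>
        field.dataType match {
          case st: StructType =>
            // Replace the struct column, in place, by one column per child field.
            val cols = df.schema.fields.flatMap { f =>
              if (f.name == field.name)
                st.fieldNames.map(n => col(s"`${field.name}`.`$n`").as(s"${field.name}_$n"))
              else
                Array(col(s"`${f.name}`"))
            }
            flatten(df.select(cols: _*))
          case _ =>
            // Array column: one output row per element, same column name.
            flatten(df.withColumn(field.name, explode_outer(col(s"`${field.name}`"))))
        }
    }
  }
}
```

Calling `FlattenSketch.flatten(df)` returns a DataFrame whose struct and array columns have all been expanded. Two caveats of this sketch: exploding several array columns multiplies the row count, and joining names with `_` can collide with an existing column of the same name.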

private void textBox1_TextChanged(object sender, EventArgs e)
{
    if(!String.IsNullOrEmpty(textBox1.Text))
    {
        PopulateCombo(1);
    }
}

private void textBox2_TextChanged(object sender, EventArgs e)
{
    if(!String.IsNullOrEmpty(textBox2.Text))
    {
        PopulateCombo(2);
    }
}

private void textBox3_TextChanged(object sender, EventArgs e)
{
    if(!String.IsNullOrEmpty(textBox3.Text))
    {
        PopulateCombo(3);
    }
}

private void PopulateCombo(int textBoxID)
{
    // Count how many text boxes currently have a value
    int filledTextboxes = 0;
    if (!String.IsNullOrEmpty(textBox1.Text))
    {
        filledTextboxes++;
    }
    if (!String.IsNullOrEmpty(textBox2.Text))
    {
        filledTextboxes++;
    }
    if (!String.IsNullOrEmpty(textBox3.Text))
    {
        filledTextboxes++;
    }

    // Run one branch if exactly one text box has a value, the other branch otherwise
    if (filledTextboxes == 1)
    {
        switch (textBoxID)
        {
            case 1:
                comboBox1.Items.Clear();
                comboBox1.Items.Add("TextBox1");
                break;
            case 2:
                comboBox1.Items.Clear();
                comboBox1.Items.Add("TextBox2");
                break;
            case 3:
                comboBox1.Items.Clear();
                comboBox1.Items.Add("TextBox3");
                break;
        }
    }
    else
    {
        comboBox1.Items.Clear();
        MessageBox.Show(String.Format("All items cleared because there are {0} boxes with value", filledTextboxes));
    }
}

Answer 1

val df = Seq(("1", (2, (3, 4)), Seq(1, 2))).toDF()

df.printSchema

root
 |-- _1: string (nullable = true)
 |-- _2: struct (nullable = true)
 |    |-- _1: integer (nullable = false)
 |    |-- _2: struct (nullable = true)
 |    |    |-- _1: integer (nullable = false)
 |    |    |-- _2: integer (nullable = false)
 |-- _3: array (nullable = true)
 |    |-- element: integer (containsNull = false)


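import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{ArrayType, StructType}

// Recursively walk the schema and return one Column per leaf field.
def flattenSchema(schema: StructType, fieldName: String = null): Array[Column] = {
  schema.fields.flatMap(f => {
    // Full dotted path to this field, e.g. "_2._2._1"
    val cols = if (fieldName == null) f.name else (fieldName + "." + f.name)
    f.dataType match {
      case structType: StructType => flattenSchema(structType, cols) // recurse into nested structs
      case arrayType: ArrayType => Array(explode(col(cols)))        // one row per array element
      case _ => Array(col(cols))                                     // simple type: select as-is
    }
  })
}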

df.select(flattenSchema(df.schema): _*).printSchema

root
 |-- _1: string (nullable = true)
 |-- _1: integer (nullable = true)
 |-- _1: integer (nullable = true)
 |-- _2: integer (nullable = true)
 |-- col: integer (nullable = false)
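
Note that the flattened schema above contains three columns named `_1`, which would be ambiguous when the result is written to a table. One possible variation, sketched here under the assumption that replacing the dots of the field path with `_` gives acceptable column names, aliases every leaf column with its full path. The object name `FlattenWithAliases` and the `prefix` parameter are hypothetical names chosen for this example.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{col, explode}
import org.apache.spark.sql.types.{ArrayType, StructType}

object FlattenWithAliases {
  // Same traversal as flattenSchema, but each leaf gets a unique alias
  // built from its dotted path, e.g. "_2._2._1" becomes "_2__2__1".
  def flattenSchema(schema: StructType, prefix: Option[String] = None): Array[Column] =
    schema.fields.flatMap { f =>
      val path = prefix.map(_ + "." + f.name).getOrElse(f.name)
      val alias = path.replace(".", "_")
      f.dataType match {
        case st: StructType => flattenSchema(st, Some(path))
        case _: ArrayType   => Array(explode(col(path)).as(alias))
        case _              => Array(col(path).as(alias))
      }
    }
}
```

With the example DataFrame, `df.select(FlattenWithAliases.flattenSchema(df.schema): _*)` should yield the columns `_1`, `_2__1`, `_2__2__1`, `_2__2__2` and `_3`. Like the original, this works when the schema has a single array column, because Spark accepts only one generator such as `explode` per select clause.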

Flattening a DataFrame in Scala that contains different DataTypes
