Sample JSON data:
{"name": "dev","salary": 100,"occupation": "engg","address": "noida"}
{"name": "karthik","salary": 200,"occupation": "engg","address": "blore"}
Spark Java code:
DataFrame df = sqlContext.read().json(jsonPath);
df.printSchema();
df.show(false);
Output:
root
|-- address: string (nullable = true)
|-- name: string (nullable = true)
|-- occupation: string (nullable = true)
|-- salary: long (nullable = true)
+-------+-------+----------+------+
|address|name   |occupation|salary|
+-------+-------+----------+------+
|noida  |dev    |engg      |100   |
|blore  |karthik|engg      |200   |
+-------+-------+----------+------+
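The columns come out in alphabetical order. Is there a way to keep their natural order, that is, the order in which the keys appear in the JSON?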
Answer 0 (score: 1)
You can supply a schema when reading the JSON. The columns then keep the order in which the schema declares them.
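import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Declare the fields in the order the columns should appear.
// IntegerType is enough for these values; DataTypes.LongType would match the inferred schema.
StructType schema = DataTypes.createStructType(new StructField[] {
    DataTypes.createStructField("name", DataTypes.StringType, true),
    DataTypes.createStructField("salary", DataTypes.IntegerType, true),
    DataTypes.createStructField("occupation", DataTypes.StringType, true),
    DataTypes.createStructField("address", DataTypes.StringType, true)});
// With an explicit schema, Spark skips inference and keeps the declared order.
DataFrame df = sqlContext.read().schema(schema).json(jsonPath);
df.printSchema();
df.show(false);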
Answer 1 (score: 1)
You have two options.
The better option is to supply a schema when reading the input.
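If writing the schema by hand is not practical, one possible alternative is to reorder the columns after loading. The sketch below is plain Java and makes no Spark calls: a hypothetical helper, topLevelKeys, scans one JSON line and returns its top-level keys in the order they appear. It assumes every record uses the same keys as the line it is given, and it does not decode \uXXXX escapes in key names.

```java
import java.util.ArrayList;
import java.util.List;

public class JsonKeyOrder {

    /** Returns the top-level keys of one JSON object, in order of appearance. */
    public static List<String> topLevelKeys(String json) {
        List<String> keys = new ArrayList<>();
        int depth = 0; // 1 while directly inside the outermost object
        int i = 0;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (c == '"') {
                // Read a whole string literal, stepping over backslash escapes.
                StringBuilder text = new StringBuilder();
                i++;
                while (i < json.length() && json.charAt(i) != '"') {
                    if (json.charAt(i) == '\\' && i + 1 < json.length()) {
                        i++; // keep the escaped character as-is (no decoding)
                    }
                    text.append(json.charAt(i));
                    i++;
                }
                i++; // step past the closing quote
                // A string followed by ':' is a key; anything else is a value.
                int j = i;
                while (j < json.length() && Character.isWhitespace(json.charAt(j))) {
                    j++;
                }
                if (depth == 1 && j < json.length() && json.charAt(j) == ':') {
                    keys.add(text.toString());
                }
                continue;
            }
            if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
            }
            i++;
        }
        return keys;
    }

    public static void main(String[] args) {
        String firstLine = "{\"name\": \"dev\",\"salary\": 100,"
                + "\"occupation\": \"engg\",\"address\": \"noida\"}";
        System.out.println(topLevelKeys(firstLine));
        // prints [name, salary, occupation, address]
    }
}
```

The resulting list could then be handed to Spark, for example as df.selectExpr(cols.toArray(new String[0])), with the first line read from the input file beforehand. selectExpr parses each string as an SQL expression, so keys containing spaces or dots would need backtick quoting. This route still pays for a schema inference pass over the data, which is one reason supplying the schema up front remains the better option.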