I have a JSON object formatted like this:
{
  "tweet":[
    {"text": "hello world"},
    {"text": "hello world"}
  ]
}

In the code below, when I print out "data", the console tells me I have an `Object tweet: Array[131]`, but when I print out my "dots" value, which I bind the data to, it says my value is `0: Array[1]`. What am I doing wrong?
Answer 0 (score: 1)
As the comments noted, fix your JSON as shown below. I like to run it through a JSON validator (such as https://jsonformatter.curiousconcept.com/) to confirm that I have valid JSON data. A few changes are also needed in your JavaScript; see the updated JavaScript code below.
JSON file
{
"tweet":[
{"text":"hello world"},
{"text":"hello world"}
]
}
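If you'd rather check the file locally than paste it into an online validator, `JSON.parse` gives the same verdict: it returns the parsed object for well-formed JSON and throws a `SyntaxError` otherwise. A minimal sketch (assuming Node.js or a browser console; the strings below are illustrative):

```javascript
// Well-formed JSON, matching the file above: parses cleanly.
const valid = '{"tweet":[{"text":"hello world"},{"text":"hello world"}]}';
const parsed = JSON.parse(valid);
console.log(parsed.tweet.length); // the "tweet" array is what gets bound in D3

// Unquoted keys and single quotes are not valid JSON: JSON.parse throws.
const invalid = "{tweet: [{text: 'hello'}]}";
let ok = true;
try {
  JSON.parse(invalid);
} catch (e) {
  ok = false; // SyntaxError caught here
}
console.log(ok);
```

This is also why `d3.json` is picky: it parses the response with the same JSON grammar, so anything a validator rejects will fail to load.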
JavaScript file
d3.json("tweetsTest.json", function (error, data) {
    if (error) return console.warn(error);
    // tells me I have an `Object tweet: Array[131]`
    console.log(data);
    var dots = d3.select("svg") // modified: d3.select("svg"), not just svg
        .selectAll("circle")
        .data(data.tweet) // modified: need data.tweet because your root key is "tweet"
        .enter()
        .append("circle")
        .attr("r", 5) // added r, cx, and cy
        .attr("cx", function (d, i) {
            return (i + 1) * 20;
        }) // added
        .attr("cy", function (d, i) {
            return 20;
        }); // added
    // says I have `0: Array[1]`
    console.log(dots);
});
HTML file
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="viewport" content="initial-scale=1,maximum-scale=1"/>
<script type="text/javascript" src="d3.js"></script>
<script type="text/javascript" src="d3_stackoverflow34456619.js"></script>
</head>
<body>
<svg style="width:500px;height:500px;border:1px lightgray solid;"></svg>
</body>
</html>