I'm trying to quickly read an in-memory JSON string into a Spark DataFrame.
I've spent a lot of time poring over the Spark API, and the best I can find is
to use it like so:
var someJSON : String = getJSONSomehow()
val someDF : DataFrame = magic.convert(someJSON)
But this feels awkward/wonky and imposes several constraints.
So I ask: is there a direct and more efficient way to convert a JSON string into a Spark DataFrame?
Answer (score: 8)
From the Spark SQL Guide:
// Put the in-memory JSON string into an RDD[String] with a single element.
val otherPeopleRDD = spark.sparkContext.makeRDD(
  """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}""" :: Nil)
// Spark parses each element of the RDD as one JSON record and infers the schema.
val otherPeople = spark.read.json(otherPeopleRDD)
otherPeople.show()
This creates a DataFrame from an intermediate RDD[String] (built by passing in the String).
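As a side note, on Spark 2.2 and later the RDD[String] overload of spark.read.json is deprecated in favor of one that takes a Dataset[String]. Below is a minimal sketch of the same idea, assuming Spark 2.2+ and a local SparkSession; the app name and master setting are placeholders for illustration only:

import org.apache.spark.sql.{DataFrame, SparkSession}

// Minimal sketch, assuming Spark 2.2+ where spark.read.json accepts a Dataset[String].
val spark = SparkSession.builder()
  .appName("json-string-to-df")   // placeholder app name
  .master("local[*]")             // assumption: local run, for illustration only
  .getOrCreate()

import spark.implicits._          // needed for .toDS() on Seq[String]

val someJSON: String =
  """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}"""

// Wrap the in-memory string in a Dataset[String] and let Spark infer the schema.
val someDF: DataFrame = spark.read.json(Seq(someJSON).toDS())

someDF.printSchema()
someDF.show()

Either way Spark infers the schema from the JSON itself; if you convert many small strings this way, supplying an explicit schema via spark.read.schema(...) can avoid the repeated inference pass.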