Question

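I'm new to both Spark and DataFrames, and I have been searching the internet for how to insert data into a MySQL table using DataFrames (Spark with Java). I found plenty of material for Scala, but very little for Java.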

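I followed the steps given at http://www.sparkexpert.com/2015/04/17/save-apache-spark-dataframe-to-database/. It looked simple, but when I tried it myself I ran into problems creating the correct DataFrame schema and inserting data into a table that has an auto_increment field.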

MySQL table (person) schema

+------------+-------------+------+-----+---------+----------------+
| Field      | Type        | Null | Key | Default | Extra          |
+------------+-------------+------+-----+---------+----------------+
| person_id  | int(11)     | NO   | PRI | NULL    | auto_increment |
| first_name | varchar(30) | YES  |     | NULL    |                |
| last_name  | varchar(30) | YES  |     | NULL    |                |
| gender     | char(1)     | YES  |     | NULL    |                |
+------------+-------------+------+-----+---------+----------------+

Java code

DataFrame usersDf = sqlContext.jsonFile("data.json");
usersDf.printSchema();
usersDf.insertIntoJDBC(MYSQL_CONNECTION_URL, "person", false);

data.json

{"person_id":null,"first_name":"Judith1","last_name":"knight1","gender":"M"}
{"person_id":null,"first_name":"Judith2","last_name":"knight2","gender":"F"}
{"person_id":null,"first_name":"Judith3","last_name":"knight3","gender":"M"}
{"person_id":null,"first_name":"Judith4","last_name":"knight4","gender":"M"}

When I run the code above, the DataFrame is created with the following schema:

root
|-- first_name: string (nullable = true)
|-- gender: string (nullable = true)
|-- last_name: string (nullable = true)
|-- person_id: string (nullable = true)

whereas the schema should be:

root
|-- person_id: integer (nullable = false)
|-- first_name: string (nullable = true)
|-- last_name: string (nullable = true)
|-- gender: string (nullable = true)

Because the schema is created incorrectly, I get the following error:

java.sql.SQLException: Incorrect integer value: 'Judith1' for column 'person_id' at row 1
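
One plausible reading of this error, sketched below in plain Java with no Spark involved: the inferred schema lists the fields alphabetically, and if the insert binds values by position rather than by column name (an assumption about how insertIntoJDBC behaves, not something confirmed here), the first value, first_name, lands in the table's first column, person_id. The class and method names are hypothetical and exist only for this illustration; the TreeMap merely mimics the alphabetical field order seen in the printed schema.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PositionalInsertDemo {

    // Column order of the MySQL table, as listed in the table schema above.
    static final String[] TABLE_COLUMNS = {"person_id", "first_name", "last_name", "gender"};

    // First record of data.json. A TreeMap iterates its keys alphabetically,
    // which mimics the field order of the inferred DataFrame schema.
    public static Map<String, String> firstRecord() {
        Map<String, String> row = new TreeMap<>();
        row.put("person_id", null);
        row.put("first_name", "Judith1");
        row.put("last_name", "knight1");
        row.put("gender", "M");
        return row;
    }

    // Pairs each table column with the value found at the same position in the row,
    // ignoring field names entirely (the assumed positional behaviour).
    public static List<String> bindByPosition(Map<String, String> row) {
        List<String> bindings = new ArrayList<>();
        int i = 0;
        for (String value : row.values()) {
            bindings.add(TABLE_COLUMNS[i++] + " <- " + value);
        }
        return bindings;
    }

    public static void main(String[] args) {
        // Prints one "column <- value" line per table column.
        for (String binding : bindByPosition(firstRecord())) {
            System.out.println(binding);
        }
    }
}
```

Under that assumption the first binding is person_id <- Judith1, which matches the value and column named in the exception.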

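Please let me know how to resolve this. I'm sure it is something small, but I can't find it. Any help would be greatly appreciated. Thanks in advance.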
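
For comparison, here is a minimal sketch of the general technique for auto_increment columns, written with plain JDBC rather than the Spark DataFrame API: name the target columns in the INSERT and leave person_id out, so MySQL assigns it. The class name, method names, and the idea of passing in an already opened Connection are assumptions made for illustration; this is not the method from the linked article.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;

public class PersonInsertSketch {

    // Builds an INSERT statement that names its target columns explicitly,
    // so the order of the values no longer has to match the table definition.
    public static String buildInsertSql(String table, String... columns) {
        String columnList = String.join(", ", columns);
        String placeholders = String.join(", ", Collections.nCopies(columns.length, "?"));
        return "INSERT INTO " + table + " (" + columnList + ") VALUES (" + placeholders + ")";
    }

    // Inserts one person; person_id is omitted so the auto_increment column fills itself.
    // The caller is assumed to supply an open connection to the MySQL database.
    public static void insertPerson(Connection conn, String firstName,
                                    String lastName, String gender) throws SQLException {
        String sql = buildInsertSql("person", "first_name", "last_name", "gender");
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, firstName);
            stmt.setString(2, lastName);
            stmt.setString(3, gender);
            stmt.executeUpdate();
        }
    }
}
```

Whether the DataFrame writer can be made to issue a statement of this shape depends on the Spark version, so treat this only as a picture of the SQL that would avoid the mismatch.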

How to insert data into MySQL using Spark DataFrames in Java

0 answers: