sqoop import
--connect jdbc:mysql://localhost/classicmodels
--username root --password cloudera
--query ' select c.customernumber, c.customername, o.orderdate, o.ordernumber from customers AS c JOIN orders As o ON c.customernumber = o.customernumber WHERE $CONDITIONS '
--boundary-query 'select min(customernumber), max(customernumber) from customers '
--target-dir /data/info/customerdata/join
--split-by customernumber ;
mysql> describe customers ;
+------------------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+---------------+------+-----+---------+-------+
| customerNumber | int(11) | NO | PRI | NULL | |
| customerName | varchar(50) | NO | | NULL | |
| contactLastName | varchar(50) | NO | | NULL | |
| contactFirstName | varchar(50) | NO | | NULL | |
| phone | varchar(50) | NO | | NULL | |
| addressLine1 | varchar(50) | NO | | NULL | |
| addressLine2 | varchar(50) | YES | | NULL | |
| city | varchar(50) | NO | | NULL | |
| state | varchar(50) | YES | | NULL | |
| postalCode | varchar(15) | YES | | NULL | |
| country | varchar(50) | NO | | NULL | |
| salesRepEmployeeNumber | int(11) | YES | MUL | NULL | |
| creditLimit | decimal(10,2) | YES | | NULL | |
+------------------------+---------------+------+-----+---------+-------+
mysql> describe orders ;
+----------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+-------+
| orderNumber | int(11) | NO | PRI | NULL | |
| orderDate | date | NO | | NULL | |
| requiredDate | date | NO | | NULL | |
| shippedDate | date | YES | | NULL | |
| status | varchar(15) | NO | | NULL | |
| comments | text | YES | | NULL | |
| customerNumber | int(11) | NO | MUL | NULL | |
+----------------+-------------+------+-----+---------+-------+
答案 0 :(得分:0)
尝试在查询的where子句中使用sql的tablename.column名称语法。
答案 1 :(得分:0)
sqoop import
--connect jdbc:mysql://localhost/classicmodels
--username root --password cloudera
--query 'select customers.customernumber, customers.customername,
orders.orderdate, orders.ordernumber FROM customers, orders WHERE
customers.customernumber = orders.customernumber AND $CONDITIONS'
--boundary-query 'select min(customernumber), max(customernumber) from customers'
--target-dir /data/info/customerdata/join
--split-by customers.customernumber ;
对于--boundary-query
,请确保customernumber应为数字列,且不应为null。