PySpark: saveAsTable creates the wrong table after adding some columns

Time: 2019-05-21 06:56:00

Tags: hive pyspark

First, the Hive context is created.

from pyspark import HiveContext
hc = HiveContext(sc)

Then I read the CSV file:

t2 = hc.read.csv(dict_path,header=True)

If I call t2.write.saveAsTable('test0') directly at this point, I get the expected table in Hive and everything works.

Now I add some columns:

from datetime import datetime
from pyspark.sql import functions as F
from pyspark.sql.functions import col, udf
from pyspark.sql.types import DateType

# UDF that parses strings such as '8/29/2013 14:13'; DateType keeps only the date part
func = udf(lambda x: datetime.strptime(x, '%m/%d/%Y %H:%M'), DateType())
# newcol is not used below; RDD transformations are lazy, so nothing is computed here
newcol = t2.select('End_Date').rdd.flatMap(lambda x: datetime.strptime(x, '%m/%d/%Y %H:%M'))

# Parsed date columns
t2 = t2.withColumn('start_date1', func('Start_Date'))
t2 = t2.withColumn('End_Date1', func('End_Date'))

# Year-month strings derived from the parsed dates
t2 = t2.withColumn('end_date_month', F.date_format('End_Date1', 'yyy-MM'))
t2 = t2.withColumn('start_date_month', F.date_format('Start_Date1', 'yyy-MM'))
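
For reference, the per-row work done by the UDF and the month formatting can be checked outside Spark. The following is only a minimal sketch in plain Python, assuming the input strings always follow the '%m/%d/%Y %H:%M' pattern used above; the helper names parse_trip_date and month_label are hypothetical and do not appear in the question.

```python
from datetime import datetime

# Same format string the question passes to strptime
FMT = '%m/%d/%Y %H:%M'

def parse_trip_date(text):
    """Hypothetical helper: parse e.g. '8/29/2013 14:13' and keep only the date."""
    return datetime.strptime(text, FMT).date()

def month_label(d):
    """Hypothetical helper: format a date as a year-month string."""
    return d.strftime('%Y-%m')

start = parse_trip_date('8/29/2013 14:13')
print(start)               # 2013-08-29
print(month_label(start))  # 2013-08
```

These values match the start_date1 and start_date_month columns in the first row of the DataFrame output, which suggests the column logic itself is not what goes wrong.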

After running this code, the DataFrame looks correct:

+-------+--------+---------------+--------------------+--------------+---------------+--------------------+------------+-------+-----------------+--------+-----------+----------+--------------+----------------+
|Trip_ID|Duration|     Start_Date|       Start_Station|Start_Terminal|       End_Date|         End_Station|End_Terminal|Bike_id|Subscription_Type|Zip_Code|start_date1| End_Date1|end_date_month|start_date_month|
+-------+--------+---------------+--------------------+--------------+---------------+--------------------+------------+-------+-----------------+--------+-----------+----------+--------------+----------------+
|   4576|      63|8/29/2013 14:13|South Van Ness at...|            66|8/29/2013 14:14|South Van Ness at...|          66|    520|       Subscriber|   94127| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4607|      70|8/29/2013 14:42|  San Jose City Hall|            10|8/29/2013 14:43|  San Jose City Hall|          10|    661|       Subscriber|   95138| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4130|      71|8/29/2013 10:16|Mountain View Cit...|            27|8/29/2013 10:17|Mountain View Cit...|          27|     48|       Subscriber|   97214| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4251|      77|8/29/2013 11:29|  San Jose City Hall|            10|8/29/2013 11:30|  San Jose City Hall|          10|     26|       Subscriber|   95060| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4299|      83|8/29/2013 12:02|South Van Ness at...|            66|8/29/2013 12:04|      Market at 10th|          67|    319|       Subscriber|   94103| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4927|     103|8/29/2013 18:54| Golden Gate at Polk|            59|8/29/2013 18:56| Golden Gate at Polk|          59|    527|       Subscriber|   94109| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4500|     109|8/29/2013 13:25|Santa Clara at Al...|             4|8/29/2013 13:27|    Adobe on Almaden|           5|    679|       Subscriber|   95112| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4563|     111|8/29/2013 14:02| San Salvador at 1st|             8|8/29/2013 14:04| San Salvador at 1st|           8|    687|       Subscriber|   95112| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4760|     113|8/29/2013 17:01|South Van Ness at...|            66|8/29/2013 17:03|South Van Ness at...|          66|    553|       Subscriber|   94103| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4258|     114|8/29/2013 11:33|  San Jose City Hall|            10|8/29/2013 11:35|         MLK Library|          11|    107|       Subscriber|   95060| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4549|     125|8/29/2013 13:52|     Spear at Folsom|            49|8/29/2013 13:55|Embarcadero at Br...|          54|    368|       Subscriber|   94109| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4498|     126|8/29/2013 13:23|    San Pedro Square|             6|8/29/2013 13:25|Santa Clara at Al...|           4|     26|       Subscriber|   95112| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4965|     129|8/29/2013 19:32|Mountain View Cal...|            28|8/29/2013 19:35|Mountain View Cal...|          28|    140|       Subscriber|   94041| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4557|     130|8/29/2013 13:57|   2nd at South Park|            64|8/29/2013 13:59|   2nd at South Park|          64|    371|       Subscriber|   94122| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4386|     134|8/29/2013 12:31|     Clay at Battery|            41|8/29/2013 12:33|     Beale at Market|          56|    503|       Subscriber|   94109| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4749|     138|8/29/2013 16:57|     Post at Kearney|            47|8/29/2013 16:59|     Post at Kearney|          47|    408|       Subscriber|   94117| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4242|     141|8/29/2013 11:25|  San Jose City Hall|            10|8/29/2013 11:27|  San Jose City Hall|          10|     26|       Subscriber|   95060| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   4329|     142|8/29/2013 12:11|      Market at 10th|            67|8/29/2013 12:14|      Market at 10th|          67|    319|       Subscriber|   94103| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   5097|     142|8/29/2013 22:21|   Steuart at Market|            74|8/29/2013 22:24|Harry Bridges Pla...|          50|    564|       Subscriber|   94115| 2013-08-29|2013-08-29|       2013-08|         2013-08|
|   5084|     144|8/29/2013 22:06|  Powell Street BART|            39|8/29/2013 22:08|       Market at 4th|          76|    574|       Subscriber|   94115| 2013-08-29|2013-08-29|       2013-08|         2013-08|
+-------+--------+---------------+--------------------+--------------+---------------+--------------------+------------+-------+-----------------+--------+-----------+----------+--------------+----------------+

But when I save it to Hive with saveAsTable, I get the wrong table.

The table schema is wrong:

test
col (array)
item (string)

What am I missing?

0 answers:

No answers yet