fillna(0) changes other values

Time: 2018-02-22 09:44:53

Tags: python pyspark

These are the column types (dtypes) of the frequency_creative DataFrame:

[('tags', 'int'), 
 ('user_id', 'bigint'), 
 ('processdate', 'date'), 
 ('brandsurvey_name', 'string'), 
 ('survey_id', 'string'), 
 ('questionid', 'string'), 
 ('questiontext', 'string'), 
 ('frequency', 'bigint'), 
 ('creative_id', 'string')]

When I use fillna(0), some values in the user_id column get corrupted. Without fillna(0), everything works as expected.

frequency_creative.select("user_id").distinct().show()

 +-------------------+
 |            user_id|
 +-------------------+
 |1665009053012894694|
 | 840031193618494976|
 +-------------------+


frequency_creative = frequency_creative.select("processdate","tags","survey_id","brandsurvey_name","user_id","questionid","questiontext","frequency","creative_id").fillna(0)
frequency_creative.select("user_id").distinct().show()

after select
+-------------------+
|            user_id|
+-------------------+
|1665009053012894720|
| 840031193618494976|
+-------------------+
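
The changed user_id is exactly what a round trip through a 64-bit float would produce, which suggests (but does not prove) that the bigint column is passing through a double somewhere inside fillna(0). A minimal sketch in plain Python, with no Spark involved, reproduces the same two values from the output above:

```python
# Values copied from the output above.
original = 1665009053012894694
unchanged = 840031193618494976

# A 64-bit float carries 53 bits of precision, so integers this large
# are rounded to the nearest representable value when converted.
round_tripped = int(float(original))
print(round_tripped)                        # 1665009053012894720

# The second ID happens to be exactly representable, so it survives.
print(int(float(unchanged)) == unchanged)   # True
```

This only shows that the arithmetic matches; it does not confirm where the conversion happens in the Spark version used here.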
**********************************************
without fillna(0)


before select
+-------------------+
|            user_id|
+-------------------+
|1665009053012894694|
| 840031193618494976|
+-------------------+

frequency_creative = frequency_creative.select("processdate","tags","survey_id","brandsurvey_name","user_id","questionid","questiontext","frequency","creative_id")
frequency_creative.select("user_id").distinct().show()

after select
+-------------------+
|            user_id|
+-------------------+
|1665009053012894694|
| 840031193618494976|
+-------------------+
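
One possible way to keep fillna away from user_id is to pass it an explicit list of columns through its subset parameter. The sketch below only builds that list from the dtypes shown at the top; fill_columns and protect are hypothetical names, it assumes user_id has no nulls that need filling, and whether it avoids the problem on this Spark version is untested.

```python
# dtypes as listed in the question.
dtypes = [
    ("tags", "int"),
    ("user_id", "bigint"),
    ("processdate", "date"),
    ("brandsurvey_name", "string"),
    ("survey_id", "string"),
    ("questionid", "string"),
    ("questiontext", "string"),
    ("frequency", "bigint"),
    ("creative_id", "string"),
]

# Spark type names treated as numeric for this sketch.
NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double"}

def fill_columns(dtypes, protect=("user_id",)):
    """Return numeric column names, skipping the ones listed in protect."""
    return [name for name, dtype in dtypes
            if dtype in NUMERIC_TYPES and name not in protect]

subset = fill_columns(dtypes)
print(subset)  # ['tags', 'frequency']

# Hypothetical usage on the DataFrame from the question:
# frequency_creative = frequency_creative.fillna(0, subset=subset)
```

Note that frequency is also a bigint, so it would still be filled; that only matters if its values are large enough to lose precision in the same way.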

0 answers:

No answers