PySpark: identifying whether a trip is Day or Night

Time: 2020-10-18 21:24:35

Tags: pyspark apache-spark-sql pyspark-dataframes

My DataFrame looks like this:

+--------------------+---------------------+-------------+------------+
|tpep_pickup_datetime|tpep_dropoff_datetime|trip_distance|total_amount|
+--------------------+---------------------+-------------+------------+
| 2019-01-01 08:53:20|  2019-01-01 09:01:00|          1.5|        2.00|
| 2019-01-01 21:18:59|  2019-01-01 21:59:59|          2.6|        5.00|
| 2019-01-01 08:53:20|  2019-01-01 10:01:00|          1.5|        2.00|
| 2019-01-01 21:18:59|  2019-01-01 22:59:59|          2.6|        5.00|
+--------------------+---------------------+-------------+------------+

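I need to create a table that computes the trip_rate (total_amount / trip_distance) across all Night trips and all Day trips, so that the final result looks like this: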


+------------+-----------+
| day_night  | trip_rate |
+------------+-----------+
|Day         | 1.33      |
|Night       | 1.92      |
+------------+-----------+
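
As a sanity check on the expected figures, the arithmetic can be reproduced from the four sample rows in plain Python. This is only a sketch: it assumes trip_rate means sum(total_amount) / sum(trip_distance) per group (the question does not say whether it is that or an average of per-trip ratios; both give the same numbers here), and the names `rows` and `trip_rate` are made up for illustration. The 08:53 rows reproduce the Day figure and the 21:18 rows reproduce the Night figure.

```python
# Sample rows from the question: (pickup time, trip_distance, total_amount).
rows = [
    ("2019-01-01 08:53:20", 1.5, 2.00),
    ("2019-01-01 21:18:59", 2.6, 5.00),
    ("2019-01-01 08:53:20", 1.5, 2.00),
    ("2019-01-01 21:18:59", 2.6, 5.00),
]


def trip_rate(group):
    """Assumed definition: total amount divided by total distance, rounded to 2 places."""
    total_amount = sum(amount for _, _, amount in group)
    total_distance = sum(distance for _, distance, _ in group)
    return round(total_amount / total_distance, 2)


# Split the sample by pickup time of day (the two distinct times in the sample).
morning = [r for r in rows if r[0].endswith("08:53:20")]
evening = [r for r in rows if r[0].endswith("21:18:59")]

print(trip_rate(morning))  # 1.33
print(trip_rate(evening))  # 1.92
```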

I am having trouble distinguishing Night from Day, given the following rule:

    day_night will have 'Day' or 'Night':
        - From 9am to 8:59:59pm - Day
        - From 9pm to 8:59:59am - Night
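
One possible way to express this rule is to look only at the hour of the timestamp: hours 9 through 20 are Day, everything else is Night. The sketch below is not a confirmed answer. It assumes the pickup time (`tpep_pickup_datetime`) decides the label, which the question does not state, and that trip_rate is sum(total_amount) / sum(trip_distance) per group. The helper names `classify`, `with_day_night` and `trip_rate_by_period` are hypothetical. Note that under the rule exactly as written, an 08:53:20 pickup falls in the Night window.

```python
from datetime import datetime

DAY_START_HOUR = 9   # 9am, inclusive
DAY_END_HOUR = 21    # 9pm, exclusive (so 8:59:59pm is still Day)


def classify(ts: datetime) -> str:
    """Plain-Python version of the rule, handy for checking boundary cases."""
    return "Day" if DAY_START_HOUR <= ts.hour < DAY_END_HOUR else "Night"


def with_day_night(df, ts_col="tpep_pickup_datetime"):
    """Add a day_night column to a Spark DataFrame using the same hour rule.

    Assumption: ts_col holds the timestamp that decides the label.
    pyspark is imported here so that classify() can be used without Spark.
    """
    from pyspark.sql import functions as F

    hour = F.hour(F.col(ts_col))
    is_day = (hour >= DAY_START_HOUR) & (hour < DAY_END_HOUR)
    return df.withColumn("day_night", F.when(is_day, "Day").otherwise("Night"))


def trip_rate_by_period(df):
    """Aggregate to one row per day_night value.

    Assumption: trip_rate = sum(total_amount) / sum(trip_distance).
    """
    from pyspark.sql import functions as F

    return (
        with_day_night(df)
        .groupBy("day_night")
        .agg(
            F.round(F.sum("total_amount") / F.sum("trip_distance"), 2)
            .alias("trip_rate")
        )
    )
```

With a DataFrame `df` shaped like the one in the question, `trip_rate_by_period(df).show()` would print the aggregated table. Comparing hours avoids any string handling of the time part, and the boundaries can be changed in one place if the intended windows differ from the ones written above.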

0 answers:

No answers