Question

I have a requirement to find the most popular start hour. Below is the code that helped me arrive at the correct solution.

import time
import pandas as pd
import numpy as np

# bunch of code comes
# here
# that help in reaching the following steps

df = pd.read_csv(CITY_DATA[selected_city])

# convert the Start Time column to datetime
df['Start Time'] = pd.to_datetime(df['Start Time'])

# extract hour from the Start Time column to create an hour column
df['hour'] = df['Start Time'].dt.hour

# extract month and day of week from Start Time to create new columns
df['month'] = df['Start Time'].dt.month

# .dt.weekday_name was removed in pandas 1.0; .dt.day_name() is its replacement
df['day_of_week'] = df['Start Time'].dt.day_name()

# find the most popular hour
popular_hour = df['hour'].mode()[0]

This is the sample output I get when I run:

print(df['hour'])

0         15
1         17
2          8
3         13
4         14
5          9
6          9
7         17
8         16
9         17
10         7
11        17
Name: hour, Length: 300000, dtype: int64

The output I get when using

print(type(df['hour']))

<class 'pandas.core.series.Series'>

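The most popular start hour is stored in popular_hour, which equals 17 (the correct value).

But I can't understand the .mode()[0] part. What does .mode() do, and why the [0]?

Does the same approach work for finding the most popular month and the most popular day of the week, regardless of their data types?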

Answer 1

mode returns a Series:

df['hour'].mode()
0    17
dtype: int64

From this, you get the first item by calling

df['hour'].mode()[0]
17
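
To make the two steps concrete, here is a minimal sketch that separates the call to mode() from the indexing. The `hours` Series is an invented stand-in for df['hour'], not data from the question.

```python
import pandas as pd

# Hypothetical stand-in for df['hour']; the values are invented for illustration.
hours = pd.Series([15, 17, 8, 13, 17, 9, 9, 17], name="hour")

modes = hours.mode()       # a Series holding the most frequent value(s)
popular_hour = modes[0]    # the entry at index label 0, i.e. the first mode

print(type(modes).__name__)  # Series
print(popular_hour)          # 17
```

Because the result is a Series rather than a bare number, the [0] is what turns it into a single value.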

Note that a Series is always returned, and when there are multiple modes, all of them are returned:

pd.Series([1, 1, 2, 2, 3, 3]).mode()
0    1
1    2
2    3
dtype: int64

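With [0] you would still take only the first value each time and discard the rest. Note that when multiple modes are returned, they are always sorted.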
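
As a rough illustration of the tie behaviour on a non-numeric column, such as day names, the sketch below uses an invented `days` Series in which two values tie. It suggests the same mode()[0] pattern applies to strings, with the caveat that [0] silently picks the first of the sorted modes.

```python
import pandas as pd

# Invented data: "Monday" and "Friday" each appear twice, so they tie.
days = pd.Series(["Monday", "Friday", "Monday", "Friday", "Sunday"])

day_modes = days.mode()
print(day_modes.tolist())  # ['Friday', 'Monday'] (sorted, not in order of appearance)
print(day_modes[0])        # Friday

# value_counts() exposes the counts behind the tie.
counts = days.value_counts()
print(counts["Monday"], counts["Friday"])  # 2 2
```

If ties matter for your analysis, checking len(day_modes) before taking [0] is one way to notice them.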

For more information, read the documentation on mode.
