I have this df:
df = pd.DataFrame({"on": [1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]},
index=pd.date_range(start = "2020-04-09 6:45", periods = 30, freq = '8H'))
and would like to create a weekly profile for the df['on'] column.
I can add the weekday and the time as follows:
df['day_name'] = df.index.day_name()
df['time'] = df.index.time
which gives this df:
on day_name time
2020-04-09 06:45:00 1 Thursday 06:45:00
2020-04-09 14:45:00 1 Thursday 14:45:00
2020-04-09 22:45:00 1 Thursday 22:45:00
2020-04-10 06:45:00 1 Friday 06:45:00
2020-04-10 14:45:00 1 Friday 14:45:00
2020-04-10 22:45:00 0 Friday 22:45:00
2020-04-11 06:45:00 0 Saturday 06:45:00
2020-04-11 14:45:00 0 Saturday 14:45:00
2020-04-11 22:45:00 1 Saturday 22:45:00
2020-04-12 06:45:00 0 Sunday 06:45:00
2020-04-12 14:45:00 0 Sunday 14:45:00
2020-04-12 22:45:00 1 Sunday 22:45:00
2020-04-13 06:45:00 1 Monday 06:45:00
2020-04-13 14:45:00 0 Monday 14:45:00
2020-04-13 22:45:00 0 Monday 22:45:00
2020-04-14 06:45:00 0 Tuesday 06:45:00
2020-04-14 14:45:00 0 Tuesday 14:45:00
2020-04-14 22:45:00 1 Tuesday 22:45:00
2020-04-15 06:45:00 0 Wednesday 06:45:00
2020-04-15 14:45:00 1 Wednesday 14:45:00
2020-04-15 22:45:00 1 Wednesday 22:45:00
2020-04-16 06:45:00 0 Thursday 06:45:00
2020-04-16 14:45:00 0 Thursday 14:45:00
2020-04-16 22:45:00 0 Thursday 22:45:00
2020-04-17 06:45:00 1 Friday 06:45:00
2020-04-17 14:45:00 1 Friday 14:45:00
2020-04-17 22:45:00 1 Friday 22:45:00
2020-04-18 06:45:00 0 Saturday 06:45:00
2020-04-18 14:45:00 0 Saturday 14:45:00
2020-04-18 22:45:00 0 Saturday 22:45:00
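Can someone help me get the probability that df['on'] == 1 for a given time slot (for example, Tuesday 22:45)? Ideally this would be a curve over the whole week.
(In this example, the probability for Thursday 22:45 is 1/2.)
Thanks a lot :)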
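Answer 0 (score: 2)
I considered two options from your question:
1. Ratio per weekday:
I computed the ratio (probability) of 'on'==1 for each day of the week:
# Ratio for each weekday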
df_2=pd.DataFrame()
df_2['Ones']=df[df['on']==1]['day_name'].value_counts()
df_2['All']=df['day_name'].value_counts()
df_2['Ratio']=df_2['Ones']/df_2['All']
df_2
Here is the output:
Ones All Ratio
Friday 5 6 0.833333
Thursday 3 6 0.500000
Wednesday 2 3 0.666667
Monday 1 3 0.333333
Saturday 1 6 0.166667
Sunday 1 3 0.333333
Tuesday 1 3 0.333333
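One edge case worth noting: as far as I can tell, a weekday that never has on == 1 would drop out of the result above, because df_2 takes its index from the filtered value_counts. A minimal sketch of an alternative that keeps every weekday, assuming the same data as in the question (the name per_day is just illustrative, and an 8-hour Timedelta stands in for freq='8H'):

```python
import pandas as pd

on = [1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1,
      1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
# Same index as in the question: 30 samples, 8 hours apart.
idx = pd.date_range(start="2020-04-09 6:45", periods=30,
                    freq=pd.Timedelta(hours=8))
df = pd.DataFrame({"on": on}, index=idx)
df["day_name"] = df.index.day_name()

# "sum" counts the ones, "count" counts all rows of each weekday,
# so a weekday without any ones gets Ones == 0 instead of vanishing.
per_day = df.groupby("day_name")["on"].agg(Ones="sum", All="count")
per_day["Ratio"] = per_day["Ones"] / per_day["All"]
print(per_day)
```

The numbers should match the table above; only the row order differs, since groupby sorts the weekday names alphabetically.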
2. Ratio per weekday and time:
Here I computed the ratio of the "on" column being 1 at time "x" on weekday "y":
# Ratio for each weekday and time
df_3 = df.groupby(['day_name', 'time']).agg({'on': 'count'})
df_3['ones'] = df.groupby(['day_name', 'time']).agg({'on': 'sum'})
df_3['Ratio'] = df_3['ones']/df_3['on']
df_3
Here is the output:
on ones Ratio
day_name time
Friday 06:45:00 2 2 1.0
14:45:00 2 2 1.0
22:45:00 2 1 0.5
Monday 06:45:00 1 1 1.0
14:45:00 1 0 0.0
22:45:00 1 0 0.0
Saturday 06:45:00 2 0 0.0
14:45:00 2 0 0.0
22:45:00 2 1 0.5
Sunday 06:45:00 1 0 0.0
14:45:00 1 0 0.0
22:45:00 1 1 1.0
Thursday 06:45:00 2 1 0.5
14:45:00 2 1 0.5
22:45:00 2 1 0.5
Tuesday 06:45:00 1 0 0.0
14:45:00 1 0 0.0
22:45:00 1 1 1.0
Wednesday 06:45:00 1 0 0.0
14:45:00 1 1 1.0
22:45:00 1 1 1.0
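To read a single slot out of such a table (the question asks about Tuesday 22:45 and Thursday 22:45), the two-level index can be addressed with a (weekday, time) tuple. A minimal sketch, assuming the data from the question; slot_probability is a hypothetical helper, not part of pandas:

```python
import datetime as dt
import pandas as pd

on = [1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1,
      1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
# An 8-hour Timedelta stands in for the question's freq='8H'.
idx = pd.date_range(start="2020-04-09 6:45", periods=30,
                    freq=pd.Timedelta(hours=8))
df = pd.DataFrame({"on": on}, index=idx)
df["day_name"] = df.index.day_name()
df["time"] = df.index.time

# Ones and totals per (weekday, time) slot, then their ratio.
grouped = df.groupby(["day_name", "time"])["on"].agg(["sum", "count"])
ratio = grouped["sum"] / grouped["count"]

def slot_probability(day, hour, minute):
    """Hypothetical helper: share of ones for one weekday/time slot."""
    return ratio.loc[(day, dt.time(hour, minute))]

print(slot_probability("Thursday", 22, 45))
print(slot_probability("Tuesday", 22, 45))
```

The time level holds datetime.time objects, so the lookup key has to be a datetime.time as well, not a string such as "22:45".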
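To address your request, I had to modify the code above: I sort the index in weekday order, merge the two index levels into a single index, and convert that index to strings to avoid problems when plotting. Here is the new code:
# Ratio for each weekday and time, in weekday order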
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
df_3 = df.groupby(['day_name', 'time']).agg({'on': 'count'})
df_3['ones'] = df.groupby(['day_name', 'time']).agg({'on': 'sum'})
df_3['Ratio'] = df_3['ones']/df_3['on']
df_3 = df_3.reindex(days, level=0)
df_3.index = [str(i) for i in (df_3.index.map('{0[0]} : {0[1]}'.format))]
df_3
With the previous modifications in place, we can easily plot the ratio:
# Graph of the probability
import matplotlib.pyplot as plt
plt.figure()
plt.plot(df_3.index, df_3['Ratio'])
plt.xlabel('Date')
plt.xticks(rotation=90)
plt.title('Probability of "on"=1')
Here is the graph:
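As a tabular complement to the plot, the same ratios can be arranged as a grid with one row per time of day and one column per weekday. This is only a sketch under the assumption that the data is the df from the question; the name profile is illustrative:

```python
import datetime as dt
import pandas as pd

on = [1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1,
      1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
# An 8-hour Timedelta stands in for the question's freq='8H'.
idx = pd.date_range(start="2020-04-09 6:45", periods=30,
                    freq=pd.Timedelta(hours=8))
df = pd.DataFrame({"on": on}, index=idx)
df["day_name"] = df.index.day_name()
df["time"] = df.index.time

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

# Rows: time of day, columns: weekday, cells: share of rows with on == 1.
profile = df.pivot_table(index="time", columns="day_name",
                         values="on", aggfunc="mean")
# pivot_table sorts the columns alphabetically; put them in weekday order.
profile = profile.reindex(columns=days)
print(profile)
```

Each cell is the mean of a 0/1 column, which is the same ratio as ones divided by count in df_3.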
Answer 1 (score: 1)
I believe you only need mean: the mean of a 0/1 column within each group equals the share of ones.
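A minimal sketch of that idea, not necessarily the answerer's exact code, assuming the df and the day_name/time helper columns from the question (the names days and weekly are illustrative):

```python
import datetime as dt
import pandas as pd

on = [1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1,
      1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
# An 8-hour Timedelta stands in for the question's freq='8H'.
idx = pd.date_range(start="2020-04-09 6:45", periods=30,
                    freq=pd.Timedelta(hours=8))
df = pd.DataFrame({"on": on}, index=idx)
df["day_name"] = df.index.day_name()
df["time"] = df.index.time

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

# Mean of the 0/1 column per (weekday, time) slot, in weekday order.
weekly = (df.groupby(["day_name", "time"])["on"]
            .mean()
            .reindex(days, level=0))
print(weekly)
```

This replaces the separate count, sum and division steps with a single aggregation; the values should agree with the Ratio column computed in Answer 0.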