I have a send_request() function that pulls data in CSV format.
import io

import pandas as pd
from pull_data import send_request
from openpyxl import load_workbook
from datetime import datetime

def max_nb_rows(season_slug, urls_ws):
    date = datetime.today().strftime('%Y%m%d')
    for i in range(100):
        d = []
        # fetch the CSV payload and parse it into a DataFrame
        response = send_request(season_slug, urls_ws, date).content
        df = pd.read_csv(io.StringIO(response.decode('utf-8')))

max_nb_rows('2015-2016-regular', 'cumulative-player-stats')
I want to find the maximum number of rows of df over 100 consecutive days. How can I do this?
Answer 0 (score: 1)
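I think you need to filter for dates after today first, then count the rows per date with value_counts. Its output is sorted in descending order, so iat[0] selects the first value, which is the largest count: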
# imports as in the question (io, pandas as pd, datetime, send_request)
def max_nb_rows(season_slug, urls_ws):
    lens = []
    date = datetime.today().strftime('%Y-%m-%d')
    for i in range(100):
        d = []
        response = send_request(season_slug, urls_ws, date).content
        df = pd.read_csv(io.StringIO(response.decode('utf-8')))
        # rows per date after today; the first entry is the largest count
        lens.append(df.loc[df['Date'] > date, 'Date'].value_counts().iat[0])
    return max(lens)

max_nb_rows('2015-2016-regular', 'cumulative-player-stats')
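If the goal is literally the largest len(df) across 100 different days, the date passed to the request has to change on each iteration. Here is a minimal sketch of that reading, not a drop-in replacement. It assumes the days run backwards from a start date and that the request takes a 'YYYYMMDD' string, as in the question. The names max_rows_over_days, fetch_csv and fake_fetch are hypothetical; fake_fetch only stands in for send_request(...).content so the sketch runs without network access.

```python
import io
from datetime import date, timedelta

import pandas as pd


def max_rows_over_days(fetch_csv, start, n_days=100):
    """Return the largest row count among the CSVs fetched for n_days consecutive days.

    fetch_csv is a hypothetical callable: it takes a 'YYYYMMDD' string and
    returns the CSV payload as bytes.
    """
    row_counts = []
    for offset in range(n_days):
        # assumption: walk backwards in time, one day per iteration
        day = (start - timedelta(days=offset)).strftime('%Y%m%d')
        raw = fetch_csv(day)
        df = pd.read_csv(io.StringIO(raw.decode('utf-8')))
        row_counts.append(len(df))
    return max(row_counts)


def fake_fetch(day):
    # Stand-in for the real request: the CSV for a given day
    # contains as many data rows as the day of the month.
    n = int(day[-2:])
    body = ''.join(f'{i},{i}\n' for i in range(n))
    return ('A,B\n' + body).encode('utf-8')


result = max_rows_over_days(fake_fetch, start=date(2017, 10, 24), n_days=100)
print(result)
```

With the real function, fetch_csv could be something like lambda day: send_request(season_slug, urls_ws, day).content, assuming send_request accepts the date in that format.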
Sample:
import pandas as pd
from datetime import datetime

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':[5,3,6,9,2,4],
                   'Date':pd.to_datetime(['2017-10-21','2017-10-21','2017-10-21',
                                          '2017-10-25','2017-10-25','2017-10-28'])})
print(df)
   A  B  C  D       Date  E
0  a  4  7  1 2017-10-21  5
1  b  5  8  3 2017-10-21  3
2  c  4  9  5 2017-10-21  6
3  d  5  4  7 2017-10-25  9
4  e  5  2  1 2017-10-25  2
5  f  4  3  0 2017-10-28  4
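# The output shown here comes from a run where today's date was
# on or after 2017-10-21 and before 2017-10-25.
date = datetime.today().strftime('%Y-%m-%d')
a = df.loc[df['Date'] > date, 'Date'].value_counts().iat[0]
print(a)
2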
Details:
print(df.loc[df['Date'] > date, 'Date'])
3   2017-10-25
4   2017-10-25
5   2017-10-28
Name: Date, dtype: datetime64[ns]
print(df.loc[df['Date'] > date, 'Date'].value_counts())
2017-10-25    2
2017-10-28    1
Name: Date, dtype: int64
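Because the sample depends on datetime.today(), it only reproduces the output above on certain days. The sketch below swaps in a fixed cutoff so the result is stable, and guards against the case where no row passes the filter, since iat[0] on an empty Series raises IndexError. The helper name largest_daily_count and the cutoff values are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2017-10-21', '2017-10-21', '2017-10-21',
                            '2017-10-25', '2017-10-25', '2017-10-28']),
})


def largest_daily_count(frame, cutoff):
    """Largest number of rows sharing one date strictly after cutoff (0 if none)."""
    # value_counts() sorts by count in descending order by default,
    # so the first entry is the largest count.
    counts = frame.loc[frame['Date'] > cutoff, 'Date'].value_counts()
    return int(counts.iat[0]) if not counts.empty else 0


# assumed fixed cutoff in place of datetime.today()
fixed_cutoff = pd.Timestamp('2017-10-24')
largest = largest_daily_count(df, fixed_cutoff)
print(largest)
```

Calling counts.max() instead of counts.iat[0] gives the same number and does not rely on the sort order.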