Finding the maximum number of rows

Date: 2017-10-23 12:02:27

Tags: python pandas

I have a send_request() function that pulls data in CSV format.

import io

import pandas as pd
from pull_data import send_request
from openpyxl import load_workbook
from datetime import datetime

def max_nb_rows(season_slug, urls_ws):
    date = datetime.today().strftime('%Y%m%d')
    for i in range(100):
        d = []
        response = send_request(season_slug, urls_ws, date).content
        df = pd.read_csv(io.StringIO(response.decode('utf-8')))

max_nb_rows('2015-2016-regular', 'cumulative-player-stats')

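I want to find the maximum number of rows of df over 100 consecutive days. How can I do this?

1 answer:

Answer 0: (score: 1)

I think you need to filter for dates after today first, then count the rows per date with value_counts. Its output is sorted in descending order, so the first value, selected with iat, is the largest count: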

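def max_nb_rows(season_slug, urls_ws):
    lens = []
    # ISO-format string so it can be compared with the 'Date' column
    date = datetime.today().strftime('%Y-%m-%d')
    for i in range(100):
        # note: date is not changed inside the loop, so every iteration requests the same day
        response = send_request(season_slug, urls_ws, date).content
        df = pd.read_csv(io.StringIO(response.decode('utf-8')))
        # count rows per date after today; assumes the CSV has a 'Date' column
        lens.append(df.loc[df['Date'] > date, 'Date'].value_counts().iat[0])
    return max(lens)

max_nb_rows('2015-2016-regular', 'cumulative-player-stats')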
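
If the goal is literally the largest len(df) across 100 different days, the date has to change on each iteration. The following is only a sketch under assumptions: fetch_csv is a hypothetical callable standing in for send_request (it takes a 'YYYYMMDD' string and returns CSV text), fake_fetch is made-up demo data, and walking backwards from the start date is a guess at what "100 consecutive days" means.

```python
import io
from datetime import date, timedelta

import pandas as pd


def max_rows_over_days(fetch_csv, start, n_days=100):
    """Return the largest row count among the CSVs fetched for n_days consecutive days.

    fetch_csv is a hypothetical callable: it takes a 'YYYYMMDD' string and
    returns the CSV text for that day (a stand-in for send_request).
    """
    row_counts = []
    for offset in range(n_days):
        day = start - timedelta(days=offset)  # walk backwards from start (assumption)
        csv_text = fetch_csv(day.strftime('%Y%m%d'))
        df = pd.read_csv(io.StringIO(csv_text))
        row_counts.append(len(df))
    return max(row_counts)


def fake_fetch(day_str):
    # Made-up demo data: the number of rows depends on the day of the month.
    n_rows = int(day_str[-2:]) % 5 + 1
    return 'player,points\n' + ''.join(f'p{i},{i}\n' for i in range(n_rows))


print(max_rows_over_days(fake_fetch, date(2017, 10, 23), n_days=10))
```

With the real function, something like lambda d: send_request(season_slug, urls_ws, d).content.decode('utf-8') could be passed as fetch_csv, assuming send_request accepts the date in that format.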

Sample:

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4,5,4,5,5,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'E':[5,3,6,9,2,4],
                   'Date':pd.to_datetime(['2017-10-21','2017-10-21','2017-10-21','2017-10-25','2017-10-25','2017-10-28'])})

print (df)
   A  B  C  D       Date  E
0  a  4  7  1 2017-10-21  5
1  b  5  8  3 2017-10-21  3
2  c  4  9  5 2017-10-21  6
3  d  5  4  7 2017-10-25  9
4  e  5  2  1 2017-10-25  2
5  f  4  3  0 2017-10-28  4
date = datetime.today().strftime('%Y-%m-%d')
a = df.loc[df['Date'] > date, 'Date'].value_counts().iat[0]
print (a)
2

Details:

print (df.loc[df['Date'] > date, 'Date'])
3   2017-10-25
4   2017-10-25
5   2017-10-28
Name: Date, dtype: datetime64[ns]

print (df.loc[df['Date'] > date, 'Date'].value_counts())
2017-10-25    2
2017-10-28    1
Name: Date, dtype: int64
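
Note that the sample output depends on the day it was run (the post is dated 2017-10-23), and .iat[0] raises an IndexError once no row passes the filter. A small sketch with a fixed cutoff and an emptiness check, where max_rows_per_date is a hypothetical helper name and returning 0 for an empty result is an assumption about the desired behavior:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2017-10-21', '2017-10-21', '2017-10-21',
                            '2017-10-25', '2017-10-25', '2017-10-28']),
})


def max_rows_per_date(frame, cutoff):
    """Largest number of rows sharing one Date strictly after cutoff (0 if none)."""
    counts = frame.loc[frame['Date'] > pd.Timestamp(cutoff), 'Date'].value_counts()
    # value_counts() sorts descending by default, so iat[0] is the maximum;
    # the emptiness check avoids an IndexError when nothing passes the filter.
    return int(counts.iat[0]) if not counts.empty else 0


print(max_rows_per_date(df, '2017-10-23'))  # 2
print(max_rows_per_date(df, '2017-11-01'))  # 0
```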