Combining nested list elements into data frames in Python

Date: 2019-11-10 01:26:14

Tags: python pandas list

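Warning: this question does require a non-standard Python package, nba_api. I have a list of 3 elements, and each element contains another list of 2 elements: a player data frame and a team data frame. What is the recommended way to get the following result: 1 combined player data frame and 1 combined team data frame? Coming from an R background, I would approach this by: 1. joining the player data frames with the team data frames into joined_list, then 2. row-binding the results into one data frame with do.call(rbind, joined_list). I realize this is probably very basic for many experienced Python users, but after a lot of searching here I'm having a hard time finding the right approach.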

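2 answers:

Answer 0 (score: 1)

After a bit more reading (and a clear head), I was able to fold the manual parts of my code into for loops, producing one list of player data and one list of team data. Then, using this post: Concatenate a list of pandas dataframes together, I was able to combine the player and team lists into their respective data frames.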

## output player frames
i=0
df_out=[]
df_players=[]
for i in range(len(temp)):
    df_out = temp[i].get_data_frames()
    df_players.append(df_out[0])         # index 0 will always contain player frame

df_players = pd.concat(df_players)
print(df_players)

## output team frames
i=0
df_out=[]
df_team=[]
for i in range(len(temp)):
    df_out = temp[i].get_data_frames()
    df_team.append(df_out[1])            # index 1 will always contain team frame

df_team = pd.concat(df_team)
print(df_team)
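
For readers coming from R, calling pd.concat on a list of frames plays roughly the role of do.call(rbind, ...). Below is a minimal sketch using small made-up frames; the column names and values are invented for illustration and are not real nba_api output.

```python
import pandas as pd

# Hypothetical stand-ins for the per-game player frames; real nba_api
# frames have many more columns.
game_1 = pd.DataFrame({"GAME_ID": ["g1", "g1"], "PLAYER": ["A", "B"]})
game_2 = pd.DataFrame({"GAME_ID": ["g2"], "PLAYER": ["C"]})

# Stack the frames row-wise. ignore_index=True renumbers the rows 0..n-1
# instead of repeating each frame's own index.
combined = pd.concat([game_1, game_2], ignore_index=True)
print(combined)
```

Without ignore_index=True the row labels from each input frame are kept, so the combined frame can contain repeated index values.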

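Answer 1 (score: 1)

First off, congratulations on sticking with it and finding a solution yourself! :D

Comments and tips

You can iterate over a list directly; there is no need for indices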

lst_1 = [1, 2, 3, 4]

for i in range(len(lst_1)):
    print(lst_1[i])

can be written as

lst_1 = [1, 2, 3, 4]

for item in lst_1:
    print(item)
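
If the position is needed as well as the item, the built-in enumerate covers that case without manual indexing. A small sketch (the list values here are arbitrary):

```python
lst_1 = [10, 20, 30, 40]

# enumerate yields (index, item) pairs, so no manual indexing is needed
pairs = []
for idx, item in enumerate(lst_1):
    pairs.append((idx, item))

print(pairs)
```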

List comprehensions and generator expressions are great

Bonus: note the changes I made to the variable names. For a general reference on Python style, see PEP 8.

gameids = ['0021900001','0021900002','0021900012']

headers1 = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}

# store player and team results for each gameids as elements of list temp
temp = list()
for i in range(len(gameids)):
    temp.append(boxscoreadvancedv2.BoxScoreAdvancedV2(game_id = gameids[i], headers=headers1))

can be written as

game_ids = ['0021900001','0021900002','0021900012']

api_headers = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}

api_results = [boxscoreadvancedv2.BoxScoreAdvancedV2(game_id=curr_game_id, headers=api_headers) for curr_game_id in game_ids]
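
One difference between the two forms is worth keeping in mind: a list comprehension builds the whole list up front and can be looped over repeatedly, while a generator expression produces items lazily and can be consumed only once. A minimal sketch with plain numbers instead of API calls:

```python
numbers = [1, 2, 3]

# List comprehension: builds the whole list immediately and is reusable.
squares_list = [n * n for n in numbers]

# Generator expression: computes items lazily and can be consumed only once.
squares_gen = (n * n for n in numbers)

first_pass = list(squares_gen)   # consumes the generator
second_pass = list(squares_gen)  # already exhausted, so this is empty

print(squares_list, first_pass, second_pass)
```

So if the API results need to be walked more than once, a list is the safer choice; a generator is fine when a single pass is enough.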

You are iterating over the same thing twice

# output player frames
i=0
df_out=[]
df_players=[]
for i in range(len(temp)):
    df_out = temp[i].get_data_frames()
    df_players.append(df_out[0])         # index 0 will always contain player frame

df_players = pd.concat(df_players)
print(df_players)

# output team frames
i=0
df_out=[]
df_team=[]
for i in range(len(temp)):
    df_out = temp[i].get_data_frames()
    df_team.append(df_out[1])            # index 1 will always contain team frame

df_team = pd.concat(df_team)
print(df_team)

Using the previous two tips, we arrive at the following:

players_lst = []
team_lst = []

for curr_res in api_results:
    curr_dfs = curr_res.get_data_frames()
    players_lst.append(curr_dfs[0])
    team_lst.append(curr_dfs[1])

players_df = pd.concat(players_lst)
team_df = pd.concat(team_lst)

My solution

Here it is, broken down a bit for clarity.

import pandas as pd
from nba_api.stats.endpoints.boxscoreadvancedv2 import BoxScoreAdvancedV2

game_ids = ['0021900001', '0021900002', '0021900012']

api_headers = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}

# generator of results from the API
api_results = (BoxScoreAdvancedV2(game_id=curr_game_id, headers=api_headers) for curr_game_id in game_ids)

# generator of lists of DataFrames from the API results
# think of it like: [[Player DF, Team DF], [Player DF, Team DF], ...]
api_res_dfs = (curr_res.get_data_frames() for curr_res in api_results)

# unpacking the size 2 lists of DataFrames into 2 flat lists
# [[Player DF, Team DF], [Player DF, Team DF], ...] -> [Player DF, Player DF, ...], [Team DF, Team DF, ...]
# see https://stackoverflow.com/q/2921847/11301900 for more on the use of the asterisk (*)
players_tupe, team_tupe = zip(*api_res_dfs)

# concatenating the various DataFrames, exactly the same as in your original code
players_df = pd.concat(players_tupe)
team_df = pd.concat(team_tupe)

print(players_df)
print(team_df)

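This relies on the fact that not only, as you pointed out, is the player DataFrame always first in the list and the team DataFrame always second, but also that these are the only two items in each result list.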
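
To see the zip(*...) step in isolation, here is a small sketch with made-up stand-in frames (the data is invented, not real API output). It also shows what happens when a result list has more than two items: unpacking into two names raises a ValueError.

```python
import pandas as pd

# Hypothetical stand-ins for what get_data_frames() returns per game:
# a [player frame, team frame] pair. Values are invented.
per_game = [
    [pd.DataFrame({"PLAYER": ["A"]}), pd.DataFrame({"TEAM": ["X"]})],
    [pd.DataFrame({"PLAYER": ["B"]}), pd.DataFrame({"TEAM": ["Y"]})],
]

# zip(*per_game) regroups the pairs by position:
# all first items together, all second items together.
player_frames, team_frames = zip(*per_game)

players_df = pd.concat(player_frames, ignore_index=True)
team_df = pd.concat(team_frames, ignore_index=True)

# With three frames per game, unpacking into two names fails.
three_per_game = [pair + [pd.DataFrame()] for pair in per_game]
try:
    a, b = zip(*three_per_game)
    unpack_failed = False
except ValueError:
    unpack_failed = True

print(players_df, team_df, unpack_failed, sep="\n")
```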


Let me know if you have any questions :)