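I have two folders, each containing multiple JSON files.
The first folder is /Users/aus10/Desktop/MLB_Data/Clean_Team_Data
The second folder is /Users/aus10/Desktop/MLB_Data/Slate_Logs
The first folder contains 30 JSON files that look like this: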
{
"Team": "ARI",
"Games": [
{
"Date": "2019-03-28",
"Opponent": "@ LA Dodgers",
"Results": "L",
"Score": "12-5",
"Line": 150,
"Over_Under": "O",
"Total": 7,
"Opponent_Score": 12,
"Team_Score": 5,
"Total_Score": 17,
"Home_Away": "A",
"players": []
},
{
"Date": "2019-03-29",
"Opponent": "@ LA Dodgers",
"Results": "W",
"Score": "5-4",
"Line": 155,
"Over_Under": "O",
"Total": 7,
"Team_Score": 5,
"Opponent_Score": 4,
"Total_Score": 9,
"Home_Away": "A",
"players": []
}]
}
The second folder contains 218 JSON files that look like this:
[
{
"StatID": 2593242,
"TeamID": 4,
"PlayerID": 10002075,
"SeasonType": 1,
"Season": 2019,
"Name": "Colin Moran",
"Team": "PIT",
"Position": "3B",
"PositionCategory": "IF",
"Started": 1,
"InjuryStatus": null,
"GameID": 54207,
"OpponentID": 31,
"Opponent": "STL",
"Day": "2019-04-01T00:00:00",
"DateTime": "2019-04-01T13:05:00",
"HomeOrAway": "HOME",
"Games": 1,
"FantasyPoints": 12,
"AtBats": 3,
"Runs": 1,
"Hits": 2,
"Singles": 0,
"Doubles": 1,
"Triples": 0,
"HomeRuns": 1,
"RunsBattedIn": 3,
"BattingAverage": 0.667,
"Outs": 1,
"Strikeouts": 0,
"Walks": 2,
"HitByPitch": 0,
"Sacrifices": 0,
"SacrificeFlies": 0,
"GroundIntoDoublePlay": 0,
"StolenBases": 0,
"CaughtStealing": 0,
"OnBasePercentage": 0.8,
"SluggingPercentage": 2,
"OnBasePlusSlugging": 2.8,
"Wins": 0,
"Losses": 0,
"Saves": 0,
"InningsPitchedDecimal": 0,
"TotalOutsPitched": 0,
"InningsPitchedFull": 0,
"InningsPitchedOuts": 0,
"EarnedRunAverage": 0,
"PitchingHits": 0,
"PitchingRuns": 0,
"PitchingEarnedRuns": 0,
"PitchingWalks": 0,
"PitchingStrikeouts": 0,
"PitchingHomeRuns": 0,
"PitchesThrown": 0,
"PitchesThrownStrikes": 0,
"WalksHitsPerInningsPitched": 0,
"PitchingBattingAverageAgainst": 0,
"FantasyPointsFanDuel": 37.7,
"FantasyPointsDraftKings": 27,
"WeightedOnBasePercentage": 0.8,
"PitchingCompleteGames": 0,
"PitchingShutOuts": 0,
"PitchingOnBasePercentage": 0,
"PitchingSluggingPercentage": 0,
"PitchingOnBasePlusSlugging": 0,
"PitchingStrikeoutsPerNineInnings": 0,
"PitchingWalksPerNineInnings": 0,
"PitchingWeightedOnBasePercentage": 0
}]
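I need to loop through every file in the first folder. If the Date and Team of a game object match the Day and Team of any dict in any file in the second folder, I want to append that dict to the players list of that game object, and so on until every file in the first folder has been processed. I used a nested for loop, but it only matches one date, 2019-08-18, and I'm not sure why. I know this isn't the most efficient solution, so feel free to suggest a more efficient one.
Here is the code: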
import json
import pandas as pd
import os
path_to_json = '/Users/aus10/Desktop/MLB_Data/Clean_Team_Data'
Game_logs_json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
path_to_json = '/Users/aus10/Desktop/MLB_Data/Slate_Logs'
FPTS_json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
for file in Game_logs_json_files:
    for file_1 in FPTS_json_files:
        with open('/Users/aus10/Desktop/MLB_Data/Clean_Team_Data/'+file+'') as json_file:
            team_data = json.load(json_file)
        with open('/Users/aus10/Desktop/MLB_Data/Slate_Logs/'+file_1+'') as json_file:
            fantasy_data = json.load(json_file)
        for obj in team_data['Games']:
            for player in fantasy_data:
                if player['Day'].split('T')[0] == obj['Date'] and player['Team'] == team_data['Team']:
                    obj['players'].append(player)
        with open('/Users/aus10/Desktop/MLB_Data/Game_Logs_With_Player_Data/'+file+'', 'w') as my_file:
            json.dump(team_data, my_file)
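Answer 0 (score: 1)
Note: I have not handled the date format conversion needed for the date comparison (I assume you will adapt the code accordingly).
This is just an efficient way to approach the problem.
Create a dictionary: dict_players = {}
Loop over all the files containing player data. For each player in each file, do the following: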
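for player in players:
    # date is the player's 'Day' reduced to YYYY-MM-DD; team_name is the player's 'Team'
    k = date + '%' + team_name
    # dict.has_key() only exists in Python 2; the `in` operator works in both 2 and 3
    if k in dict_players:
        dict_players[k].append(player)
    else:
        dict_players[k] = [player]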
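Now dict_players maps each date and team combination (date + '%' + team_name) to the list of players for that team on that date. This is exactly what we need when going through the team data files.
So now we go through the game data. For each game's team and date combination, the list of players is already in the dictionary (dict_players), and all we need to do is look it up.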
for game in games:
    # The key in the team files is 'Date'; .get() returns [] when no player matched this game
    game['players'] = dict_players.get(game['Date'] + '%' + team, [])
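With this approach you only need to go through each file once. Compared with the nested loops, which compare every game against every player in every file, this saves a lot of time.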
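Putting the two steps together, a minimal sketch might look like the following. The function names build_player_index, attach_players and merge_folders are made up for illustration, and the code assumes the key names shown in the question (Day, Team, Date, Games, players). A (date, team) tuple is used as the dictionary key in place of the date + '%' + team_name string; the idea is the same.

```python
import json
import os
from collections import defaultdict


def build_player_index(slate_dir):
    """Map (date, team) -> list of player dicts, reading each slate file once."""
    index = defaultdict(list)
    for name in sorted(os.listdir(slate_dir)):
        if not name.endswith('.json'):
            continue
        with open(os.path.join(slate_dir, name)) as f:
            for player in json.load(f):
                # 'Day' looks like '2019-04-01T00:00:00'; keep only the date part
                date = player['Day'].split('T')[0]
                index[(date, player['Team'])].append(player)
    return index


def attach_players(team_data, index):
    """Fill each game's 'players' list from the index; no match gives an empty list."""
    for game in team_data['Games']:
        key = (game['Date'], team_data['Team'])
        game['players'] = list(index.get(key, []))
    return team_data


def merge_folders(team_dir, slate_dir, out_dir):
    """Read every slate file once, then every team file once, and write the merged result."""
    index = build_player_index(slate_dir)
    os.makedirs(out_dir, exist_ok=True)
    for name in sorted(os.listdir(team_dir)):
        if not name.endswith('.json'):
            continue
        with open(os.path.join(team_dir, name)) as f:
            team_data = json.load(f)
        with open(os.path.join(out_dir, name), 'w') as f:
            json.dump(attach_players(team_data, index), f)
```

With the folders from the question, this would presumably be called as merge_folders('/Users/aus10/Desktop/MLB_Data/Clean_Team_Data', '/Users/aus10/Desktop/MLB_Data/Slate_Logs', '/Users/aus10/Desktop/MLB_Data/Game_Logs_With_Player_Data').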
Answer 1 (score: 1)
Try this:
import json
import pandas as pd
import os
path_to_json = '/Users/aus10/Desktop/MLB_Data/Clean_Team_Data'
Game_logs_json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
path_to_json = '/Users/aus10/Desktop/MLB_Data/Slate_Logs'
FPTS_json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]
for file in Game_logs_json_files:
    # Load the team file once, before looping over the slate files
    with open('/Users/aus10/Desktop/MLB_Data/Clean_Team_Data/'+file+'') as json_file:
        team_data = json.load(json_file)
    for file_1 in FPTS_json_files:
        with open('/Users/aus10/Desktop/MLB_Data/Slate_Logs/'+file_1+'') as json_file:
            fantasy_data = json.load(json_file)
        for obj in team_data['Games']:
            for player in fantasy_data:
                if player['Day'].split('T')[0] == obj['Date'] and player['Team'] == team_data['Team']:
                    obj['players'].append(player)
    # Write once per team, after every slate file has been checked
    with open('/Users/aus10/Desktop/MLB_Data/Game_Logs_With_Player_Data/'+file+'', 'w') as my_file:
        json.dump(team_data, my_file)
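The difference from the question's code is where the team file is loaded and written. Assuming the question's version re-read the team file on every pass of the inner loop, each reload would start again from empty players lists, so only the matches from the last slate file processed would reach the output, which would explain seeing a single date. A toy in-memory illustration of that effect (the sample data and helper names are invented, and copy.deepcopy stands in for re-reading the team file from disk):

```python
import copy

# Stand-in for one team file on disk
TEAM_FILE = {'Team': 'PIT', 'Games': [
    {'Date': '2019-04-01', 'players': []},
    {'Date': '2019-08-18', 'players': []},
]}
# Stand-ins for two slate files, processed in this order
SLATES = [
    [{'Name': 'A', 'Team': 'PIT', 'Day': '2019-04-01T00:00:00'}],
    [{'Name': 'B', 'Team': 'PIT', 'Day': '2019-08-18T00:00:00'}],
]


def attach(team_data, slate):
    """Same matching rule as the question: equal date and equal team."""
    for game in team_data['Games']:
        for player in slate:
            if player['Day'].split('T')[0] == game['Date'] and player['Team'] == team_data['Team']:
                game['players'].append(player)


def reload_every_pass():
    for slate in SLATES:
        team_data = copy.deepcopy(TEAM_FILE)  # fresh copy: earlier matches are discarded
        attach(team_data, slate)
    return team_data


def load_once():
    team_data = copy.deepcopy(TEAM_FILE)  # loaded a single time, matches accumulate
    for slate in SLATES:
        attach(team_data, slate)
    return team_data


def matched_dates(team_data):
    return [g['Date'] for g in team_data['Games'] if g['players']]


print(matched_dates(reload_every_pass()))  # ['2019-08-18']
print(matched_dates(load_once()))          # ['2019-04-01', '2019-08-18']
```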