Question

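I am trying to pull stats tables from NHL.com and convert them to CSV for later use in Excel. I can pull the tables, but I am having trouble converting them to CSV. I have found many questions about converting JSON to CSV, but none of the solutions worked for me. Some of them use pandas, which for some reason keeps giving me a traceback error. Below is my code up to the point before the CSV conversion.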

import requests
import lxml.html
from pprint import pprint 
from sys import exit
import json
import csv
import datetime 
import dateutil.relativedelta


now = datetime.datetime.now()
one_month_ago = now + dateutil.relativedelta.relativedelta(months=-15)

today_date = now.strftime('%Y-%m-%d')
one_month_ago_date = one_month_ago.strftime('%Y-%m-%d')

url = 'http://www.nhl.com/stats/rest/individual/skaters/basic/game/skatersummary?cayenneExp=gameDate%3E=%22'+one_month_ago_date+'T04:00:00.000Z%22%20and%20gameDate%3C=%22'+today_date+'T03:59:59.999Z%22%20and%20gameLocationCode=%22H%22%20and%20gameTypeId=%222%22&factCayenneExp=shots%3E=1&sort=[{%22property%22:%22points%22,%22direction%22:%22DESC%22},{%22property%22:%22goals%22,%22direction%22:%22DESC%22},{%22property%22:%22assists%22,%22direction%22:%22DESC%22}]'
resp = requests.get(url).text
resp = json.loads(resp)

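Any help is greatly appreciated!

Edit: Some of the CSV conversion methods I tried include the top-rated answers from "How can I convert JSON to CSV?". I had trouble pasting and formatting them here, so I am just pointing to that question.

Here is the output when I try to use pandas.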

Traceback (most recent call last):
File "NHL Data Scrape.py", line 1, in <module>
from pandas.io.json import json_normalize

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\__init__.py", line 13, in <module>
__import__(dependency)

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\__init__.py", line 142, in <module>
from . import add_newdocs

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\add_newdocs.py", line 13, in <module>
from numpy.lib import add_newdoc

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\__init__.py", line 8, in <module>
from .type_check import *

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\lib\type_check.py", line 11, in <module>
import numpy.core.numeric as _nx

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\__init__.py", line 35, in <module>
from . import _internal  # for freeze programs

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\_internal.py", line 18, in <module>
from .numerictypes import object_

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\numerictypes.py", line 962, in <module>
_register_types()

File "C:\Users\Brett\AppData\Local\Programs\Python\Python36\lib\site-packages\numpy\core\numerictypes.py", line 958, in _register_types
    numbers.Integral.register(integer)

AttributeError: module 'numbers' has no attribute 'Integral'


------------------
(program exited with code: 1)

Press any key to continue . . .
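
A note on the traceback above: `Integral` is defined in Python's standard-library `numbers` module, so this error means NumPy imported some other module called `numbers`. A common cause, though nothing in the question confirms it, is a file named `numbers.py` in the script's folder shadowing the standard library. A quick check, as a sketch:

```python
import numbers

# Path of the module that was actually imported. If it points into the
# script's own folder rather than Python's lib directory, a local
# numbers.py is shadowing the standard library.
print(getattr(numbers, "__file__", None))

# True for the real standard-library module.
print(hasattr(numbers, "Integral"))
```

If the path does point at a local file, renaming that file (and removing any stale `__pycache__` entry for it) should let pandas import normally.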

Answer 1

You can use json_normalize() from pandas.io.json, for example:

In []:
from pandas.io.json import json_normalize

...
resp = requests.get(url).json()
json_normalize(resp, 'data')

Out[]:
     assists  faceoffWinPctg  gameWinningGoals  gamesPlayed  goals  otGoals   ...
0         31          0.0967                 2           41     20        1   ...
1         27          0.0000                 3           38     22        0   ...
2         35          0.5249                 4           41     14        2   ...
3         34          0.4866                 3           41     14        1   ...
...
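
The call above only builds a DataFrame; to get the CSV the question asks for, it still has to be written out. Here is a minimal sketch, assuming the response has the shape `{'data': [{...}, ...]}` as in this answer. The two sample records and the file name `nhl_players.csv` are made up for illustration so the snippet runs without a network call. `pd.json_normalize` is the top-level spelling available since pandas 1.0; on older versions the `pandas.io.json` import shown above does the same job.

```python
import pandas as pd

# Hypothetical stand-in for requests.get(url).json()
resp = {
    "data": [
        {"playerName": "Player A", "goals": 20, "assists": 31},
        {"playerName": "Player B", "goals": 22, "assists": 27},
    ]
}

# Flatten the list under the 'data' key into one row per player.
df = pd.json_normalize(resp, "data")

# index=False leaves the DataFrame's row index out of the file.
df.to_csv("nhl_players.csv", index=False)
```

The resulting file opens directly in Excel, with one column per key in the player records.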

Answer 2

You can use Python's built-in csv.DictWriter:

resp = requests.get(url).json()  # parse the response body as JSON

# resp['data'] is a list of dicts, one per player.
# The keys of the first dict are used as the CSV header.
# newline='' prevents blank lines between rows on Windows.
with open('nhl_players.csv', 'w', newline='') as f:
    w = csv.DictWriter(f, resp['data'][0].keys())
    w.writeheader()
    w.writerows(resp['data'])

The output CSV file is here: https://www.dropbox.com/s/1mmprenx0eniflg/nhl_players.csv?dl=0
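
One caveat: `DictWriter` raises `ValueError` when a row contains a key that is not in the header, so taking the header from the first record only works if every record has the same keys. Here is a minimal sketch that collects the header from all records instead. The sample records and the file name `players.csv` are made up for illustration, and whether the NHL response actually contains uneven records is not something this thread establishes.

```python
import csv

# Hypothetical stand-in for resp['data']; the second record has an extra key.
records = [
    {"playerName": "Player A", "goals": 20},
    {"playerName": "Player B", "goals": 22, "otGoals": 1},
]

# Collect every key once, in first-seen order, to use as the header.
fieldnames = []
for record in records:
    for key in record:
        if key not in fieldnames:
            fieldnames.append(key)

# newline='' lets the csv module control line endings itself.
with open("players.csv", "w", newline="", encoding="utf-8") as f:
    # restval fills in the cell for any key a record is missing.
    writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
```

Records that lack a column get an empty cell, so the rows stay aligned when the file is opened in Excel.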

Hope this helps.

Converting JSON data to CSV using Python
