Pandas DataFrame: string indices must be integers

Time: 2020-06-19 08:58:14

Tags: python pandas dataframe

I have the following JSON file:

    {"Global Quote": {"01. symbol": "MSFT", "02. open": "194.0000", "03. high": "196.4900", "04. low": "194.0000", "05. price": "196.3200", "06. volume": "22966814", "07. latest trading day": "2020-06-18", "08. previous close": "194.2400", "09. change": "2.0800", "10. change percent": "1.0708%"}}

The data I need is a dictionary nested inside another dictionary, which is why I do the following:

f = open('data.json',)
meta = json.load(f)
data = meta['Global Quote']

The result is as follows:

{"01. symbol": "MSFT", "02. open": "194.0000", "03. high": "196.4900", "04. low": "194.0000", "05. price": "196.3200", "06. volume": "22966814", "07. latest trading day": "2020-06-18", "08. previous close": "194.2400", "09. change": "2.0800", "10. change percent": "1.0708%"}

Unfortunately, it cannot be converted directly to CSV, because the following error is shown:

df = pd.read_json('data.json')
df.to_csv('datatest.csv', encoding='utf-8-sig')
ValueError: If using all scalar values, you must pass an index 

However, if I try to build it row by row through a DataFrame instead, I get the following error:

response = requests.request("GET", url, headers=headers, params=querystring)

info = response.json()

with open('data.json', 'w') as fp:
    json.dump(info, fp)


f = open('data.json',)

meta = json.load(f)
data = meta['Global Quote']

df=pd.DataFrame(columns=['symbol','open','high','low','price','volume','latest trading day','previous close','change','change percent'])
for d,p in data.items():
    data_row = [float(p['1. symbol']),float(p['2. open']),float(p['3. high']),float(p['4. low']),float(p['5. price']),int(p['6. volume']),float(p['7. latest trading day']),int(p['8. previous close']),float(p['9. change']),int(p['10. change percent'])]
df =df.sort_values('symbol')

print(df)
df.to_csv('testfile.csv')
TypeError: string indices must be integers

Is there a better way to write the file to CSV? Many thanks.

1 answer:

Answer 0 (score: 1)

You don't have to iterate over all the data items.
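As a minimal sketch of why the loop in the question fails: `data` is a flat dictionary, so iterating over `data.items()` yields key/value pairs whose values are plain strings, and indexing a string with another string raises the `TypeError`. The payload below is a shortened copy of the question's JSON, and the variable names `value_types`, `error_message` and `price` are only illustrative.

```python
# Shortened copy of the payload from the question (three fields only).
meta = {"Global Quote": {"01. symbol": "MSFT",
                         "02. open": "194.0000",
                         "05. price": "196.3200"}}
data = meta["Global Quote"]

# Each value in the flat dict is a plain string, not a nested dict.
value_types = {type(p).__name__ for _, p in data.items()}

error_message = None
try:
    for key, p in data.items():
        p["01. symbol"]  # indexing a str with a str key
except TypeError as exc:
    error_message = str(exc)

# The fields can be read from the dict directly, without a loop.
price = float(data["05. price"])

print(value_types)
print(error_message)
print(price)
```

Note also that the keys in the file are zero-padded (`'01. symbol'`), while the loop in the question looks up `'1. symbol'`, so those lookups would fail even on a dictionary.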

Try passing just an index when creating the DataFrame:

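import json
import pandas as pd

with open('data.json') as f:
    meta = json.load(f)

data = meta['Global Quote']
# All values are scalars, so pandas needs an explicit index for the single row.
df = pd.DataFrame(data, index=[0])
df.to_csv('testfile.csv', index=False)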
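The following sketch illustrates where the `ValueError` from the question comes from and shows one possible alternative to `index=[0]`: wrapping the dictionary in a list so that pandas treats it as a single record. The payload is shortened, and the names `scalar_error`, `df_indexed` and `df_wrapped` are made up for the illustration.

```python
import pandas as pd

# Shortened copy of the inner dictionary from the question.
data = {"01. symbol": "MSFT", "02. open": "194.0000", "05. price": "196.3200"}

# A dict of scalars gives pandas no way to infer the number of rows.
scalar_error = None
try:
    pd.DataFrame(data)
except ValueError as exc:
    scalar_error = str(exc)

# Option 1 (from the answer): supply the index explicitly.
df_indexed = pd.DataFrame(data, index=[0])

# Option 2 (an alternative, not from the answer): a list holding one record.
df_wrapped = pd.DataFrame([data])

print(scalar_error)
print(df_indexed)
print(df_wrapped)
```

Both options should produce one row with the dictionary keys as column names.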

If your JSON holds multiple records, for example:

{"Global Quote": [{"01. symbol": "MSFT", "02. open": "194.0000", "03. high": "196.4900", "04. low": "194.0000", "05. price": "196.3200", "06. volume": "22966814", "07. latest trading day": "2020-06-18", "08. previous close": "194.2400", "09. change": "2.0800", "10. change percent": "1.0708%"},
                  {"01. symbol": "IBM", "02. open": "228.0000", "03. high": "196.4900", "04. low": "194.0000", "05. price": "196.3200", "06. volume": "22966814", "07. latest trading day": "2020-06-28", "08. previous close": "194.2400", "09. change": "2.0800", "10. change percent": "1.0708%"}]}

You can do this:

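import json
import pandas as pd

with open('data.json') as f:
    meta = json.load(f)

data = meta['Global Quote']

# Build a one-row frame per record, then concatenate them.
# (DataFrame.append was removed in pandas 2.0; pd.concat replaces it.)
frames = [pd.DataFrame(d, index=[0]) for d in data]
newdf = pd.concat(frames, ignore_index=True)
newdf.to_csv('testfile.csv', index=False)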

The output CSV will look like this:

01. symbol,02. open,03. high,04. low,05. price,06. volume,07. latest trading day,08. previous close,09. change,10. change percent
MSFT,194.0000,196.4900,194.0000,196.3200,22966814,2020-06-18,194.2400,2.0800,1.0708%
IBM,228.0000,196.4900,194.0000,196.3200,22966814,2020-06-28,194.2400,2.0800,1.0708%
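Every value in that CSV is still a string, whereas the question tried to end up with plain column names and numeric types. A rough sketch of one way to get there is shown below. It assumes the records are already loaded as a list of dictionaries (shortened here to five fields), passes that list straight to `pd.DataFrame`, and strips the `NN. ` prefix from the column names. The choice of target types is an assumption, not something stated in the thread.

```python
import pandas as pd

# Shortened copies of the two records from the multi-record JSON above.
records = [
    {"01. symbol": "MSFT", "02. open": "194.0000", "06. volume": "22966814",
     "07. latest trading day": "2020-06-18", "10. change percent": "1.0708%"},
    {"01. symbol": "IBM", "02. open": "228.0000", "06. volume": "22966814",
     "07. latest trading day": "2020-06-28", "10. change percent": "1.0708%"},
]

# A list of dicts maps to one row per dict, so no loop is needed.
df = pd.DataFrame(records)

# Drop the "01. " style prefix from each column name.
df.columns = [name.split(". ", 1)[1] for name in df.columns]

# Convert the string fields to numeric and date types.
df["open"] = pd.to_numeric(df["open"])
df["volume"] = pd.to_numeric(df["volume"])
df["latest trading day"] = pd.to_datetime(df["latest trading day"])
df["change percent"] = df["change percent"].str.rstrip("%").astype(float)

df = df.sort_values("symbol")
print(df)
```

The symbol and date columns are deliberately not passed through `float()`, since values such as `MSFT` or `2020-06-18` cannot be parsed as numbers.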