Question

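I'm new to Python. I'm writing a script to pull some data from a website and plot it. However, my code errors out, saying the data type is incorrect. Specifically, I have decimal values for 'value' and dates for 'year'. I tried to redefine them, but I think I put the definitions in the wrong place. Any help would be greatly appreciated; the code is below.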

import numpy as np
import pandas as pd
import json
import matplotlib.pyplot as mp
from IPython.display import HTML
import getpass
import requests

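def frame(url, height=400, width=100):
    # Build the iframe markup; the literal is split into two adjacent
    # strings because a single-quoted string cannot span lines.
    display_string = ('<iframe src={url} width={w} height={h}>'
                      '</iframe>').format(url=url, w=width, h=height)
    return HTML(display_string)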

frame('https://data.bls.gov/registrationEngine/')
registration_key = getpass.getpass('Enter Registration Key: ')

series = 'MPU4900012'

frame('https://api.bls.gov/publicAPI/v1/timeseries/data/')

def capture_series(series, start, end, key=registration_key):
    url = 'https://api.bls.gov/publicAPI/v2/timeseries/data/'
    url += '?registrationkey={key}'.format(key=key)

    data = json.dumps({
        "seriesid": [series],
        "startyear": str(start),
        "endyear": str(end)
    })

    headers = {
        "Content-type": "application/json"
    }

    result = requests.post(url, data=data, headers=headers)
    return json.loads(result.text)

json_data = capture_series(series, 1987, 2016)
json_data

df_data = pd.DataFrame(json_data['Results']['series'][0]['data'])
print(df_data)

df_sub = df_data[['value', 'year']].astype(int)
df_sub.set_index('year', inplace=True)
df_sub.sort_index(inplace=True)
df_sub

x = df_sub.index
y = df_sub['value']

mp.plot(x,y)
mp.title('Major Sector Multifactor Productivity')
mp.xlabel('years')
mp.ylabel('values')
mp.show()

When I run the code, I first get this table, which is the data from the site.

  footnotes period periodName   value  year
0      [{}]    A01     Annual  86.244  1996
1      [{}]    A01     Annual  84.713  1995
2      [{}]    A01     Annual  85.141  1994
3      [{}]    A01     Annual  84.688  1993
4      [{}]    A01     Annual  85.037  1992
5      [{}]    A01     Annual  82.280  1991
6      [{}]    A01     Annual  82.625  1990
7      [{}]    A01     Annual  81.965  1989
8      [{}]    A01     Annual  81.587  1988
9      [{}]    A01     Annual  80.816  1987

The error log shows this (using Jupyter with Python 3, for reference):

ValueError Traceback (most recent call last)
<ipython-input-101-8ee6d83ca777> in <module>()
     41 print(df_data)
     42 
---> 43 df_sub = df_data[['value', 'year']].astype(int)
     44 df_sub.set_index('year', inplace=True)
     45 df_sub.sort_index(inplace=True)

     ...

     ValueError: invalid literal for int() with base 10: '86.244'

Answer 1

OK, I played around with your example.

I think the columns are of type str. This means you need to use .astype(float) first, because int() cannot parse a string with a decimal point such as '84.713' directly.

See below:

>>> data = {'value': {0: '84.713', 1: '85.141', 2: '84.688', 3: '85.037', 4: '82.280',
...                   5: '82.625', 6: '81.965', 7: '81.587', 8: '80.816'},
...         'year': {0: '1995', 1: '1994', 2: '1993', 3: '1992', 4: '1991',
...                  5: '1990', 6: '1989', 7: '1988', 8: '1987'}}
>>> df = pd.DataFrame(data)
>>> df
    value  year
0  84.713  1995
1  85.141  1994
2  84.688  1993
3  85.037  1992
4  82.280  1991
5  82.625  1990
6  81.965  1989
7  81.587  1988
8  80.816  1987
>>> df['value'].astype(int)  # <- replicating error
Traceback (most recent call last):
ValueError: invalid literal for int() with base 10: '84.713'
>>> df['value'].astype(float).astype(int)  # <= HERE
0    84
1    85
2    84
3    85
4    82
5    82
6    81
7    81
8    80
Name: value, dtype: int32

So use:

df[['value', 'year']].astype(float).astype(int)
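
One caveat worth noting: casting 'value' to int drops the decimals (86.244 becomes 86), which may not be what you want on a plot. Here is a minimal sketch of a per-column conversion, assuming you would rather keep 'value' as a float and only make 'year' an integer. The small `raw` frame is sample data copied by hand from the table in the question, not a live API response, and using `pd.to_numeric` here is my suggestion rather than something the question requires:

```python
import pandas as pd

# Hand-copied sample rows (assumption: the real frame looks like this).
# The values are strings, which is why astype(int) fails on '86.244'.
raw = pd.DataFrame({
    "value": ["86.244", "84.713", "85.141"],
    "year": ["1996", "1995", "1994"],
})

# Convert each column to the type it actually needs.
df_sub = raw[["value", "year"]].copy()
df_sub["value"] = pd.to_numeric(df_sub["value"])  # floats, decimals kept
df_sub["year"] = df_sub["year"].astype(int)       # whole years

# Index by year and sort ascending so a line plot runs left to right.
df_sub = df_sub.set_index("year").sort_index()
print(df_sub)
```

From there, `df_sub.index` and `df_sub['value']` can be passed to the plotting calls as in the question.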
