Unpacking a numpy ndarray with np.loadtxt

Date: 2018-05-22 02:12:36

Tags: python matplotlib numpy-ndarray

I am new to Python. Any help would be greatly appreciated.

I want to show this graph using the first code block below, but when I try to run this line:

date, value = np.loadtxt(revenue_ar, delimiter=',', unpack=True, converters={ 0: bytespdate2num('%Y-%m-%d')})

with revenue_ar (a numpy.ndarray), this error message pops up:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

First code block:

import time
import requests
import intrinio
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

api_username = 'hidden'
api_password = 'hidden'

def bytespdate2num(fmt, encoding='utf-8'):
    strconverter = mdates.strpdate2num(fmt)
    def bytesconverter(b):
        s = b.decode(encoding)
        return strconverter(s)
    return bytesconverter

ticker = 'AAPL' 
revenue_data = requests.get('https://api.intrinio.com/historical_data?identifier=' + ticker + '&item=totalrevenue', auth=(api_username, api_password))
revenue1 = revenue_data.json()['data'] 
revenue = pd.DataFrame(revenue1) 
revenue_ar = revenue.values

date, value = np.loadtxt(revenue_ar, delimiter=',', unpack=True,
                                   converters={ 0: bytespdate2num('%Y-%m-%d')})

fig = plt.figure()
ax1 = plt.subplot2grid((6,4), (0,0), rowspan=6, colspan=4)
ax1.plot(date,value)
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
plt.show()

However, the same call does seem to work with revenue.txt:

date, value = np.loadtxt('revenue.txt', delimiter='\t', unpack=True,
                                   converters={0: bytespdate2num('%Y-%m-%d')})

Let me know if I need to clarify my question further. Thanks in advance.

revenue1:

[{'date': '2018-03-31', 'value': 247417000000.0},
{'date': '2017-12-30', 'value': 239176000000.0},
{'date': '2017-09-30', 'value': 229234000000.0},
{'date': '2017-07-01', 'value': 223507000000.0},
{'date': '2017-04-01', 'value': 220457000000.0},
{'date': '2016-12-31', 'value': 218118000000.0},
{'date': '2016-09-24', 'value': 215639000000.0},
{'date': '2016-06-25', 'value': 220288000000.0},
{'date': '2016-03-26', 'value': 227535000000.0},
{'date': '2015-12-26', 'value': 234988000000.0},
{'date': '2015-09-26', 'value': 233715000000.0},
{'date': '2015-06-27', 'value': 224337000000.0},
{'date': '2015-03-28', 'value': 212164000000.0},

revenue_ar:

array([['2018-03-31', 247417000000.0],
       ['2017-12-30', 239176000000.0],
       ['2017-09-30', 229234000000.0],
       ['2017-07-01', 223507000000.0],
       ['2017-04-01', 220457000000.0],
       ['2016-12-31', 218118000000.0],
       ['2016-09-24', 215639000000.0],
       ['2016-06-25', 220288000000.0],
       ['2016-03-26', 227535000000.0],
       ['2015-12-26', 234988000000.0],
       ['2015-09-26', 233715000000.0],

revenue.txt:

2007-09-29  2.457800e+10
2008-09-27  3.749100e+10
2009-09-26  4.290500e+10
2009-12-26  4.670800e+10
2010-03-27  5.112300e+10
2010-06-26  5.708900e+10
2010-09-25  6.522500e+10
2010-12-25  7.628300e+10
2011-03-26  8.745100e+10
2011-06-25  1.003220e+11
2011-09-24  1.082490e+11

This is the solution you suggested. It is great because it runs smoothly.

import time
import urllib.request
from urllib.request import urlopen
import requests
import intrinio
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.dates as mdates
import datetime

api_username = 'hidden'
api_password = 'hidden'

def grab_intrinio(ticker):
    try:
        revenue_data = requests.get('https://api.intrinio.com/historical_data?identifier=' + ticker + '&item=totalrevenue', auth=(api_username, api_password))
        revenue1 = revenue_data.json()['data'] 
        revenue = pd.DataFrame(revenue1)
        revenue['date'] = pd.to_datetime(revenue['date'])

        plt.plot(revenue['date'], revenue['value'])

    except Exception as e:
        print('failed in the main loop',str(e))
        pass

grab_intrinio('AAPL')

This produces the following output:

revenue graph

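**I still have two things to deal with. First, I want to plot two more variables (net_income and roe).

Second, my roe data contains the value nm, which cannot be converted to float or integer.

How do I solve this?**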

As the final output, I would like to show a graph like this (I can handle the plot and configuration details myself):

final graph

I tried the following lines, but they fail with the error 'list' object has no attribute 'plot'.

fig = plt.figure()

    ax1 = plt.plot(net_income['date'], net_income['value'])
    ax1.plot(net_income['date'], net_income['value'])

    ax2 = plt.plot(revenue['date'], revenue['value'])
    ax2.plot(revenue['date'], revenue['value'])

This version draws net_income and revenue in the same plot:

plt.plot(net_income['date'], net_income['value'])
plt.plot(revenue['date'], revenue['value'])

Here is the code for net_income and roe (same format as revenue):

net_income_data = requests.get('https://api.intrinio.com/historical_data?identifier=' + ticker + '&item=netincome', auth=(api_username, api_password))
net_income1 = net_income_data.json()['data']
net_income = pd.DataFrame(net_income1)
net_income['date'] = pd.to_datetime(net_income['date'])        

roe_data = requests.get('https://api.intrinio.com/historical_data?identifier=' + ticker + '&item=roe', auth=(api_username, api_password))
roe1 = roe_data.json()['data']
roe = pd.DataFrame(roe1)
roe['date'] = pd.to_datetime(roe['date'])

Here is the roe data, which contains an nm value:
    date    value
30  2010-09-25  0.352835
31  2010-06-26  0.354701
32  2010-03-27  0.274779
33  2009-12-26  0.261631
34  2009-09-26  0.305356
35  2008-09-27  0.274432
36  2007-09-29  nm

Here is the result of roe.dtypes:
In: roe.dtypes
Out: date     datetime64[ns]
     value            object
     dtype: object

However, net_income.dtypes and revenue.dtypes both produce the following output:

In: net_income.dtypes  # revenue.dtypes gives the same result
Out: date     datetime64[ns]
     value           float64
     dtype: object

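Your change that converts roe from object to float works for plotting the graph. When I combine everything into one function as the last step, I get an invalid syntax error, as shown below: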

File "<ipython-input-141-537d7c6c91a3>", line 28
    fig axs = plt.subplots(3)

This is for the following function, written with your help:

def grab_intrinio(ticker):
    try:
        net_income_data = requests.get('https://api.intrinio.com/historical_data?identifier=' + ticker + '&item=netincome', auth=(api_username, api_password)) # 
        net_income1 = net_income_data.json()['data']
        net_income = pd.DataFrame(net_income1)
        net_income['date'] = pd.to_datetime(net_income['date'])

        revenue_data = requests.get('https://api.intrinio.com/historical_data?identifier=' + ticker + '&item=totalrevenue', auth=(api_username, api_password))
        revenue1 = revenue_data.json()['data']
        revenue = pd.DataFrame(revenue1)

        revenue['date'] = pd.to_datetime(revenue['date'])
        revenue

        roe_data = requests.get('https://api.intrinio.com/historical_data?identifier=' + ticker + '&item=roe', auth=(api_username, api_password))
        roe1 = roe_data.json()['data']
        roe = pd.DataFrame(roe1)
        roe['date'] = pd.to_datetime(roe['date'])
        roe.index = roe['date']
        roe = roe.drop(columns=['date'])
        nm_idx = roe['value'] =='nm'

        roe.value[nm_idx] = np.nan
        roe.value = roe.value.astype(float)

        fig axs = plt.subplots(3)
        for ax, dat in zip(axs, [net_income, Revenue, roc]):
            ax.plot(dat['date'], dat['value'])

    except exception as e:
        print('failed in the main loop',str(e))
        pass

grab_intrinio('AAPL')    

Thanks in advance for your help.

1 answer:

Answer 0 (score: 0)

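np.loadtxt expects a file name, an open file, or a sequence of text lines that it can parse. That is why it works when you give it a path but not when you give it an array of values that are already in memory.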
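
To illustrate the difference, here is a minimal sketch. It assumes a two-line, tab-separated text in the layout of revenue.txt and a two-row object array shaped like revenue_ar; the variable names text, values, dates and amounts are only illustrative. Text goes through np.loadtxt, while an array that is already in memory is simply sliced by column.

```python
import io
import numpy as np

# Text in the same two-column layout as revenue.txt (tab-separated by assumption)
text = "2007-09-29\t2.457800e+10\n2008-09-27\t3.749100e+10\n"

# loadtxt parses lines of text; here it reads only the numeric column
values = np.loadtxt(io.StringIO(text), delimiter="\t", usecols=1)

# An in-memory array is already parsed, so slice its columns instead
revenue_ar = np.array([["2018-03-31", 247417000000.0],
                       ["2017-12-30", 239176000000.0]], dtype=object)
dates = revenue_ar[:, 0].astype(str).astype("datetime64[D]")
amounts = revenue_ar[:, 1].astype(float)

print(values)
print(dates, amounts)
```

With the real data, revenue.values could take the place of the hand-written revenue_ar above.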

So you can evidently get valid JSON data via requests.get, decode it with

revenue1 = revenue_data.json()['data']

and put it in a data frame with

df = pd.DataFrame(revenue1)

This is what it looks like:

In: df.head()
Out: 
         date         value
0  2018-01-31  247417000000
1  2017-12-30  239176000000
2  2017-09-30  229234000000
3  2017-07-01  223507000000

This is how to check the data types of the columns in the data frame:

In: df.dtypes
Out: 
date     object
value     int64
dtype: object

value is an integer, which is fine, but date has not been parsed; it is just object data, so let's fix that:

df['date'] = pd.to_datetime(df['date'])

In: df
Out: 
        date         value
0 2018-01-31  247417000000
1 2017-12-30  239176000000
2 2017-09-30  229234000000
3 2017-07-01  223507000000

In: df.dtypes
Out: 
date     datetime64[ns]
value             int64
dtype: object

Now that date has the right data type, you can plot it with

plt.plot(df['date'], df['value'])
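
The conversion steps can also be tried without the API. The following is a small sketch that assumes the first three records of revenue1 as printed in the question; sorting by date is an extra step added here so that a line plot runs from the oldest to the newest point.

```python
import pandas as pd

# First rows of revenue1 as shown in the question
records = [
    {"date": "2018-03-31", "value": 247417000000.0},
    {"date": "2017-12-30", "value": 239176000000.0},
    {"date": "2017-09-30", "value": 229234000000.0},
]

df = pd.DataFrame(records)
print(df.dtypes)  # date still holds plain strings at this point

# Parse the strings into real timestamps, then order the rows oldest first
df["date"] = pd.to_datetime(df["date"])
df = df.sort_values("date").reset_index(drop=True)
print(df.dtypes)  # date is now a datetime64 column
```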


However, it becomes more convenient if you make the date the index:

df.index = pd.to_datetime(df['date'])
df = df.drop(columns=['date'])

because you can then simply call

df.plot()

since pandas has a matplotlib interface.


For your triple plot, you need something like:

fig, axs = plt.subplots(3)
for ax, dat in zip(axs, [net_income, revenue, roe]):
    ax.plot(dat['date'], dat['value'])
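
As a self-contained sketch of this three-panel layout: plt.subplots returns a (figure, axes) pair that must be unpacked with a comma, whereas plt.plot returns a list of line objects, which is where the 'list' object has no attribute 'plot' error comes from. The helper make_frame and all numbers below are placeholders invented for illustration, not real figures, and the Agg backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

def make_frame(values):
    # Hypothetical helper: placeholder values on three dates from the question
    dates = pd.to_datetime(["2017-09-30", "2017-12-30", "2018-03-31"])
    return pd.DataFrame({"date": dates, "value": values})

net_income = make_frame([1.0, 2.0, 3.0])  # placeholder values
revenue = make_frame([4.0, 5.0, 6.0])     # placeholder values
roe = make_frame([0.27, 0.30, 0.35])      # placeholder values

# One row of axes per series, sharing the date axis
fig, axs = plt.subplots(3, sharex=True)
series = [("net_income", net_income), ("revenue", revenue), ("roe", roe)]
for ax, (name, dat) in zip(axs, series):
    ax.plot(dat["date"], dat["value"])
    ax.set_ylabel(name)
```

In an interactive session, plt.show() after the loop displays the figure.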

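Because of the nm entries, part of your data cannot be converted to float. Replace them with np.nan so that the plotting command can handle them. With your data: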

In: roe
Out: 
          date     value
30  2010-09-25  0.352835
31  2010-06-26  0.354701
32  2010-03-27  0.274779
33  2009-12-26  0.261631
34  2009-09-26  0.305356
35  2008-09-27  0.274432
36  2007-09-29        nm

roe.index = roe['date']
roe = roe.drop(columns=['date'])
nm_idx = roe['value'] == 'nm'

roe.loc[nm_idx, 'value'] = np.nan
roe.value = roe.value.astype(float)

In: roe
Out: 
               value
date                
2010-09-25  0.352835
2010-06-26  0.354701
2010-03-27  0.274779
2009-12-26  0.261631
2009-09-26  0.305356
2008-09-27  0.274432
2007-09-29       NaN

In: roe.dtypes
Out: 
value    float64
dtype: object

roe.plot()
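
An alternative sketch, not the approach used above: pd.to_numeric with errors="coerce" converts every entry that cannot be parsed as a number to NaN in one step, so it would also catch markers other than nm. The sample frame repeats the last three roe rows from the question.

```python
import pandas as pd

# Last rows of the roe data from the question, including the 'nm' entry
roe = pd.DataFrame({
    "date": ["2009-09-26", "2008-09-27", "2007-09-29"],
    "value": [0.305356, 0.274432, "nm"],
})
roe["date"] = pd.to_datetime(roe["date"])

# errors="coerce" turns anything that is not a number into NaN
roe["value"] = pd.to_numeric(roe["value"], errors="coerce")
roe = roe.set_index("date")
print(roe.dtypes)
```

After this, roe.plot() works in the same way as before.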
