用于地图绘制的pandas错误与basemap / proj

时间:2014-04-17 14:37:36

标签: python pandas data-analysis matplotlib-basemap proj

我在下面运行了Python代码,这是"绘图地图:可视化海地地震危机数据"在一本书上, Python for Data Analysis 。页面242-246

该代码应该创建一个海地的情节地图,但我收到如下错误:

Traceback (most recent call last):
  File "Haiti.py", line 74, in <module>
    x, y = m(cat_data.LONGITUDE, cat_data.LATITUDE)
  File "/usr/local/lib/python2.7/site-packages/mpl_toolkits/basemap/__init__.py", line 1148, in __call__
    xout,yout = self.projtran(x,y,inverse=inverse)
  File "/usr/local/lib/python2.7/site-packages/mpl_toolkits/basemap/proj.py", line 286, in __call__
    outx,outy = self._proj4(x, y, inverse=inverse)
  File "/usr/local/lib/python2.7/site-packages/mpl_toolkits/basemap/pyproj.py", line 388, in __call__
    _proj.Proj._fwd(self, inx, iny, radians=radians, errcheck=errcheck)
  File "_proj.pyx", line 122, in _proj.Proj._fwd (src/_proj.c:1571)
RuntimeError

我检查了我的机器上是否安装了 mpl_toolkits.basemap proj 模块。底图是从源as instructed安装的,而proj是由Homebrew安装的,它们看起来很好。

如果您安装了底图和proj,此代码是否成功运行?如果没有,您认为它是模块安装问题,代码本身还是其他任何问题?

Haiti.csv 文件可以从https://github.com/pydata/pydata-book/raw/master/ch08/Haiti.csv

下载
import pandas as pd
import numpy as np
from pandas import DataFrame

data = pd.read_csv('Haiti.csv')

data = data[(data.LATITUDE > 18) & (data.LATITUDE < 20) &
        (data.LONGITUDE > -75) & (data.LONGITUDE < -70)
        & data.CATEGORY.notnull()]

def to_cat_list(catstr):
    stripped = (x.strip() for x in catstr.split(','))
    return [x for x in stripped if x]

def get_all_categories(cat_series):
    cat_sets = (set(to_cat_list(x)) for x in cat_series) 
    return sorted(set.union(*cat_sets))

def get_english(cat):
    code, names = cat.split('.') 
    if '|' in names:
        names = names.split(' | ')[1] 
    return code, names.strip()

all_cats = get_all_categories(data.CATEGORY)
english_mapping = dict(get_english(x) for x in all_cats)

def get_code(seq):
    return [x.split('.')[0] for x in seq if x]

all_codes = get_code(all_cats)
code_index = pd.Index(np.unique(all_codes))
dummy_frame = DataFrame(np.zeros((len(data), len(code_index))),
                        index=data.index, columns=code_index)

for row, cat in zip(data.index, data.CATEGORY): 
    codes = get_code(to_cat_list(cat)) 
    dummy_frame.ix[row, codes] = 1

data = data.join(dummy_frame.add_prefix('category_'))

from mpl_toolkits.basemap import Basemap 
import matplotlib.pyplot as plt

def basic_haiti_map(ax=None, lllat=17.25, urlat=20.25, lllon=-75, urlon=-71):
    # create polar stereographic Basemap instance. 
    m = Basemap(ax=ax, projection='stere', 
                lon_0=(urlon + lllon) / 2, 
                lat_0=(urlat + lllat) / 2,
                llcrnrlat=lllat, urcrnrlat=urlat, 
                llcrnrlon=lllon, urcrnrlon=urlon, 
                resolution='f')
    # draw coastlines, state and country boundaries, edge of map. m.drawcoastlines()
    m.drawstates()
    m.drawcountries()
    return m

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 10)) 
fig.subplots_adjust(hspace=0.05, wspace=0.05)

to_plot = ['2a', '1', '3c', '7a']

lllat=17.25; urlat=20.25; lllon=-75; urlon=-71

for code, ax in zip(to_plot, axes.flat):
    m = basic_haiti_map(ax, lllat=lllat, urlat=urlat,
                        lllon=lllon, urlon=urlon) 

    cat_data = data[data['category_%s' % code] == 1]

    # compute map proj coordinates.
    print cat_data.LONGITUDE, cat_data.LATITUDE
    x, y = m(cat_data.LONGITUDE, cat_data.LATITUDE)

    m.plot(x, y, 'k.', alpha=0.5)
    ax.set_title('%s: %s' % (code, english_mapping[code]))

1 个答案:

答案 0 :(得分:5)

这可以通过将m(cat_data.LONGITUDE,cat_data.LATITUDE)更改为m(cat_data.LONGITUDE.values,cat_data.LATITUDE.values)来解决,这要归功于Alex Messina's finding

随着对我的进一步研究,pandas改变了DataFrame的系列数据(派生自NDFrame)应该与.values一起传递给Cython函数,如自2013年12月31日发布的v0.13.0以来的basemap / proj。 / p>

来自pandas的github commit log

+.. warning::
 +
 +   In 0.13.0 since ``Series`` has internaly been refactored to no longer sub-class ``ndarray``
 +   but instead subclass ``NDFrame``, you can **not pass** a ``Series`` directly as a ``ndarray`` typed parameter
 +   to a cython function. Instead pass the actual ``ndarray`` using the ``.values`` attribute of the Series.
 +
 +   Prior to 0.13.0
 +
 +   .. code-block:: python
 +
 +        apply_integrate_f(df['a'], df['b'], df['N'])
 +
 +   Use ``.values`` to get the underlying ``ndarray``
 +
 +   .. code-block:: python
 +
 +        apply_integrate_f(df['a'].values, df['b'].values, df['N'].values)

您可以找到示例代码here的更正版本。