How to plot a 3D histogram in Python

Time: 2014-04-11 10:02:39

Tags: python arrays matplotlib

My dataset contains accident records, each with the region and the year in which the accident occurred.

> Accident_ID   Region   Year  
> 213              1       2003 
> 234              2       2001
> 334              2       2004
> ..


years= [0.0, 1661.0, 1665.0, 1706.0, 1729.0, 1765.0, 1779.0, 1780.0, 1785.0, 1798.0, 1799.0, 1800.0, 1801.0, 1802.0, 1804.0, 1805.0, 1812.0, 1814.0, 1816.0, 1821.0, 1822.0, 1824.0, 1825.0, 1826.0, 1827.0, 1829.0, 1830.0, 1831.0, 1832.0, 1833.0, 1834.0, 1835.0, 1836.0, 1837.0, 1838.0, 1839.0, 1840.0, 1841.0, 1842.0, 1843.0, 1844.0, 1845.0, 1846.0, 1847.0, 1848.0, 1849.0, 1850.0, 1851.0, 1852.0, 1853.0, 1854.0, 1855.0, 1856.0, 1857.0, 1858.0, 1859.0, 1860.0, 1861.0, 1862.0, 1863.0, 1864.0, 1865.0, 1866.0, 1867.0, 1868.0, 1869.0, 1870.0, 1871.0, 1872.0, 1873.0, 1874.0, 1875.0, 1876.0, 1877.0, 1878.0, 1879.0, 1880.0, 1881.0, 1882.0, 1883.0, 1884.0, 1885.0, 1886.0, 1887.0, 1888.0, 1889.0, 1890.0, 1891.0, 1892.0, 1893.0, 1894.0, 1895.0, 1896.0, 1897.0, 1898.0, 1899.0, 1900.0, 1901.0, 1902.0, 1903.0, 1904.0, 1905.0, 1906.0, 1907.0, 1908.0, 1909.0, 1910.0, 1911.0, 1912.0, 1913.0, 1914.0, 1915.0, 1916.0, 1917.0, 1918.0, 1919.0, 1920.0, 1921.0, 1922.0, 1923.0, 1924.0, 1925.0, 1926.0, 1927.0, 1928.0, 1929.0, 1930.0, 1931.0, 1932.0, 1933.0, 1934.0, 1935.0, 1936.0, 1937.0, 1938.0, 1939.0, 1940.0, 1941.0, 1942.0, 1943.0, 1944.0, 1945.0, 1946.0, 1947.0, 1948.0, 1949.0, 1950.0, 1951.0, 1952.0, 1953.0, 1954.0, 1955.0, 1956.0, 1957.0, 1958.0, 1959.0, 1960.0, 1961.0, 1962.0, 1963.0, 1964.0, 1965.0, 1966.0, 1967.0, 1968.0, 1969.0, 1970.0, 1971.0, 1972.0, 1973.0, 1974.0, 1975.0, 1976.0, 1977.0, 1978.0, 1979.0, 1980.0, 1981.0, 1982.0, 1983.0, 1984.0, 1985.0, 1986.0, 1987.0, 1988.0, 1989.0, 1990.0, 1991.0, 1992.0, 1993.0, 1994.0, 1995.0, 1996.0, 1997.0, 1998.0, 1999.0, 2000.0, 2001.0, 2002.0, 2003.0, 2004.0, 2005.0, 2006.0, 2007.0, 2008.0, 2009.0, 2010.0, 2011.0, 2012.0, 2013.0]

Frequency_accidents_years= [44815, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 556, 1, 1, 1, 1, 1, 1, 1, 1, 3, 4, 2, 5, 3, 20, 11, 6, 5, 1, 7, 6, 6, 2, 4, 6, 19, 9, 11, 10, 18, 18, 8, 9, 13, 20, 43, 7, 13, 11, 6, 13, 12, 6, 7, 9, 34, 3, 2, 3, 1, 7, 6, 4, 8, 11, 56, 18, 5, 4, 4, 16, 2, 1, 3, 3, 146, 49, 10, 7, 10, 22, 18, 14, 18, 17, 397, 46, 12, 14, 12, 53, 39, 18, 28, 25095, 9663, 26717, 131, 180, 268, 7660, 754, 641, 354, 873, 47024, 705, 720, 578, 598, 16547, 653, 516, 255, 296, 92079, 1161, 1175, 1634, 2111, 71121, 3158, 3289, 4355, 2136, 77654, 33007, 1253, 983, 365, 25554, 651, 665, 762, 968, 38485, 745, 326, 199, 176, 25048, 343, 368, 604, 753, 46674, 775, 683, 562, 645, 26992, 768, 959, 816, 922, 37271, 796, 915, 1101, 945, 19687, 618, 614, 620, 509, 17169, 497, 623, 853, 854, 9755, 662, 725, 999, 593, 5469, 554, 778, 1163, 1342, 3470, 3755, 3810, 3597, 3613, 3504, 2263, 3173, 2465, 2135, 2558, 3476, 3164, 2755, 3715, 4187, 4540, 4203, 4445, 6541, 5994, 4873, 4085, 2899, 1806, 1157, 1331, 1246, 424]

regions = range(1, 100)  # can be generated this way, e.g. region1, region2, ...

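I want to plot this data as a 3D histogram so that I can analyze the dataset better. Specifically, I want to plot accident frequency by region and year.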

from collections import Counter
data.pandas("file.csv")

.. 
.. 

#Make 3D Hist
fig = plt.figure()
ax = plt.axes(projection='3d')
ax.plot(xs=years, ys=Frequency_accidents_years, zs=regions, marker='o', linestyle='--', color='r',label="name")
plt.show()

This is where I got stuck.

I get this error:

line 49 ..

  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/pyplot.py", line 561, in savefig
    return fig.savefig(*args, **kwargs)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/figure.py", line 1421, in savefig
    self.canvas.print_figure(*args, **kwargs)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 2220, in print_figure
    **kwargs)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 1962, in print_png
    return agg.print_png(*args, **kwargs)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 505, in print_png
    FigureCanvasAgg.draw(self)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 451, in draw
    self.figure.draw(self.renderer)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/figure.py", line 1034, in draw
    func(*args)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/mpl_toolkits/mplot3d/axes3d.py", line 270, in draw
    Axes.draw(self, renderer)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/artist.py", line 55, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/matplotlib/axes.py", line 2086, in draw
    a.draw(renderer)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/mpl_toolkits/mplot3d/art3d.py", line 117, in draw
    xs, ys, zs = proj3d.proj_transform(xs3d, ys3d, zs3d, renderer.M)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/mpl_toolkits/mplot3d/proj3d.py", line 194, in proj_transform
    return proj_transform_vec(vec, M)
  File "/Users/macbook/anaconda/lib/python2.7/site-packages/mpl_toolkits/mplot3d/proj3d.py", line 153, in proj_transform_vec
    vecw = np.dot(M, vec)
ValueError: operands could not be broadcast together with shapes (210) (858312)

2 answers:

Answer 0 (score: 3)

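I tried to come up with an example of how to plot a 3D histogram with Matplotlib's bar3d, given the structure of your dataset and using pandas' groupby, as suggested by Davidmh.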

Based on what you posted above, I used the following dataset (accidents.csv):

Accident_ID,Region,Year
213,1,2003
214,1,2003
214,2,2008
213,2,2007
210,2,2007
210,3,2004
210,1,2004
213,1,2004
210,1,2004

This script reads the dataset, groups the data, and builds the 3D histogram:

import matplotlib
import matplotlib.pyplot as plt

from pandas import read_csv
from mpl_toolkits.mplot3d import Axes3D

# Read CSV dataset file

df = read_csv('accidents.csv')

# Group by year and region

group_year_region = df.groupby(['Year', 'Region'])
group_keys = list(group_year_region.groups.keys())  # list of (year, region) tuples

# Get the years and regions series

# Lists (not map objects) so that len() also works on Python 3
xpos = [k[0] - 0.5 for k in group_keys]
ypos = [k[1] - 0.5 for k in group_keys]
zpos = [0] * len(xpos)

# Count number of accidents by (year, region) group

acc_by_year_region = group_year_region.count()['Accident_ID']

dx = 1
dy = 1
dz = [acc_by_year_region[key] for key in group_keys]

# Plot bar3d histogram

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.bar3d(xpos, ypos, zpos, dx, dy, dz, color='b', zsort='average')

# Years should not be presented in exponential notation

x_formatter = matplotlib.ticker.ScalarFormatter(useOffset=False)
ax.xaxis.set_major_formatter(x_formatter)

# Set labels and show plot

ax.set_xticks([k[0] for k in group_keys])
ax.set_yticks([k[1] for k in group_keys])
ax.set_zticks(dz)

ax.set_xlabel('Years')
ax.set_ylabel('Regions')
ax.set_zlabel('# Accidents')
plt.show()

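The resulting plot (image not preserved here) shows one bar per (year, region) pair, with the bar height giving the number of accidents.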

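Edit

There was a problem with how the axes were built, and the values were out of order. This is now fixed and the plot image has been updated.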
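The edit note does not say what exactly was changed. Purely as an illustration, here is a minimal sketch of one way to get ordered, duplicate-free tick values from (year, region) keys. The `group_keys` list below is made-up sample data, and the names `year_ticks` and `region_ticks` are my own, not from the script above.

```python
# Hypothetical (year, region) group keys, deliberately out of order
# and with repeated years and regions.
group_keys = [(2008, 2), (2003, 1), (2004, 3), (2007, 2), (2004, 1)]

# A set removes duplicates; sorted() puts the values in ascending order.
year_ticks = sorted({year for year, _ in group_keys})
region_ticks = sorted({region for _, region in group_keys})

print(year_ticks)
print(region_ticks)
```

Lists like these could then be passed to `ax.set_xticks` and `ax.set_yticks` instead of the raw, repeated key values.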

Answer 1 (score: 0)

The problem is that the shapes of your data do not match. You are passing three vectors, and they must all have the same length.
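A quick way to catch this before plotting is to compare the lengths up front. This is only a sketch: `check_same_length` is a hypothetical helper of my own (not part of matplotlib or pandas), and the small lists stand in for the real data.

```python
def check_same_length(**vectors):
    """Return a dict of name -> length, raising ValueError if lengths differ."""
    lengths = {name: len(values) for name, values in vectors.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError("length mismatch: %s" % lengths)
    return lengths


# Stand-in data: four years, four frequencies, but only three regions.
years = [2001, 2002, 2003, 2004]
frequencies = [3, 1, 4, 1]
regions = list(range(1, 4))

try:
    check_same_length(xs=years, ys=frequencies, zs=regions)
    message = "ok"
except ValueError as exc:
    # The mismatch is reported here instead of deep inside the 3D projection code.
    message = str(exc)

print(message)
```

Failing early like this gives a readable message rather than a broadcast error raised while the figure is being drawn.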

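Running your example, years and Frequency_... both have 210 elements, but regions has only 99. I think what you actually want is to filter the accidents by region, which you can do with pandas.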
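To make the idea of counting accidents per (year, region) concrete without pandas, here is a minimal sketch that uses only the standard library. Using `collections.Counter` is my own choice for illustration, not something either poster wrote. The CSV text is the sample dataset from the other answer, and the names `xs`, `ys`, `zs` are assumptions.

```python
import csv
import io
from collections import Counter

# Sample dataset from the other answer, inlined so the sketch is self-contained.
CSV_TEXT = """Accident_ID,Region,Year
213,1,2003
214,1,2003
214,2,2008
213,2,2007
210,2,2007
210,3,2004
210,1,2004
213,1,2004
210,1,2004
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Count accidents for every (year, region) pair.
counts = Counter((int(row["Year"]), int(row["Region"])) for row in rows)

# Three vectors of equal length, ordered by year and then region.
keys = sorted(counts)
xs = [year for year, _ in keys]
ys = [region for _, region in keys]
zs = [counts[key] for key in keys]

for year, region, n in zip(xs, ys, zs):
    print(year, region, n)
```

Because `xs`, `ys` and `zs` are all built from the same list of keys, they always have the same length, which is exactly the requirement the original plot call violated.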