Problem reading a CSV file

Time: 2018-12-04 12:30:23

Tags: python-3.x charts

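I'm new to Python.

I'm trying to load the columns of a file into Python and then display a chart, but I keep getting one error after another.

I have a CSV file with 2 columns.

All I want to do is read the columns and show them on a chart! I originally used a DataFrame, but after many failed attempts I ended up with this.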

Code

import matplotlib.pyplot as plt
import csv

import pandas as pd
with open('religion.csv') as file:
  reader = csv.reader(file)

  count = 0

  for row in reader:
      print(row)

      if count > 5:
          break
      count +=1


# use the scatter function
#plt.scatter(x, y, alpha=0.5)

x = reader['religions']
y = reader['students']
plt.scatter(x, y, alpha=0.5)

plt.show()

Excel file

[Screenshot]

File and code

[Screenshot]

Sample data

   religions          schuler
Romisch-Katholisch     371
Moslem                 298
Ohne Bekenntnis        182
Serbisch-Orthodox      120
Evangelisch A.B.        26
Rumnisch-Orthodox       15
Sonstige Religion       9

Updated code (still not working)

import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_fwf('religion.csv')

df.columns.tolist()
x = df['religions']
y = df['schuler']

df.columns.tolist()
plt.scatter(x, y, alpha=0.5)
plt.show()

Folder location

[Screenshot]

Current error

KeyError                                  Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3077             try:
-> 3078                 return self._engine.get_loc(key)
   3079             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'religions'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-6-f2e811496fb9> in <module>()
----> 1 x = df['religions']

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2686             return self._getitem_multilevel(key)
   2687         else:
-> 2688             return self._getitem_column(key)
   2689 
   2690     def _getitem_column(self, key):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   2693         # get column
   2694         if self.columns.is_unique:
-> 2695             return self._get_item_cache(key)
   2696 
   2697         # duplicate columns & possible reduce dimensionality

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   2487         res = cache.get(item)
   2488         if res is None:
-> 2489             values = self._data.get(item)
   2490             res = self._box_item_values(item, values)
   2491             cache[item] = res

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   4113 
   4114             if not isna(item):
-> 4115                 loc = self.items.get_loc(item)
   4116             else:
   4117                 indexer = np.arange(len(self.items))[isna(self.items)]

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3078                 return self._engine.get_loc(key)
   3079             except KeyError:
-> 3080                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   3081 
   3082         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'religions'

2 answers:

Answer 0 (score: 1)

1. When reading the CSV file, you need to specify sep=';'

df = pd.read_csv("C:/Test/rel.csv", sep=';')

df
Out[417]: 
            religions  schuler
0  Romisch-Katholisch      371
1              Moslem      298
2     Ohne Bekenntnis      182
3   Serbisch-Orthodox      120
4    Evangelisch A.B.       26
5   Rumnisch-Orthodox       15
6   Sonstige Religion        9
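
Why the separator matters can be seen in a small sketch. It assumes, as this answer does, that the file is semicolon-delimited; the in-memory string below is only a stand-in for religion.csv. With the default comma separator each line is read as a single field, so no column named 'religions' exists and df['religions'] raises a KeyError.

```python
import io

import pandas as pd

# Stand-in for the file on disk; ASSUMED to be semicolon-delimited.
raw = """religions;schuler
Romisch-Katholisch;371
Moslem;298
Ohne Bekenntnis;182
Serbisch-Orthodox;120
Evangelisch A.B.;26
Rumnisch-Orthodox;15
Sonstige Religion;9
"""

# Default separator is a comma, so the whole header becomes one column name.
wrong = pd.read_csv(io.StringIO(raw))
print(wrong.columns.tolist())  # ['religions;schuler']

# With sep=';' the header splits into the two expected columns.
df = pd.read_csv(io.StringIO(raw), sep=";")
print(df.columns.tolist())  # ['religions', 'schuler']
```

Printing df.columns.tolist() right after loading is a quick way to check which column names pandas actually found before indexing into the frame.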

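2. You can plot it with the DataFrame's built-in plot method (df.plot)

This uses matplotlib under the hood, and you can specify the x and y columns. (I used a bar chart, but you can pass any other type supported by the kind argument):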

df.plot(x='religions', y= 'schuler', kind='bar')


Out[418]: <matplotlib.axes._subplots.AxesSubplot at 0xae7e518>
[Plot image]

Image link: https://i.stack.imgur.com/8u0xs.png
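
For reference, a self-contained sketch of the bar chart, with the frame built directly from the sample data in the question. The Agg backend and the in-memory PNG buffer are assumptions made so the sketch runs without a display; in an interactive session you would drop the backend line and call plt.show() instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption for this sketch only
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "religions": [
        "Romisch-Katholisch", "Moslem", "Ohne Bekenntnis", "Serbisch-Orthodox",
        "Evangelisch A.B.", "Rumnisch-Orthodox", "Sonstige Religion",
    ],
    "schuler": [371, 298, 182, 120, 26, 15, 9],
})

# One bar per row: x supplies the tick labels, y the bar heights.
ax = df.plot(x="religions", y="schuler", kind="bar", legend=False)
ax.set_ylabel("schuler")
ax.figure.tight_layout()  # keep the rotated tick labels inside the figure

# Render to an in-memory buffer instead of opening a window.
buf = io.BytesIO()
ax.figure.savefig(buf, format="png")
```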

Answer 1 (score: 0)

Using pandas and matplotlib is a good choice. Try importing the CSV file like this:

df = pd.read_csv("religion.csv")

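If your CSV file has no column header row, pass the column names as a list to the names parameter. Also, if you don't want the first column of the DataFrame to be used as the index column, set the index_col parameter to False. You can find the read_csv documentation here: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html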
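
A minimal sketch of those two parameters, assuming a headerless, semicolon-delimited file. The three-row string is a hypothetical stand-in for such a file, and the column names are the ones used in the question.

```python
import io

import pandas as pd

# Hypothetical headerless file: data rows only, no header line.
raw = "Romisch-Katholisch;371\nMoslem;298\nOhne Bekenntnis;182\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep=";",                          # assumed delimiter
    names=["religions", "schuler"],   # supply the column names ourselves
    index_col=False,                  # do not use the first column as the index
)
print(df)
```

When names is passed and header is left at its default, pandas treats the first line as data rather than as a header.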

Then plot the data with pyplot:

# the count column in the sample data is named 'schuler', not 'students'
plt.scatter(df['religions'], df['schuler'])
plt.show()
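
A self-contained version of the scatter plot might look like the sketch below. It assumes the count column is called schuler, as in the sample data, and a matplotlib version that accepts string values on an axis (categorical axes). The Agg backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption for this sketch only
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "religions": [
        "Romisch-Katholisch", "Moslem", "Ohne Bekenntnis", "Serbisch-Orthodox",
        "Evangelisch A.B.", "Rumnisch-Orthodox", "Sonstige Religion",
    ],
    "schuler": [371, 298, 182, 120, 26, 15, 9],
})

fig, ax = plt.subplots()
# String x values are placed on a categorical axis, one position per label.
points = ax.scatter(df["religions"], df["schuler"], alpha=0.5)
ax.tick_params(axis="x", labelrotation=90)  # long labels overlap otherwise
fig.tight_layout()
```

Since the x values are categories rather than numbers, a bar chart is often easier to read for this kind of data, but the scatter call itself works the same way.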