Filter a DataFrame based on the values in a column

Time: 2021-04-28 11:47:53

Tags: python-3.x pandas dataframe

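I am trying to filter the rows of a DataFrame based on the value in one column. Rows where the value is greater than 2 should be kept, and rows where it is less than 2 should be dropped. Sample DataFrame: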

| Number | Machine Name | Number of jobs |
|:-------|:------------:|---------------:|
| One    | Power Drill  | 1              |
| Two    | Wench        | 3              |
| Three  | Screwdriver  | 9              |

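Expected result: the code should look at the third column, "Number of jobs", and keep the row if the number is greater than 2; otherwise it should ignore the row. Here is my code: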

import pandas as pd
import openpyxl
import glob
import os

source_folder = r'C:\Users\Ahmed_Abdelmuniem\Desktop\YYY'

file_names = glob.glob(os.path.join(source_folder, '*.xlsm'))

target_file = 'XXX.xlsm'

table_2 = pd.read_excel(target_file)

job_list = [3,4,5,6,7,8,9,10]
new_df = table_2['Number of jobs'].isin(job_list)

I get the following error:

C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\python.exe "C:/Users/Ahmed_Abdelmuniem/PycharmProjects/Rig Visualisation/main.py"
Traceback (most recent call last):
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Number of jobs'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\Ahmed_Abdelmuniem\PycharmProjects\Rig Visualisation\main.py", line 57, in <module>
    new_df = table_2['Number of jobs'].isin(job_list)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\frame.py", line 3024, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\Users\Ahmed_Abdelmuniem\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py", line 3082, in get_loc
    raise KeyError(key) from err
KeyError: 'Number of jobs'

Process finished with exit code 1
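
The `KeyError` says that no column labelled `Number of jobs` exists in `table_2.columns`. Separately, even with a matching label, `isin` on its own returns a boolean Series rather than the filtered rows. A minimal sketch of the intended filter follows; `sample` is a hypothetical hand-built stand-in for the sheet based on the sample table above, not the actual workbook.

```python
import pandas as pd

# Hypothetical stand-in for the Excel data, copied from the sample table.
sample = pd.DataFrame({
    "Number": ["One", "Two", "Three"],
    "Machine Name": ["Power Drill", "Wench", "Screwdriver"],
    "Number of jobs": [1, 3, 9],
})

# Listing the labels shows exactly which column names pandas knows about.
print(list(sample.columns))

# isin() alone yields a boolean mask: one True/False per row.
mask = sample["Number of jobs"].isin([3, 4, 5, 6, 7, 8, 9, 10])

# The mask (or a plain comparison) has to index the frame to select rows.
# "> 2" expresses the rule without listing every allowed value.
filtered = sample[sample["Number of jobs"] > 2]
print(filtered)
```

Indexing with `sample[mask]` would give the same rows here; the comparison just avoids maintaining a list of allowed values.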

Edit: after further investigation, I found the following:

   Unnamed: 0  Unnamed: 1    Unnamed: 2  Unnamed: 3
0         NaN      Number  Machine Name        Jobs
1         NaN           1   Power Drill           1
2         NaN           2         Wench           4
3         NaN           3   Screwdriver           5

pandas does not seem to read the column names; it reads the columns as unnamed instead. Why is that?
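
One explanation consistent with the output above is that the table does not start in the sheet's top-left cell. With a blank first row and a blank first column, pandas would take the blank row as the header and label every column `Unnamed: N`. A minimal sketch under that assumption: `raw` is a hand-built copy of the printed frame, and the `header=1` and `usecols="B:D"` values in the final comment are guesses about the workbook layout, not verified against the file.

```python
import pandas as pd

# Hand-built copy of the frame printed above (assumption: values as shown).
raw = pd.DataFrame({
    "Unnamed: 0": [float("nan")] * 4,
    "Unnamed: 1": ["Number", 1, 2, 3],
    "Unnamed: 2": ["Machine Name", "Power Drill", "Wench", "Screwdriver"],
    "Unnamed: 3": ["Jobs", 1, 4, 5],
})

# Drop columns that are entirely empty, then promote the first row to header.
fixed = raw.dropna(axis=1, how="all")
fixed.columns = fixed.iloc[0].tolist()
fixed = fixed.iloc[1:].reset_index(drop=True)

# The header text sat in the data, so the numbers were read as objects.
fixed = fixed.assign(Jobs=pd.to_numeric(fixed["Jobs"]))

# Note that the label in this output is "Jobs", not "Number of jobs".
result = fixed[fixed["Jobs"] > 2]
print(result)

# Reading the file directly, the equivalent might look like this
# (assumed layout: header in the second row, data in columns B to D):
# table_2 = pd.read_excel(target_file, header=1, usecols="B:D")
```

If the layout guess holds, passing `header` and `usecols` to `pd.read_excel` avoids the clean-up steps entirely.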

0 answers:

No answers