How do I extract data from specific columns and iterate over it?

Time: 2019-02-27 11:50:29

Tags: python-3.x pandas csv dataframe

I have 5 files in CSV format, and each of them contains data like this:

    PID,                 STARTED,%CPU,%MEM,COMMAND
    1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init
    2,Wed Sep 12 10:10:21 2018, 0.0, 0.0,kthreadd
    3,Wed Sep 12 10:10:21 2018, 0.0, 0.0,migration/0
    4,Wed Sep 12 10:10:21 2018, 0.0, 0.0,ksoftirqd/0
    5,Wed Sep 12 10:10:21 2018, 0.0, 0.0,stopper/0

Now I want to:

1. Extract the PID, %MEM and COMMAND columns from the files and store them in an iterable variable such as a list or array.
2. Compare the data extracted from file1 with the data extracted from file2, file3, file4 and file5.
3. Finally, find out whether any process is repeated.
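
The three steps above could be sketched roughly as follows. This is only a sketch under assumptions: `FILE1` and `FILE2` are hypothetical in-memory stand-ins for the real files (the second one contains made-up rows), `load_processes` and `repeated_in` are illustrative helper names, and "repeated" is taken to mean that the same PID and COMMAND pair shows up in both files. Header names are stripped of surrounding whitespace as a precaution.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-ins for two of the files; real code would pass file paths.
FILE1 = (
    "PID,                 STARTED,%CPU,%MEM,COMMAND\n"
    "    1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init\n"
    "    2,Wed Sep 12 10:10:21 2018, 0.0, 0.0,kthreadd\n"
)
FILE2 = (
    "PID,                 STARTED,%CPU,%MEM,COMMAND\n"
    "    1,Wed Sep 12 10:20:21 2018, 0.0, 0.1,init\n"
    "    9,Wed Sep 12 10:20:21 2018, 0.0, 0.0,example\n"
)

WANTED = ["PID", "%MEM", "COMMAND"]


def load_processes(source):
    """Read one CSV source and keep only the PID, %MEM and COMMAND columns."""
    df = pd.read_csv(source, skipinitialspace=True)
    df.columns = df.columns.str.strip()  # guard against padded header names
    return df[WANTED]


def repeated_in(first, other):
    """Return the processes (same PID and COMMAND) present in both frames."""
    return first.merge(other, on=["PID", "COMMAND"], suffixes=("_first", "_other"))


first = load_processes(io.StringIO(FILE1))
others = {"file2": load_processes(io.StringIO(FILE2))}  # add file3..file5 the same way

# Step 1: a plain list of [PID, %MEM, COMMAND] rows that can be iterated over.
rows = first.values.tolist()

# Steps 2 and 3: compare file1 with each other file and report shared processes.
for name, other in others.items():
    repeated = repeated_in(first, other)
    print(name, repeated[["PID", "COMMAND"]].values.tolist())
```

The inner merge keeps the `%MEM` value from each file side by side (`%MEM_first`, `%MEM_other`), which may be useful if the goal is to watch memory usage of the same process over time.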

So far I have found the pandas library for extracting the data, but it is not working as expected. Please take a look at my code and suggest changes:

    import pandas as pd
    df=pd.read_csv(file)
    pidList=[]
    #print(df['PID'])
    pidList=df['PID']
    print(pidList)

I am getting this error:

  File "C:/Users/manoj.kumar5/Desktop/MLRep/MemoryLeakDetails/Check.py", line 79, in <module>
    pidList=df['PID']
  File "C:\Python3.7\lib\site-packages\pandas\core\frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:\Python3.7\lib\site-packages\pandas\core\indexes\base.py", line 2658, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'PID'

Any help would be appreciated. Thanks.
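
A `KeyError: 'PID'` from `df['PID']` means that no column label is exactly the string `'PID'`. One plausible cause, which is an assumption because the raw bytes of the file are not shown, is that the header names carry padding spaces from column-aligned output, so the label is something like `'  PID'`. A minimal sketch for checking this, using a hypothetical padded sample in place of the real file:

```python
import io
import pandas as pd

# Hypothetical sample whose header names are padded with spaces (an assumption
# about the real file, which is not shown byte for byte).
SAMPLE = (
    "  PID,                 STARTED,%CPU,%MEM,COMMAND\n"
    "    1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init\n"
)

df = pd.read_csv(io.StringIO(SAMPLE))

# Print the exact labels; a list shows quotes, so stray spaces become visible.
print(df.columns.tolist())

# Remove surrounding whitespace from every column label, then select as usual.
df.columns = df.columns.str.strip()
pid_list = df["PID"].tolist()
print(pid_list)
```

If the printed labels turn out to be clean, the cause lies elsewhere, for example a different delimiter or a file without a header row.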

0 answers:

No answers