I have 5 CSV files that contain data like this:
PID, STARTED,%CPU,%MEM,COMMAND
1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init
2,Wed Sep 12 10:10:21 2018, 0.0, 0.0,kthreadd
3,Wed Sep 12 10:10:21 2018, 0.0, 0.0,migration/0
4,Wed Sep 12 10:10:21 2018, 0.0, 0.0,ksoftirqd/0
5,Wed Sep 12 10:10:21 2018, 0.0, 0.0,stopper/0
Now I want to:
1. Extract the PID, %MEM and COMMAND from the files and store them in an iterable variable such as a list or array.
2. Compare the data extracted from file1 with the data extracted from file2, file3, file4 and file5.
3. Finally, find out whether any process is repeated.
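To make the three steps concrete, here is a minimal sketch. It uses the standard-library csv module rather than pandas, and it assumes that "repeated" means the same PID and COMMAND pair appears in file1 and in at least one other file. The helper names extract and repeated, and the second sample file, are hypothetical and only there for illustration.

```python
import csv
import io


def extract(csv_file):
    """Return a list of (pid, mem, command) tuples from one open CSV file."""
    # skipinitialspace drops the blank that follows each comma in the data.
    reader = csv.reader(csv_file, skipinitialspace=True)
    header = [name.strip() for name in next(reader)]
    pid_i = header.index("PID")
    mem_i = header.index("%MEM")
    cmd_i = header.index("COMMAND")
    return [
        (int(row[pid_i]), float(row[mem_i]), row[cmd_i])
        for row in reader
        if row  # skip empty lines
    ]


def repeated(first, others):
    """Return (pid, command) pairs from `first` that occur in any of `others`."""
    seen_elsewhere = {(pid, cmd) for rows in others for pid, _mem, cmd in rows}
    return [(pid, cmd) for pid, _mem, cmd in first if (pid, cmd) in seen_elsewhere]


# Hypothetical in-memory stand-ins for file1 and file2 (not the real files).
FILE1 = (
    "PID, STARTED,%CPU,%MEM,COMMAND\n"
    "1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init\n"
    "2,Wed Sep 12 10:10:21 2018, 0.0, 0.0,kthreadd\n"
)
FILE2 = (
    "PID, STARTED,%CPU,%MEM,COMMAND\n"
    "1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init\n"
    "9,Wed Sep 12 10:10:21 2018, 0.0, 0.0,example\n"
)

first = extract(io.StringIO(FILE1))
second = extract(io.StringIO(FILE2))
result = repeated(first, [second])
print(result)
```

With real files, each io.StringIO(...) would be replaced by an open(path, newline="") handle, and the list passed to repeated would hold the rows from file2 to file5.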
So far I have found the pandas library for extracting the data, but it is not working properly. Please take a look at my code and suggest changes:
import pandas as pd
df=pd.read_csv(file)
pidList=[]
#print(df['PID'])
pidList=df['PID']
print(pidList)
I am getting this error:
File "C:/Users/manoj.kumar5/Desktop/MLRep/MemoryLeakDetails/Check.py", line 79, in <module>
pidList=df['PID']
File "C:\Python3.7\lib\site-packages\pandas\core\frame.py", line 2927, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Python3.7\lib\site-packages\pandas\core\indexes\base.py", line 2658, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'PID'
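A KeyError like this means that pandas has no column labelled exactly 'PID'. One possible cause, which is only an assumption here since the real file is not shown, is stray whitespace around the header names. The sketch below prints the labels as pandas parsed them and then strips them. The sample text is hypothetical and deliberately has spaces around PID.

```python
import io

import pandas as pd

# Hypothetical sample: same layout as the data above, but with stray
# spaces around the first header name (an assumption, not the real file).
SAMPLE = (
    " PID , STARTED,%CPU,%MEM,COMMAND\n"
    "1,Wed Sep 12 10:10:21 2018, 0.0, 0.0,init\n"
    "2,Wed Sep 12 10:10:21 2018, 0.0, 0.0,kthreadd\n"
)

df = pd.read_csv(io.StringIO(SAMPLE))

# Show the column labels exactly as pandas parsed them.
raw_columns = df.columns.tolist()
print(raw_columns)

# Strip surrounding whitespace so that df["PID"] can be found.
df.columns = df.columns.str.strip()
pid_list = df["PID"].tolist()
print(pid_list)
```

If the printed labels look different in some other way (for example a single label containing the whole header line), the delimiter or the first lines of the file would be the next thing to check.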
Any help would be appreciated. Thanks.