Making a feature table in pandas

Time: 2018-06-01 09:18:36

Tags: python pandas dataframe

I have two folders, each containing files like these -

file1          file2
a 32           b  32
b 23           d  12
c 28           r  7

Note that a file does not have to contain every letter, and the letters can appear in any order.

Now I want to create a table in this format -

 a   b    c   d........r s t....   class
32  23    28  0
0    32   0   12       7 0 0 ...    0


...................................... 1  

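Each row contains the letter values from one file, with 0 wherever a letter is absent. Class 0 means the file comes from the first folder, and class 1 means it comes from the second folder.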
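To make the target layout concrete, here is a minimal sketch that builds it from the two sample files, assuming they have already been parsed into dictionaries. The names `file1`, `file2`, `records` and `table` are illustrative, and the class label 0 for both rows is an assumption (both files taken from the first folder).

```python
import pandas as pd

# Parsed contents of the two sample files from the question.
file1 = {"a": 32, "b": 23, "c": 28}
file2 = {"b": 32, "d": 12, "r": 7}

# One dict per file. The class label 0 (first folder) is an assumption.
records = [{**file1, "class": 0}, {**file2, "class": 0}]

# pandas aligns the dict keys into columns; letters missing from a file
# become NaN, which fillna(0) turns into 0.
table = pd.DataFrame(records).fillna(0)

# Sort the letter columns and move "class" to the end.
letters = sorted(c for c in table.columns if c != "class")
table = table[letters + ["class"]].astype(int)
print(table)
```

Collecting one dictionary per file and building the frame once avoids having to add columns and rows to an existing DataFrame cell by cell.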

My attempt -

import os
import pandas as pd



dir_list = ["........","........"] #CHANGE INPUT PATH

df = pd.DataFrame(columns=['class'])
count=0
for l in dir_list: 
    for root, dirs, files in os.walk(l):
        for name in files:
            outfile2 = open(root+"/"+name,'r')
            line = outfile2.readline()
            print(name)
            count+=1

            while line:
                words=line.split(" ")

                if words[0] not in df.columns:
                    df[words[0]]=words[1]

                elif words[0] in df.columns:
                    df.iloc[count-1][words[0]]=words[1]



                line = outfile2.readline()

            if l=="   ":
               df[count-1]['class']='M'
            else:
               df[count-1]['class']='B'


df=df.fillna(0)

print(df)      
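One possible way to restructure the attempt is to collect a plain dictionary per file and create the DataFrame only once at the end. This is a sketch under assumptions, not a confirmed fix: the helper name `build_feature_table` is hypothetical, each line is assumed to hold a letter and a number separated by whitespace, and the class is taken as the folder's position in `dir_list` (0 or 1, as in the description, rather than 'M'/'B').

```python
import os
import pandas as pd


def build_feature_table(dir_list):
    """Return one row per file; class is the folder's index in dir_list."""
    records = []
    for label, folder in enumerate(dir_list):
        for root, _dirs, files in os.walk(folder):
            for name in sorted(files):
                row = {}
                with open(os.path.join(root, name)) as handle:
                    for line in handle:
                        parts = line.split()
                        if len(parts) < 2:
                            continue  # skip blank or malformed lines
                        row[parts[0]] = float(parts[1])
                row["class"] = label
                records.append(row)

    if not records:
        # No files found: return an empty table with just the class column.
        return pd.DataFrame(columns=["class"])

    # Letters missing from a file become NaN, then 0.
    table = pd.DataFrame(records).fillna(0)
    features = sorted(c for c in table.columns if c != "class")
    return table[features + ["class"]]
```

Calling `build_feature_table(dir_list)` with the two folder paths would then return the table directly, with no per-cell assignment into the frame.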

Error -

Traceback (most recent call last):
  File "E:\anaconda\lib\site-packages\pandas\core\indexes\base.py", line 2525, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "new.py", line 34, in <module>
    df[count-1]['class']='M'
  File "E:\anaconda\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
    return self._getitem_column(key)
  File "E:\anaconda\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
    return self._get_item_cache(key)
  File "E:\anaconda\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
    values = self._data.get(item)
  File "E:\anaconda\lib\site-packages\pandas\core\internals.py", line 3843, in get
    loc = self.items.get_loc(item)
  File "E:\anaconda\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 0
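The `KeyError: 0` comes from the expression `df[count-1]['class']`: indexing a DataFrame with a single scalar, as in `df[0]`, looks up a column labelled 0, not row 0, and no such column exists. The small sketch below reproduces this on a made-up two-row frame (the data is illustrative only) and shows row/column assignment with `.loc` instead. Note that the frame in the attempt also appears to have no rows at that point, so rows would need to be created before any cell can be set.

```python
import pandas as pd

# Illustrative frame with two rows; not the data from the question.
df = pd.DataFrame({"a": [32, 0], "class": [None, None]})

# df[0] asks for a COLUMN labelled 0, which does not exist here.
try:
    df[0]["class"] = "M"
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)

# Select row and column in one step with .loc to set a single cell.
df.loc[0, "class"] = "M"
df.loc[1, "class"] = "B"
print(df)
```

The same applies to `df.iloc[count-1][words[0]] = words[1]`: chained indexing of this kind may assign to a temporary copy, so a single `.loc[row, column]` or `.iloc[row, column_position]` call is the safer form.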


0 answers:

No answers