Creating a pivot or summary of text data

Asked: 2014-11-14 18:17:27

Tags: python python-2.7 numpy pandas

What am I doing wrong with the pandas package when pivoting a list of lists?

paramMatrix is a list of lists:

import glob
import os

import pandas


def populateMatrix():
    mydir = "C:\\Python27"
    os.chdir(mydir)
    paramMatrix = []
    for file in glob.glob("*.txt1"):
        print(file)
        with open(os.path.join(mydir, file)) as f:
            for line in f:
                # Each line is "name=value"; prepend the file name to build a 3-item row
                paramMatrix.append((file + '=' + line).split('='))
    doPivot(paramMatrix)


def doPivot(data):
    df = pandas.DataFrame(data, columns=['Fruit', 'Shop', 'Price'])
    print(df.pivot(index='Fruit', columns='Shop', values='Price'))

The result I currently get is:

Shop                            paframC paramA paramB    paramC  \
Fruit                                                             
New Text Document.txt1  nordfggdfgmal\n   Y3\n    NaN       NaN   
file.txt1                           NaN    Y\n   30\n  normal\n   

Shop                                          paramD           paramE  \
Fruit                                                                   
New Text Document.txt1  SOMEdgdfg_ITEM_IN_ALL_CAPS\n  5 6 7 4448 9 \n   
file.txt1                    SOME_ITEM_IN_ALL_CAPS\n     5 6 7 8 9 \n   

Shop                                      paramF paramG   pardamB  \
Fruit                                                               
New Text Document.txt1  /dir/7y7456456to/stuff\n    NaN  305456\n   
file.txt1                        /dir/to/stuff\n      y       NaN   

Shop                                     parfamG  
Fruit                                             
New Text Document.txt1  y33333333333333333333333  
file.txt1                                    NaN 

The list of lists is paramMatrix. Before doPivot runs, it looks like this:

[['file.txt1', 'paramA', 'Y\n'], ['file.txt1', 'paramB', '30\n'], ['file.txt1', 'paramC', 'normal\n'], ['file.txt1', 'paramD', 'SOME_ITEM_IN_ALL_CAPS\n'], ['file.txt1', 'paramE', '5 6 7 8 9 \n'], ['file.txt1', 'paramF', '/dir/to/stuff\n'], ['file.txt1', 'paramG', 'y'], ['New Text Document.txt1', 'paramA', 'Y3\n'], ['New Text Document.txt1', 'pardamB', '305456\n'], ['New Text Document.txt1', 'paframC', 'nordfggdfgmal\n'], ['New Text Document.txt1', 'paramD', 'SOMEdgdfg_ITEM_IN_ALL_CAPS\n'], ['New Text Document.txt1', 'paramE', '5 6 7 4448 9 \n'], ['New Text Document.txt1', 'paramF', '/dir/7y7456456to/stuff\n'], ['New Text Document.txt1', 'parfamG', 'y33333333333333333333333']]

Or, formatted for readability:

[['file.txt1', 'paramA', 'Y\n']
, ['file.txt1', 'paramB', '30\n']
, ['file.txt1', 'paramC', 'normal\n']
, ['file.txt1', 'paramD', 'SOME_ITEM_IN_ALL_CAPS\n']
, ['file.txt1', 'paramE', '5 6 7 8 9 \n']
, ['file.txt1', 'paramF', '/dir/to/stuff\n']
, ['file.txt1', 'paramG', 'y']
, ['New Text Document.txt1', 'paramA', 'Y3\n']
, ['New Text Document.txt1', 'pardamB', '305456\n']
, ['New Text Document.txt1', 'paframC', 'nordfggdfgmal\n']
, ['New Text Document.txt1', 'paramD', 'SOMEdgdfg_ITEM_IN_ALL_CAPS\n']
, ['New Text Document.txt1', 'paramE', '5 6 7 4448 9 \n']
, ['New Text Document.txt1', 'paramF', '/dir/7y7456456to/stuff\n']
, ['New Text Document.txt1', 'parfamG', 'y33333333333333333333333']
]

doPivot is a function taken from here:

I have a list of lists:

+-------+--------+-----+
| file1 | parama | 112 |
| file1 | paramb |  54 |
| file1 | paramd | 234 |
| file2 | paramb |  63 |
| file2 | paramd | 334 |
| file2 | paramz |  11 |
| file3 | parama |  34 |
+-------+--------+-----+

I am trying to convert it into this (the header row does not matter to me):

+--------+-------+-------+-------+
|        | File1 | File2 | File3 |
| parama | 112   | -     | 34    |
| paramb | 54    | 63    | -     |
| paramd | 234   | 334   | -     |
| paramz | -     | 11    | -     |
+--------+-------+-------+-------+

What am I doing wrong with the pandas package when pivoting the list of lists?

Note that I have also tried [this approach as well][2]; however, I get a syntax error with it.

1 answer:

Answer 0 (score: 2)

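In the first part, Shop should be the index rather than Fruit if you want the result to look like the one in the second part. If that is the case, your code works fine. See below.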

import pandas as pd

list_ = [['file.txt1', 'paramA', 'Y\n']
, ['file.txt1', 'paramB', '30\n']
, ['file.txt1', 'paramC', 'normal\n']
, ['file.txt1', 'paramD', 'SOME_ITEM_IN_ALL_CAPS\n']
, ['file.txt1', 'paramE', '5 6 7 8 9 \n']
, ['file.txt1', 'paramF', '/dir/to/stuff\n']
, ['file.txt1', 'paramG', 'y']
, ['New Text Document.txt1', 'paramA', 'Y3\n']
, ['New Text Document.txt1', 'pardamB', '305456\n']
, ['New Text Document.txt1', 'paframC', 'nordfggdfgmal\n']
, ['New Text Document.txt1', 'paramD', 'SOMEdgdfg_ITEM_IN_ALL_CAPS\n']
, ['New Text Document.txt1', 'paramE', '5 6 7 4448 9 \n']
, ['New Text Document.txt1', 'paramF', '/dir/7y7456456to/stuff\n']
, ['New Text Document.txt1', 'parfamG', 'y33333333333333333333333']
]

df = pd.DataFrame(list_, columns=['File','Param','Values'])
df_p = df.pivot(index='Param',columns='File',values='Values')
print(df_p)

Result:

File           New Text Document.txt1                file.txt1
Param                                                         
paframC               nordfggdfgmal\n                      NaN
paramA                           Y3\n                      Y\n
paramB                            NaN                     30\n
paramC                            NaN                 normal\n
paramD   SOMEdgdfg_ITEM_IN_ALL_CAPS\n  SOME_ITEM_IN_ALL_CAPS\n
paramE                5 6 7 4448 9 \n             5 6 7 8 9 \n
paramF       /dir/7y7456456to/stuff\n          /dir/to/stuff\n
paramG                            NaN                        y
pardamB                      305456\n                      NaN
parfamG      y33333333333333333333333                      NaN
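The values in this result still end with the `\n` read from each file. Below is a minimal sketch of one way to clean them before pivoting, assuming every line has the form `name=value`. The helper name `parse_line` is hypothetical and not part of the original code.

```python
import pandas as pd


def parse_line(filename, line):
    """Hypothetical helper: turn 'name=value\\n' into [filename, name, value]."""
    # rstrip('\n') drops only the trailing newline; maxsplit=1 keeps any '='
    # that appears inside the value itself.
    name, value = line.rstrip('\n').split('=', 1)
    return [filename, name, value]


rows = [
    parse_line('file.txt1', 'paramA=Y\n'),
    parse_line('file.txt1', 'paramB=30\n'),
    parse_line('New Text Document.txt1', 'paramA=Y3\n'),
]

df_clean = pd.DataFrame(rows, columns=['File', 'Param', 'Values'])
df_clean_p = df_clean.pivot(index='Param', columns='File', values='Values')
print(df_clean_p)
```

A line without an `=` would raise a ValueError in this sketch, so real input may need an extra check.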

The same approach works for the second part.

import pandas as pd

d = {'File': {0: 'file1',
  1: 'file1',
  2: 'file1',
  3: 'file2',
  4: 'file2',
  5: 'file2',
  6: 'file3'},
 'Param': {0: 'parama',
  1: 'paramb',
  2: 'paramd',
  3: 'paramb',
  4: 'paramd',
  5: 'paramz',
  6: 'parama'},
 'Values': {0: 112, 1: 54, 2: 234, 3: 63, 4: 334, 5: 11, 6: 34}}

df = pd.DataFrame.from_dict(d)
df_p = df.pivot(index='Param',columns='File',values='Values')

print(df_p)

The result has the same layout:

File    file1  file2  file3
Param                      
parama    112    NaN     34
paramb     54     63    NaN
paramd    234    334    NaN
paramz    NaN     11    NaN
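The desired table in the question shows `-` for missing cells rather than `NaN`. Here is a minimal sketch, assuming a plain placeholder string is acceptable for display. The name `df_display` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['file1', 'parama', 112], ['file1', 'paramb', 54], ['file1', 'paramd', 234],
     ['file2', 'paramb', 63], ['file2', 'paramd', 334], ['file2', 'paramz', 11],
     ['file3', 'parama', 34]],
    columns=['File', 'Param', 'Values'])

df_p = df.pivot(index='Param', columns='File', values='Values')

# Replace missing cells with '-' for display only. The filled columns now mix
# numbers and strings, so keep df_p itself for any calculations.
df_display = df_p.fillna('-')
print(df_display)
```

Keeping the numeric `df_p` and filling only a display copy means sums or comparisons on the pivoted data still work.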