What am I doing wrong with the pandas package when pivoting a list of lists?
paramMatrix is a list of lists:
import glob
import os

import pandas


def populateMatrix():
    mydir = "C:\\Python27"
    os.chdir(mydir)
    paramMatrix = []
    for file in glob.glob("*.txt1"):
        print(file)
        with open(mydir + '\\' + file) as f:
            for line in f:
                # Each line looks like "key=value"; prepend the file name as a third field.
                paramMatrix.append((file + '=' + line).split('='))
    doPivot(paramMatrix)


def doPivot(data):
    df = pandas.DataFrame(data, columns=['Fruit', 'Shop', 'Price'])
    print(df.pivot(index='Fruit', columns='Shop', values='Price'))
The result I currently get is:
Shop                            paframC  paramA  paramB    paramC  \
Fruit
New Text Document.txt1  nordfggdfgmal\n    Y3\n     NaN       NaN
file.txt1                           NaN     Y\n    30\n  normal\n

Shop                                          paramD           paramE  \
Fruit
New Text Document.txt1  SOMEdgdfg_ITEM_IN_ALL_CAPS\n  5 6 7 4448 9 \n
file.txt1                    SOME_ITEM_IN_ALL_CAPS\n     5 6 7 8 9 \n

Shop                                      paramF  paramG   pardamB  \
Fruit
New Text Document.txt1  /dir/7y7456456to/stuff\n     NaN  305456\n
file.txt1                        /dir/to/stuff\n       y       NaN

Shop                                     parfamG
Fruit
New Text Document.txt1  y33333333333333333333333
file.txt1                                    NaN
The list of lists is paramMatrix. Before the doPivot function runs, it looks like this:
[['file.txt1', 'paramA', 'Y\n'], ['file.txt1', 'paramB', '30\n'], ['file.txt1', 'paramC', 'normal\n'], ['file.txt1', 'paramD', 'SOME_ITEM_IN_ALL_CAPS\n'], ['file.txt1', 'paramE', '5 6 7 8 9 \n'], ['file.txt1', 'paramF', '/dir/to/stuff\n'], ['file.txt1', 'paramG', 'y'], ['New Text Document.txt1', 'paramA', 'Y3\n'], ['New Text Document.txt1', 'pardamB', '305456\n'], ['New Text Document.txt1', 'paframC', 'nordfggdfgmal\n'], ['New Text Document.txt1', 'paramD', 'SOMEdgdfg_ITEM_IN_ALL_CAPS\n'], ['New Text Document.txt1', 'paramE', '5 6 7 4448 9 \n'], ['New Text Document.txt1', 'paramF', '/dir/7y7456456to/stuff\n'], ['New Text Document.txt1', 'parfamG', 'y33333333333333333333333']]
Or, formatted for easier reading:
[['file.txt1', 'paramA', 'Y\n']
, ['file.txt1', 'paramB', '30\n']
, ['file.txt1', 'paramC', 'normal\n']
, ['file.txt1', 'paramD', 'SOME_ITEM_IN_ALL_CAPS\n']
, ['file.txt1', 'paramE', '5 6 7 8 9 \n']
, ['file.txt1', 'paramF', '/dir/to/stuff\n']
, ['file.txt1', 'paramG', 'y']
, ['New Text Document.txt1', 'paramA', 'Y3\n']
, ['New Text Document.txt1', 'pardamB', '305456\n']
, ['New Text Document.txt1', 'paframC', 'nordfggdfgmal\n']
, ['New Text Document.txt1', 'paramD', 'SOMEdgdfg_ITEM_IN_ALL_CAPS\n']
, ['New Text Document.txt1', 'paramE', '5 6 7 4448 9 \n']
, ['New Text Document.txt1', 'paramF', '/dir/7y7456456to/stuff\n']
, ['New Text Document.txt1', 'parfamG', 'y33333333333333333333333']
]
doPivot is taken from here:
I have a list of lists:
+-------+--------+-----+
| file1 | parama | 112 |
| file1 | paramb | 54 |
| file1 | paramd | 234 |
| file2 | paramb | 63 |
| file2 | paramd | 334 |
| file2 | paramz | 11 |
| file3 | parama | 34 |
+-------+--------+-----+
I am trying to convert it into this (the header row does not matter to me):
+--------+-------+-------+-------+
| | File1 | File2 | File3 |
| parama | 112 | - | 34 |
| paramb | 54 | 63 | - |
| paramd | 234 | 334 | - |
| paramz | - | 11 | - |
+--------+-------+-------+-------+
What am I doing wrong with the pandas package when pivoting a list of lists?
Note that I have also tried [this approach as well][2]; however, I get a syntax error there:
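Answer 0 (score: 2)
In the first part, Shop should be the index rather than Fruit (that is, index='Shop' and columns='Fruit') if you want a result like the one shown in the second part. With that change, your code works fine. See below.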
import pandas as pd
list_ = [['file.txt1', 'paramA', 'Y\n']
, ['file.txt1', 'paramB', '30\n']
, ['file.txt1', 'paramC', 'normal\n']
, ['file.txt1', 'paramD', 'SOME_ITEM_IN_ALL_CAPS\n']
, ['file.txt1', 'paramE', '5 6 7 8 9 \n']
, ['file.txt1', 'paramF', '/dir/to/stuff\n']
, ['file.txt1', 'paramG', 'y']
, ['New Text Document.txt1', 'paramA', 'Y3\n']
, ['New Text Document.txt1', 'pardamB', '305456\n']
, ['New Text Document.txt1', 'paframC', 'nordfggdfgmal\n']
, ['New Text Document.txt1', 'paramD', 'SOMEdgdfg_ITEM_IN_ALL_CAPS\n']
, ['New Text Document.txt1', 'paramE', '5 6 7 4448 9 \n']
, ['New Text Document.txt1', 'paramF', '/dir/7y7456456to/stuff\n']
, ['New Text Document.txt1', 'parfamG', 'y33333333333333333333333']
]
df = pd.DataFrame(list_, columns=['File','Param','Values'])
df_p = df.pivot(index='Param',columns='File',values='Values')
print(df_p)
Result:
File           New Text Document.txt1                file.txt1
Param
paframC               nordfggdfgmal\n                      NaN
paramA                           Y3\n                      Y\n
paramB                            NaN                     30\n
paramC                            NaN                 normal\n
paramD   SOMEdgdfg_ITEM_IN_ALL_CAPS\n  SOME_ITEM_IN_ALL_CAPS\n
paramE                5 6 7 4448 9 \n             5 6 7 8 9 \n
paramF       /dir/7y7456456to/stuff\n          /dir/to/stuff\n
paramG                            NaN                        y
pardamB                      305456\n                      NaN
parfamG      y33333333333333333333333                      NaN
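The values above still carry the trailing \n from the files, and the question's split('=') would break up any value that itself contains '='. A minimal sketch of one way to clean the rows before pivoting, assuming each line has the form key=value; rows_from_lines is a hypothetical helper, not part of the original code, and the in-memory lists stand in for the file contents:

```python
import pandas as pd


def rows_from_lines(filename, lines):
    """Turn 'key=value' lines into [filename, key, value] rows (hypothetical helper)."""
    rows = []
    for line in lines:
        line = line.rstrip("\n")  # drop the trailing newline kept by file iteration
        if "=" not in line:
            continue  # skip blank or malformed lines
        # Split on the first '=' only, so values containing '=' stay intact.
        key, value = line.split("=", 1)
        rows.append([filename, key, value])
    return rows


# In-memory stand-ins for the two files (shortened from the question's data).
rows = rows_from_lines("file.txt1", ["paramA=Y\n", "paramB=30\n", "paramG=y"])
rows += rows_from_lines("New Text Document.txt1", ["paramA=Y3\n", "pardamB=305456\n"])

df = pd.DataFrame(rows, columns=["File", "Param", "Values"])
df_p = df.pivot(index="Param", columns="File", values="Values")
print(df_p)
```

In the real script, the same helper could be called with the open file object in place of the list, since iterating a file yields its lines.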
The same approach works for the second part.
import pandas as pd
d = {'File': {0: 'file1',
1: 'file1',
2: 'file1',
3: 'file2',
4: 'file2',
5: 'file2',
6: 'file3'},
'Param': {0: 'parama',
1: 'paramb',
2: 'paramd',
3: 'paramb',
4: 'paramd',
5: 'paramz',
6: 'parama'},
'Values': {0: 112, 1: 54, 2: 234, 3: 63, 4: 334, 5: 11, 6: 34}}
df = pd.DataFrame.from_dict(d)
df_p = df.pivot(index='Param',columns='File',values='Values')
print(df_p)
The same result:
File    file1  file2  file3
Param
parama    112    NaN     34
paramb     54     63    NaN
paramd    234    334    NaN
paramz    NaN     11    NaN
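The desired table in the question shows "-" for missing combinations rather than NaN. A small sketch of one way to get that layout, assuming it is acceptable to treat the values as strings (the conversion is my assumption, made so that whole numbers are not displayed as floats once the pivot introduces missing cells):

```python
import pandas as pd

rows = [
    ["file1", "parama", 112],
    ["file1", "paramb", 54],
    ["file1", "paramd", 234],
    ["file2", "paramb", 63],
    ["file2", "paramd", 334],
    ["file2", "paramz", 11],
    ["file3", "parama", 34],
]
df = pd.DataFrame(rows, columns=["File", "Param", "Values"])

# Convert to strings before pivoting so 112 stays "112" instead of becoming 112.0.
df["Values"] = df["Values"].astype(str)

# Pivot as in the answer, then replace the missing cells with "-".
df_p = df.pivot(index="Param", columns="File", values="Values").fillna("-")
print(df_p)
```

If the numbers are needed for arithmetic later, it is probably better to keep the NaN values and only substitute "-" when printing.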