Pivoting a DataFrame with Pandas

Date: 2016-07-06 07:05:45

Tags: python numpy pandas

I fetched the table below from a web page using pandas.read_html:
      Column0    Column1      Column2      Column3
0     Entry_1    0.685        Record_1     0.69-S$ 0.685
1     Entry_2    0.036        Record_2     0.685
2     Entry_3    05/Jul/2016  Record_3     0.72-S$ 0.4
3     Entry_4    0.338        Record_4     178.8 mm
4     Entry_5    0.41         Record_5     0.06
5     Entry_6    122.48       Record_6     17.29%
6     Entry_7    0.5          Record_7     0.58 as of 05/Jul/2016

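How can I pivot/transpose this data so that the entries of Column0 become the headers and the entries of Column1 become the values? The same should apply to Column2 and Column3.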

3 answers:

Answer 0: (score: 2)

This is probably the simplest way to solve this. At least, it is the simplest approach I could come up with.

Private Sub Worksheet_Change(ByVal Target As Range)
    If Target.Column = 2 Then
        Application.EnableEvents = False
        Dim Ret_type As Integer
        Dim strMsg As String
        Dim strTitle As String
        strMsg = "Do you approve?" & vbCrLf & "Warning: This action will lock the current row."
        strTitle = "Approval"
        Ret_type = MsgBox(strMsg, vbYesNo + vbQuestion, strTitle)
        Select Case Ret_type
            Case 7
                MsgBox "Your input will be deleted."
                Target.Clear
                Application.EnableEvents = True
                Exit Sub
            Case 6
                ActiveSheet.Unprotect Password:="password"
                Target.Rows.Locked = False
                Cells(Target.Row, 3).Value = Date + Time
                'Application.EnableEvents = True
                Target.EntireRow.Locked = True
                ActiveSheet.Protect Password:="password"
                Application.EnableEvents = True
        End Select
    End If
End Sub

Answer 1: (score: 1)

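You can use lreshape to build the new columns Col and 0, then call set_index on Col, transpose with T, and finally remove the columns name with rename_axis (new in pandas 0.18.0):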

print (pd.lreshape(df, {'Col':['Column0', 'Column2'], 
                        0:['Column1', 'Column3']})
         .set_index('Col')
         .T
         .rename_axis(None, axis=1))

  Entry_1 Entry_2      Entry_3 Entry_4 Entry_5 Entry_6 Entry_7       Record_1  \
0   0.685   0.036  05/Jul/2016   0.338    0.41  122.48     0.5  0.69-S$ 0.685   

  Record_2     Record_3  Record_4 Record_5 Record_6                Record_7  
0    0.685  0.72-S$ 0.4  178.8 mm     0.06   17.29%  0.58 as of 05/Jul/2016  
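
To try this without the original web page, here is a minimal self-contained sketch. It rebuilds the question's table by hand (an assumption, since the real frame came from pandas.read_html) and produces the same kind of one-row result with plain pd.concat instead of lreshape. The helper name to_single_row is made up for illustration and is not part of pandas.

```python
import pandas as pd

# Hand-built copy of the table in the question (the original came from
# pandas.read_html, so every value is kept as a string here).
df = pd.DataFrame({
    'Column0': ['Entry_%d' % i for i in range(1, 8)],
    'Column1': ['0.685', '0.036', '05/Jul/2016', '0.338', '0.41',
                '122.48', '0.5'],
    'Column2': ['Record_%d' % i for i in range(1, 8)],
    'Column3': ['0.69-S$ 0.685', '0.685', '0.72-S$ 0.4', '178.8 mm',
                '0.06', '17.29%', '0.58 as of 05/Jul/2016'],
})


def to_single_row(frame, pairs):
    """Turn (label column, value column) pairs into one wide row.

    Hypothetical helper, written only for this illustration.
    """
    # Each pair becomes a Series whose index holds the future headers.
    parts = [frame.set_index(label)[value] for label, value in pairs]
    row = pd.concat(parts)
    row.index.name = None
    # One column -> transpose -> one row with the labels as headers.
    return row.to_frame().T.reset_index(drop=True)


wide = to_single_row(df, [('Column0', 'Column1'), ('Column2', 'Column3')])
print(wide)
```

The result has a single row and 14 columns (Entry_1 to Entry_7 followed by Record_1 to Record_7). All values stay strings, so any numeric conversion would be a separate step.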

Answer 2: (score: 0)

I suggest using the DataFrame.pivot method, as in the following example:

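import numpy as np
import pandas as pd

def unpivot(frame):
    # Stack a wide frame into long (date, variable, value) form.
    N, K = frame.shape
    data = {'value': frame.values.ravel('F'),
            'variable': np.asarray(frame.columns).repeat(N),
            'date': np.tile(np.asarray(frame.index), K)}
    return pd.DataFrame(data, columns=['date', 'variable', 'value'])

# pandas.util.testing (and its makeTimeDataFrame helper) was deprecated in
# pandas 1.0, so an equivalent 3x4 frame of random values is built by hand.
wide = pd.DataFrame(np.random.randn(3, 4), columns=list('ABCD'),
                    index=pd.date_range('2000-01-03', periods=3, freq='B'))
df = unpivot(wide)
print(df)

# Pivot the long frame back: one row per date, one column per variable.
print(df.pivot(index='date', columns='variable', values='value'))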
  

Output of print(df):

         date variable     value
0  2000-01-03        A  0.101495
1  2000-01-04        A -0.554863
2  2000-01-05        A -0.345271
3  2000-01-03        B -1.104909
4  2000-01-04        B -0.723819
5  2000-01-05        B  0.088401
6  2000-01-03        C  1.495768
7  2000-01-04        C -0.756166
8  2000-01-05        C -0.266072
9  2000-01-03        D  0.451050
10 2000-01-04        D -1.457763
11 2000-01-05        D  0.945552

     

Output of print(df.pivot(index='date', columns='variable', values='value')):

variable           A         B         C         D
date
2000-01-03  2.932572 -1.959961  0.385705 -1.629831
2000-01-04 -0.317548  0.031041  2.129526 -1.717546
2000-01-05  0.108186  1.182527  0.997716  0.453127
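
Applied to the table in the question, the same idea might look like the sketch below. DataFrame.pivot needs a column to use as the row index, and the question's table has none, so a constant key column is added first. The name row is an arbitrary choice for illustration, and the frame is a hand-built excerpt of the first three rows rather than the real read_html result.

```python
import pandas as pd

# Hand-built excerpt of the question's table (first three rows only).
df = pd.DataFrame({
    'Column0': ['Entry_1', 'Entry_2', 'Entry_3'],
    'Column1': ['0.685', '0.036', '05/Jul/2016'],
    'Column2': ['Record_1', 'Record_2', 'Record_3'],
    'Column3': ['0.69-S$ 0.685', '0.685', '0.72-S$ 0.4'],
})

# pivot needs an index column, so add a constant key; 'row' is a made-up name.
keyed = df.assign(row=0)
left = keyed.pivot(index='row', columns='Column0', values='Column1')
right = keyed.pivot(index='row', columns='Column2', values='Column3')

# Put both pivoted pieces side by side and drop the leftover axis names.
wide = pd.concat([left, right], axis=1)
wide.index.name = None
wide.columns.name = None
print(wide)
```

Note that pivot raises an error if a label appears more than once in the column used for headers, so this only works while the entries in Column0 and Column2 are unique.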