How to print pandas output horizontally (one-hot encoding)

Time: 2019-04-14 21:13:46

Tags: python pandas one-hot-encoding

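I want to change the output format of a pandas one-hot encoding script. It currently prints three tables stacked vertically, each with its own index, and I want it to print them side by side with a single shared index. The code and output are below. Also, if possible, I would like some space between the columns to separate them.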

Code:

import pandas as pd

# Load the data and index it by date.
df = pd.read_csv('Filename')
df.columns = ['Date', 'b1', 'b2', 'b3']
df = df.set_index('Date')

# Reverse the row order.
reversed_df = df.iloc[::-1]

# One-hot encode the first five rows of each column.
BallOne = pd.get_dummies(reversed_df.b1[:5])
BallTwo = pd.get_dummies(reversed_df.b2[:5])
BallThree = pd.get_dummies(reversed_df.b3[:5])
print(BallOne, "\n")
print(BallTwo, "\n")
print(BallThree, "\n")

Output:
            2  5  6  8
Date                  
1996-12-16  0  0  1  0
1996-12-17  0  0  0  1
1996-12-18  0  1  0  0
1996-12-19  1  0  0  0
1996-12-20  0  0  1  0 

            3  5  8  9
Date                  
1996-12-16  0  1  0  0
1996-12-17  0  0  0  1
1996-12-18  0  1  0  0
1996-12-19  1  0  0  0
1996-12-20  0  0  1  0 

            1  5  7  9
Date                  
1996-12-16  0  0  0  1
1996-12-17  1  0  0  0
1996-12-18  0  0  1  0
1996-12-19  0  1  0  0
1996-12-20  0  0  0  1

I want to change the output to this:

            2  5  6  8        3  5  8  9        1  5  7  9 
Date                  
1996-12-16  0  0  1  0        0  1  0  0        0  0  0  1 
1996-12-17  0  0  0  1        0  0  0  1        1  0  0  0 
1996-12-18  0  1  0  0        0  1  0  0        0  0  1  0 
1996-12-19  1  0  0  0        1  0  0  0        0  1  0  0 
1996-12-20  0  0  1  0        0  0  1  0        0  0  0  1

1 answer:

Answer 0: (score: 0)

You can use pandas.concat() here:

import pandas as pd

df_1 = pd.DataFrame({1: [0, 1, 0, 1, 0], 7: [0, 1, 0, 0, 0]}, index=pd.date_range('2019-01-01', '2019-01-05'))

df_2 = pd.DataFrame({2: [0, 1, 1, 1, 0], 7: [0, 1, 1, 1, 1]}, index=pd.date_range('2019-01-01', '2019-01-05'))

print(pd.concat([df_1, df_2], axis=1))

This gives:

            1  7  2  7
2019-01-01  0  0  0  0
2019-01-02  1  1  1  1
2019-01-03  0  0  1  1
2019-01-04  1  0  1  1
2019-01-05  0  0  0  1
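
Note that the label 7 now appears twice in the result. As a small illustration of why that can be awkward (the names combined and sevens are only for this sketch), selecting by that label returns both columns rather than one:

```python
import pandas as pd

idx = pd.date_range('2019-01-01', '2019-01-05')
df_1 = pd.DataFrame({1: [0, 1, 0, 1, 0], 7: [0, 1, 0, 0, 0]}, index=idx)
df_2 = pd.DataFrame({2: [0, 1, 1, 1, 0], 7: [0, 1, 1, 1, 1]}, index=idx)

combined = pd.concat([df_1, df_2], axis=1)

# The label 7 occurs twice, so selecting it returns a two-column DataFrame
# instead of a single Series.
sevens = combined[7]
print(type(sevens).__name__)
print(sevens.shape)

# A quick way to check for duplicated column labels.
print(combined.columns.is_unique)
```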

With the data you provided, some column labels are duplicated. One way to handle this is to use keys:

print(pd.concat([df_1, df_2], keys=['df_1', 'df_2'], axis=1))

This gives:

           df_1    df_2   
              1  7    2  7
2019-01-01    0  0    0  0
2019-01-02    1  1    1  1
2019-01-03    0  0    1  1
2019-01-04    1  0    1  1
2019-01-05    0  0    0  1
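
Applied to the data in the question, the same idea might look like the sketch below. Some assumptions: the b1, b2 and b3 values are read back from the one-hot output shown in the question rather than loaded from the CSV file, the reversing and slicing steps are left out, and the names dummies and wide are made up for this example.

```python
import pandas as pd

# Sample values reconstructed from the question's output (an assumption;
# the real script reads them from a CSV file).
dates = ['1996-12-16', '1996-12-17', '1996-12-18', '1996-12-19', '1996-12-20']
df = pd.DataFrame(
    {'b1': [6, 8, 5, 2, 6], 'b2': [5, 9, 5, 3, 8], 'b3': [9, 1, 7, 5, 9]},
    index=pd.Index(dates, name='Date'),
)

# One-hot encode each column separately. dtype=int keeps the 0/1 display,
# since newer pandas versions return booleans by default.
dummies = {name: pd.get_dummies(df[name], dtype=int) for name in df.columns}

# Passing a dict to concat uses its keys as the outer column level,
# so the three blocks share one Date index and stay distinguishable.
wide = pd.concat(dummies, axis=1)
print(wide)
```

Each block can still be pulled out on its own, for example with wide['b2'], because the outer column level records which original column the dummies came from.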