How to split features and labels

Time: 2019-07-24 01:52:24

Tags: python pandas numpy

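I want to use the first 50 columns as the features X and the last column as the label y. How can I do that? The data is here

So far I have used: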

import pandas as pd
df = pd.read_csv('file.csv', sep=' ', header=None)

Row 1:

6.999299526214599609e+00 -4.579982161521911621e-01 6.291269779205322266e+00 3.196178436279296875e+00 -5.663570880889892578e+00 -1.810324430465698242e+00 -6.706712245941162109e+00 -1.486396908760070801e+00 7.831575274467468262e-01 2.103844642639160156e+00 1.438934803009033203e+00 1.163767457008361816e+00 -4.729847431182861328e+00 2.073661834001541138e-01 -3.499572992324829102e+00 7.331941604614257812e+00 5.259800434112548828e+00 3.068963885307312012e-01 4.826724827289581299e-01 2.915471076965332031e+00 -1.563049554824829102e+00 4.521403312683105469e+00 2.377167463302612305e+00 1.402835369110107422e+00 -6.507210731506347656e+00 1.661594510078430176e+00 3.218852043151855469e+00 2.605128288269042969e+00 -6.348329782485961914e-01 -1.768920421600341797e+00 3.369244933128356934e-01 -9.721876144409179688e+00 -3.150746524333953857e-01 -6.363586187362670898e-01 7.596837520599365234e+00 -2.103782415390014648e+00 2.669518947601318359e+00 2.815987110137939453e+00 3.098936080932617188e+00 -2.445043325424194336e+00 4.101460456848144531e+00 1.029265499114990234e+01 -3.425651788711547852e+00 -7.059376239776611328e+00 2.968243837356567383e+00 1.735906600952148438e+00 -5.084319591522216797e+00 -4.689389228820800781e+00 -5.318581685423851013e-02 7.332663059234619141e+00 0.000000000000000000e+00

Row 2:

-3.312762498855590820e+00 -6.952639102935791016e+00 4.057536602020263672e+00 -7.067280411720275879e-01 1.559423655271530151e-01 -2.063135862350463867e+00 3.473832607269287109e+00 -6.821436405181884766e+00 1.913890987634658813e-01 1.051760554313659668e+00 4.264380037784576416e-01 -1.163577362895011902e-01 -1.162586688995361328e+01 -4.555134773254394531e+00 -2.115072965621948242e+00 2.407418012619018555e+00 4.216342449188232422e+00 7.753645896911621094e+00 1.841859579086303711e+00 1.306602835655212402e+00 6.301051616668701172e+00 5.308498382568359375e+00 2.542440891265869141e+00 -2.183512926101684570e+00 5.020323753356933594e+00 9.936455488204956055e-01 1.112178325653076172e+00 1.701865315437316895e+00 -9.683893322944641113e-01 6.330366134643554688e+00 -3.132382631301879883e+00 -6.258290767669677734e+00 -4.719416141510009766e+00 2.254427433013916016e+00 7.009744644165039062e+00 2.768572807312011719e+00 7.527151107788085938e-01 2.256974935531616211e+00 2.509361505508422852e+00 -8.301359176635742188e+00 -4.890173971652984619e-01 7.536663860082626343e-02 -2.915276050567626953e+00 -3.129587650299072266e+00 -2.917083024978637695e+00 1.627410769462585449e+00 -1.588313817977905273e+00 -5.896830558776855469e+00 3.898775339126586914e+00 -5.005477428436279297e+00 0.000000000000000000e+00

2 answers:

Answer 0 (score: 2)

Try this code:

x, y = df.iloc[:, :-1], df.iloc[:, [-1]]
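To check what this split returns, here is a minimal sketch on a synthetic frame. It assumes the same layout as the question (50 feature columns followed by 1 label column); the random data and the `y_series` name are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real file: 4 rows, 50 feature columns + 1 label column.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(4, 51)))
df[50] = 0.0  # the last column plays the role of the label

x = df.iloc[:, :-1]        # every column except the last -> DataFrame
y = df.iloc[:, [-1]]       # list indexer keeps the result a one-column DataFrame
y_series = df.iloc[:, -1]  # scalar indexer returns a Series instead

print(x.shape, y.shape, y_series.shape)
```

Whether the label should be a one-column DataFrame or a Series depends on what consumes it; `df.iloc[:, -1]` is the shorter route when a 1-D label is wanted.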

Answer 1 (score: 0)

Usually we do this:

# y is still a DataFrame here; before passing it to training, flatten it with ravel, e.g. y.values.ravel()
y = df.iloc[:, [-1]]
x = df.drop(y.columns, axis=1)
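The `ravel` hint in this answer can be sketched as follows. The tiny three-column frame and the `y_flat` name are made up for illustration; the real data has 51 columns.

```python
import pandas as pd

# Toy frame: two feature columns (0 and 1) and one label column (2).
df = pd.DataFrame({0: [1.5, 2.5, 3.5], 1: [4.0, 5.0, 6.0], 2: [0.0, 1.0, 0.0]})

y = df.iloc[:, [-1]]            # one-column DataFrame, shape (3, 1)
x = df.drop(y.columns, axis=1)  # everything except the label column

# Flatten the label to a 1-D NumPy array, shape (3,)
y_flat = y.to_numpy().ravel()

print(x.shape, y.shape, y_flat.shape)
```

Many estimators expect a 1-D label array, which is presumably why the answer recommends flattening before training.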