How to change a DataFrame column from float to integer (Pandas)

Time: 2017-03-09 18:25:58

Tags: python pandas dataframe

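I'm trying to edit an entire column cell by cell, changing each value from a mix of an integer and a string to just the integer part.

The actual column in the data frame: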

0                           11212; xxxxxxxxxx xxxxxxxx   
1                           11212; xxxxxxxxxx xxxxxxxx   
2                           11212; xxxxxxxxxx xxxxxxxx   
3                           11212; xxxxxxxxxx xxxxxxxx     
8                  667788; xxxxxxx xxxxxxxxxxxxx xxxxxx   
9                  55555; xxxxxxx xxxxxxxxxxxxx xxxxxx   
10                 55555; xxxxxxx xxxxxxxxxxxxx xxxxxx   
11                 55555; xxxxxxx xxxxxxxxxxxxx xxxxxx   
12                 33333; xxxxxxx xxxxxxxxxxxxx xxxxxx   
13                 333; xxx xxxxx @ xxx xxx 2 xxxx   
14                 9991; xxxx; xxxxxx xxxxx xxxx @ 2 xxx   
18                       1635; vvvvvvvvvvvv vvvvvv 10   
19                       1635; vvvvvvvvvvvv vvvvvv 10   
20                       1635; vvvvvvvvvvvv vvvvvv 10   
21                       1635; vvvvvvvvvvvv vvvvvv 10     
32                       1712; Cxxxx xxxxxxxx; xxx 0   
33                       1712; Cxxxx xxxxxxxx; xxx 0   
34                       1712; Cxxxx xxxxxxxx; xxx 0   
35                       1712; Cxxxx xxxxxxxx; xxx 0

This is the code I'm running:

    import pandas as pd
    import re

    # import excel file from Trello
    xlsx = pd.ExcelFile("/home/deon/Documents/Work_Stuff/Trello.xls") 
    # create data frame from excel file on sheet 1
    df2 = pd.read_excel(xlsx,'Sheet1')
    df3 = pd.DataFrame(data=df2)

    # delete columns not relevant to us
    df3.drop(df3.columns[[0,5,10,11]],inplace=True,axis=1)
    df3.columns= "Date*", "Due date", "Week*", "Card", "Board", "List", "S", "E 1st"

    df3[:, 6] = df3.iloc[:,6].apply(lambda x: x.split(';')[0]) 
    print df2.head()


    # Also tried
    digits = df3.iloc[:, 4].apply(lambda x: re.findall('\d+', str(x)))
    df3.iloc[:, 4] = digits.str.get(0).astype(int)
    print df3.head()

1 Answer:

Answer 0 (score: 0)

You have the general idea of splitting the string; you just had trouble referencing the data frame. Something more like:

Code:

df['number'] = df.raw_string.apply(lambda x: int(x.split(';')[0]))
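
The referencing problem is probably the `df3[:, 6]` assignment in the question: plain `[]` indexing on a DataFrame works with column labels, so it does not address a column by position. A minimal sketch of the positional version, assuming a small stand-in frame (the column names `Board` and `S`, the values, and the `pos` variable are illustrative, not the asker's real data):

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame; names and values are made up.
df3 = pd.DataFrame({
    "Board": ["a", "b", "c"],
    "S": ["11212; xxxx", "667788; yyyy", "55555; zzzz"],
})

pos = 1  # position of the column to convert (assumed for this sketch)

# Read the column by position with .iloc and keep the integer before the ';'.
numbers = df3.iloc[:, pos].apply(lambda x: int(x.split(";")[0]))

# Assign back through the column label so the whole column is replaced
# and takes on the integer dtype.
df3[df3.columns[pos]] = numbers

print(df3.dtypes)
```

Writing the result back by label rather than through `.iloc` is a deliberate choice here: it replaces the column outright, so the new integer dtype is used instead of the old object dtype.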

Test code:

data = [x.strip() for x in """
             11212; xxxxxxxxxx xxxxxxxx
             11212; xxxxxxxxxx xxxxxxxx
             11212; xxxxxxxxxx xxxxxxxx
             11212; xxxxxxxxxx xxxxxxxx
    667788; xxxxxxx xxxxxxxxxxxxx xxxxxx
    55555; xxxxxxx xxxxxxxxxxxxx xxxxxx
    55555; xxxxxxx xxxxxxxxxxxxx xxxxxx
""".split('\n')[1:-1]]

import pandas as pd
df = pd.DataFrame(data=data, columns=['raw_string'])

df['number'] = df.raw_string.apply(lambda x: int(x.split(';')[0]))

print(df.head())

Result:

                             raw_string  number
0            11212; xxxxxxxxxx xxxxxxxx   11212
1            11212; xxxxxxxxxx xxxxxxxx   11212
2            11212; xxxxxxxxxx xxxxxxxx   11212
3            11212; xxxxxxxxxx xxxxxxxx   11212
4  667788; xxxxxxx xxxxxxxxxxxxx xxxxxx  667788
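
If the real column contains empty cells, `int(x.split(';')[0])` would fail on them, and a numeric column with missing values normally ends up as float, which may be where the "float" in the title comes from. The following is a sketch of one possible alternative, not part of the original answer: it uses the vectorized string methods and pandas' nullable `Int64` dtype. The `None` row is an assumption added only to show the missing-value case.

```python
import pandas as pd

# Illustrative data; the None entry is an assumed missing cell.
df = pd.DataFrame({
    "raw_string": ["11212; xxxx", "667788; yyyy", None, "1635; vvvv 10"],
})

# Capture the leading run of digits; rows without a match become NaN.
leading = df["raw_string"].str.extract(r"^\s*(\d+)", expand=False)

# to_numeric yields float64 because of the NaN; casting to the nullable
# "Int64" dtype keeps integer values and marks the gap as <NA>.
df["number"] = pd.to_numeric(leading, errors="coerce").astype("Int64")

print(df)
```

When the column is known to have no missing values, `.astype(int)` in place of `.astype("Int64")` should give a plain integer column.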