How to change a DataFrame

Time: 2016-11-30 04:48:32

Tags: python pandas dataframe

I want to convert all values in a DataFrame to lowercase, and I am using the following code:

import pandas as pd
import numpy as np

path1= "C:\\Users\\IBM_ADMIN\\Desktop\\ml-1m\\SELECT_FROM_HRAP2P3_SAAS_ZTXDMPARAM_201611291745.csv"
frame1 = pd.read_csv(path1,encoding='utf8',dtype = {'COUNTRY_CODE': str})
for x in frame1:
    frame1[x] = frame1[x].str.lower()
frame1

But this line raises an error:

 frame1[x] = frame1[x].str.lower()

Error:

AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

I don't know what causes it.

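3 answers:

Answer 0: (score: 1)

You can use the applymap function (pandas 2.1 and later call it DataFrame.map). Note that the lambda below assumes every value is a string.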

import pandas as pd

df1 = pd.DataFrame({'MovieName': ['LIGHTS OUT', 'Legend'], 'Actors':['MARIA Bello', 'Tom Hard']})
df2=df1.applymap(lambda x: x.lower())
print(df1, "\n")
print(df2)

Output:

        Actors   MovieName
0  MARIA Bello  LIGHTS OUT
1     Tom Hard      Legend 

        Actors   MovieName
0  maria bello  lights out
1     tom hard      legend
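
The lambda above calls `.lower()` on every cell, so it would fail on a frame that also holds numbers, as the one in the question apparently does. A minimal sketch of a guarded variant follows; the `Year` column and the helper name `lower_if_str` are made up for illustration, and `Series.map` is used per column instead of `applymap`.

```python
import pandas as pd


def lower_if_str(value):
    """Lowercase strings; return any other value unchanged."""
    return value.lower() if isinstance(value, str) else value


# Hypothetical frame mixing a text column with a numeric one
df = pd.DataFrame({"MovieName": ["LIGHTS OUT", "Legend"], "Year": [2016, 2015]})

# Apply the guarded function to every cell, one column at a time
lowered = df.apply(lambda col: col.map(lower_if_str))
print(lowered)
```

The `isinstance` check is what lets the numeric column pass through untouched.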

Answer 1: (score: 0)

Try using str.lower on a Series object.

Suppose your DataFrame looks like this:

df = pd.DataFrame(dict(name=["HERE", "We", "are"]))

   name
0  HERE
1    We
2   are

Then lowercase all the values and print the result:

df['name'] = df['name'].str.lower()

   name
0  here
1    we
2   are
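
The error in the question most likely comes from looping over every column, including non-string ones. The sketch below reproduces that on a small made-up frame (the `count` column is an assumption, not part of the original data): the `.str` accessor raises `AttributeError` on the integer column but works on the text column.

```python
import pandas as pd

# Hypothetical frame with one text column and one integer column
df = pd.DataFrame({"name": ["HERE", "We", "are"], "count": [1, 2, 3]})

try:
    # .str is only available on string data, so this raises AttributeError
    df["count"].str.lower()
    failed = False
except AttributeError as exc:
    failed = True
    print("count column:", exc)

# The text column can be lowercased without problems
df["name"] = df["name"].str.lower()
print(df)
```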

Answer 2: (score: 0)

You can try this:

import numpy as np
import pandas as pd

df2 = pd.DataFrame({'A': 1.,
                    'B': pd.Timestamp('20130102'),
                    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
                    'D': np.array([3] * 4, dtype='int32'),
                    'E': pd.Series(["TEST", "Train", "test", "train"]),
                    'F': 'foo'})

# Strings are stored with the object dtype in a DataFrame
mylist = list(df2.select_dtypes(include=['object']).columns)

for i in mylist:
    df2[i] = df2[i].str.lower()
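
The same idea can be wrapped in a small helper. This is only a sketch: the function name `lowercase_text_columns` is made up, and it picks text-like columns with `is_object_dtype` / `is_string_dtype` rather than `select_dtypes`, on the assumption that string columns may not always use the object dtype depending on the pandas version and settings. It returns a copy instead of modifying the input.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def lowercase_text_columns(df):
    """Return a copy of df with string values in text-like columns lowercased."""
    result = df.copy()
    for name in result.columns:
        column = result[name]
        if is_object_dtype(column) or is_string_dtype(column):
            # Guard each cell so stray non-string values pass through unchanged
            result[name] = column.map(
                lambda v: v.lower() if isinstance(v, str) else v
            )
    return result


# Hypothetical sample frame, loosely modeled on the one above
sample = pd.DataFrame({
    "A": 1.0,
    "D": np.array([3] * 4, dtype="int32"),
    "E": ["TEST", "Train", "test", "train"],
    "F": "FOO",
})
lowered = lowercase_text_columns(sample)
print(lowered)
```

Numeric columns are skipped entirely, so their dtypes are left alone.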