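I am trying to write a user-defined function that inspects the data type of each input column and changes it.
My input data types will be int64, float64, object, and datetime64[ns].
If a column is datetime64[ns], I replace blank dates with a custom date. The output data type stays datetime64[ns].
If a column is int64, float64, or object, I replace blanks with the string "FILLINGTHENAS" and convert all of these data types to object.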

    def Change_Data_Type_DataFrame (AnyPandasDataFrame):
        cr_date = datetime(1800,1,1,1,1,1)
        for i in range(1, AnyPandasDataFrame.shape[1]):
            Required_Column_Name = (AnyPandasDataFrame.columns[i])
            Required_Data_Type = AnyPandasDataFrame[Required_Column_Name].dtype
            if Required_Data_Type == 'datetime64[ns]':
                DateChecker = True
            else:
                DateChecker = contains_word(Required_Column_Name, "Date","of Death","Day of Work")
            if DateChecker == False :
                if Required_Data_Type == 'int64':
                    print("Yes")
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].fillna("FILLINGTHENAS")
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype(str)
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype(str).replace('\.0', '', regex=True)
                if Required_Data_Type == object:
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].fillna("FILLINGTHENAS")
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype(str)
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype(str).replace('\.0', '', regex=True)
                if Required_Data_Type == 'float64':
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].fillna("FILLINGTHENAS")
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype(str)
                    AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype(str).replace('\.0', '', regex=True)
            else:
                AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].fillna(cr_date)
                AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype('datetime64[ns]')
        return (AnyPandasDataFrame)

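I have a huge 100-column DataFrame, and my function fails: I still see int64 columns in the output DataFrame.
The print("Yes") statement does not fire, yet my df definitely has int64 dtypes.
Where am I going wrong, and could my code be written better?
Please help me.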
Answer 0 (score: 0)
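I made the following changes to the code.
The range started at 1, which skipped the first column; I made it start at 0.
I removed the multiple if statements and merged them into a single if branch.
I re-applied the data type after the replace, just to make sure pandas had not changed it back.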
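The first change is the one most likely to explain a leftover int64 column: a loop over `range(1, n)` never visits position 0, so the first column comes back untouched. Here is a minimal illustration on a made-up two-column frame (the column names `id` and `score` are hypothetical, not from the original data):

```python
import pandas as pd

# Hypothetical frame: an int64 column first, then a float64 column with a gap.
df = pd.DataFrame({"id": [1, 2, 3], "score": [1.5, None, 3.0]})

# Columns visited when the loop starts at 1 versus at 0.
visited_from_1 = [df.columns[i] for i in range(1, df.shape[1])]
visited_from_0 = [df.columns[i] for i in range(0, df.shape[1])]

print(visited_from_1)  # ['score']
print(visited_from_0)  # ['id', 'score']
```

Iterating over `df.columns` directly avoids the off-by-one risk altogether.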

    from datetime import datetime

    def Change_Data_Type_DataFrame(AnyPandasDataFrame):
        # Placeholder date used to fill missing values in date columns
        cr_date = datetime(1800, 1, 1, 1, 1, 1)
        # Start at 0 so the first column is processed too
        for i in range(0, AnyPandasDataFrame.shape[1]):
            Required_Column_Name = AnyPandasDataFrame.columns[i]
            print(Required_Column_Name)
            Required_Data_Type = AnyPandasDataFrame[Required_Column_Name].dtype
            if Required_Data_Type == 'datetime64[ns]':
                DateChecker = True
            else:
                # contains_word is a helper defined elsewhere (not shown here)
                DateChecker = contains_word(Required_Column_Name, "Date", "of Death", "Day of Work")
            if DateChecker == False:
                # Non-date column: fill gaps, convert to text, drop ".0" left by floats
                AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].fillna("FILLINGTHENAS")
                AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype(str).replace(r'\.0', '', regex=True)
                AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype(str)
            else:
                # Date column: fill gaps with the placeholder and enforce the dtype
                AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].fillna(cr_date)
                AnyPandasDataFrame[Required_Column_Name] = AnyPandasDataFrame[Required_Column_Name].astype('datetime64[ns]')
        return AnyPandasDataFrame
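
As a possible tidier variant of the function above, the sketch below iterates over the column names directly and returns a copy instead of modifying the input. It is an illustration under stated assumptions, not the original poster's code: the name `change_data_types`, the constants, and the simple substring check standing in for `contains_word` are all hypothetical. It also anchors the regex as `\.0$`, since an unanchored `\.0` would turn a value such as `10.05` into `105`.

```python
from datetime import datetime

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

FILL_TEXT = "FILLINGTHENAS"                # filler for non-date columns
FILL_DATE = datetime(1800, 1, 1, 1, 1, 1)  # filler for date columns


def change_data_types(df, date_name_hints=("Date", "of Death", "Day of Work")):
    """Return a copy with date columns filled and all other columns as text."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        # Assumption: a column is a date if its dtype says so or if its
        # name contains one of the hints (a stand-in for contains_word).
        looks_like_date = is_datetime64_any_dtype(col) or any(
            hint in str(name) for hint in date_name_hints
        )
        if looks_like_date:
            out[name] = pd.to_datetime(col).fillna(FILL_DATE)
        else:
            missing = col.isna()
            # Convert to text, then strip only a trailing ".0" (1.0 -> "1").
            text = col.astype(str).str.replace(r"\.0$", "", regex=True)
            # Put the filler wherever the original value was missing.
            out[name] = text.where(~missing, FILL_TEXT).astype(object)
    return out
```

Recording the missing-value mask before the string conversion keeps the fill independent of how a missing value happens to be rendered as text.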