import pandas as pd
df = pd.DataFrame(data={"CouponCode": ["1", "2", "3", "Winter14", "5"]})
I want to take the data frame above and produce the one below. Every numeric value in the CouponCode column should be converted to a float or an int, and any value that is not a number should be copied into a new column named CouponName. (I put in "nan" as strings below, but of course I want them to be real nulls.)
df_new = pd.DataFrame(data = {"CouponCode": [1,2,3,"Winter14",5], "CouponName": ["nan", "nan", "nan", "Winter14", "nan"]} )
Answer 0 (score: 3)
Plan:
- Apply pd.to_numeric to every individual element of 'CouponCode'. This is slow! Why do I do it? Because it's the only way I know of to ensure that int gets mapped to int, float to float, and str to str. The dtype of the resultant column will be object. I could have called pd.to_numeric on the entire column with errors='coerce'; the problem is that the resulting nan would convert int to float.
- Use .str.isnumeric to determine which values are strings, and port those over to the new column.

df.loc[~df.CouponCode.astype(str).str.isnumeric(), 'CouponName'] = df.CouponCode
df.loc[:, 'CouponCode'] = df.CouponCode.apply(pd.to_numeric, errors='ignore')
df
CouponCode CouponName
0 1 NaN
1 2 NaN
2 3 NaN
3 Winter14 Winter14
4 5 NaN
df.to_dict('list')
{'CouponCode': [1, 2, 3, 'Winter14', 5],
'CouponName': [nan, nan, nan, 'Winter14', nan]}
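For comparison, here is a minimal sketch of the trade-off described in this answer. It first shows that a whole-column pd.to_numeric with errors='coerce' yields float64, then converts element by element with an explicit try/except. The helper to_number_or_keep is a hypothetical name, and it uses plain int()/float() in place of element-wise pd.to_numeric(..., errors='ignore'), which newer pandas releases deprecate. Treat it as one possible approach, not the answer's exact method.

```python
import pandas as pd

df = pd.DataFrame({"CouponCode": ["1", "2", "3", "Winter14", "5"]})

# Whole-column conversion: "Winter14" becomes NaN, which forces float64.
coerced = pd.to_numeric(df["CouponCode"], errors="coerce")
print(coerced.dtype)  # float64


def to_number_or_keep(value):
    """Hypothetical helper: int if possible, else float, else the value unchanged."""
    for convert in (int, float):
        try:
            return convert(value)
        except (ValueError, TypeError):
            continue
    return value


result = df.copy()
# Rows that failed the numeric conversion are the coupon names.
is_name = coerced.isna()
# Series.where keeps values where the mask is True and puts NaN elsewhere.
result["CouponName"] = result["CouponCode"].where(is_name)
# Element-wise conversion keeps ints as ints; the column ends up as object dtype.
result["CouponCode"] = result["CouponCode"].map(to_number_or_keep)

print(result.to_dict("list"))
```

The mask comes from the coerced column rather than from str.isnumeric, so strings such as "2.5" would also count as numbers here.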
Answer 1 (score: 2)
Use the .str accessor with isnumeric():
import numpy as np
df['CouponName'] = np.where(~df.CouponCode.str.isnumeric(), df.CouponCode, np.nan)
Output:
CouponCode CouponName
0 1 NaN
1 2 NaN
2 3 NaN
3 Winter14 Winter14
4 5 NaN
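One caveat worth illustrating: str.isnumeric() only accepts strings made up entirely of numeric characters, so codes such as "2.5" or "-3" are treated as names. The sketch below uses illustrative data that is not from the question, and shows a mask built with pd.to_numeric(errors='coerce') as a possible alternative when decimals or signs can occur.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not taken from the question).
codes = pd.Series(["1", "2.5", "-3", "Winter14"])

# isnumeric() rejects the decimal point and the minus sign.
print(codes.str.isnumeric().tolist())  # [True, False, False, False]

# Alternative mask: anything that cannot be parsed as a number becomes NaN.
not_number = pd.to_numeric(codes, errors="coerce").isna()
print(not_number.tolist())  # [False, False, False, True]

# Same np.where pattern as in the answer, with the alternative mask.
names = np.where(not_number, codes, np.nan)
```

Whether "2.5" or "-3" should count as a coupon code or a coupon name depends on the data, so the right mask is a judgment call.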