I'm tearing my hair out a bit with this one. I've imported two CSVs into pandas DataFrames. Both have a column called SiteReference, and I want to use pd.merge to join the DataFrames using SiteReference as the key.
The initial merge failed because pd.read_csv interpreted the SiteReference values differently: 380500145.0 in one file and 380500145 in the other, both stored as objects. I ran a regex to clean the columns and then pd.to_numeric, which resulted in one value of 380500145.0 and another of 3.805001e+10. They should both be 380500145. I then attempted:
df['SiteReference'] = df['SiteReference'].astype(int).astype('str')
But got back:
ValueError: cannot convert float NaN to integer
How can I control how pandas handles these values, preferably on import?
Answer 0 (score: 0)
Following the discussion in the comments, if you want to format floats as integer strings, you can use this:
# No thousands separator, so 380500145.0 becomes '380500145' rather than '380,500,145'
df['SiteReference'] = df['SiteReference'].map('{:.0f}'.format)
This does not raise on null values: NaN is formatted as the string 'nan'.
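As a minimal sketch of that formatting step, assuming the column is already float64: the sample values below are made up for illustration, and the nullable `Int64` route is an alternative I'm adding, not something from the answer above. It keeps missing values as `<NA>` instead of turning them into the text 'nan'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a float column with one missing value.
df = pd.DataFrame({"SiteReference": [380500145.0, np.nan, 12.0]})

# Format each float with zero decimals; NaN becomes the string 'nan'.
formatted = df["SiteReference"].map("{:.0f}".format)

# Alternative: the nullable integer dtype accepts missing values,
# so the cast no longer fails on NaN.
as_int = df["SiteReference"].astype("Int64")

# Converting to the string dtype keeps the missing entry as <NA>.
as_str = as_int.astype("string")
```

Which one fits depends on whether a missing key should survive as a real null (second route) or as text (first route). For a merge key, a real null is usually easier to filter out afterwards.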
Answer 1 (score: 0)
Perhaps the best solution is to stop pd.read_csv from inferring a type for this field:
df = pd.read_csv('data.csv', sep=',', dtype={'SiteReference': str})
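To show how this might play out for the merge in the question, here is a rough sketch. The in-memory CSV text, the `Name` and `Region` columns, and the step that strips a trailing ".0" are all assumptions for illustration. Reading as `str` preserves whatever text is in the file, so a file that literally contains `380500145.0` still needs that cleanup before the keys match.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two CSV files.
csv_a = io.StringIO("SiteReference,Name\n380500145.0,Alpha\n,Unknown\n")
csv_b = io.StringIO("SiteReference,Region\n380500145,North\n")

# Read the key column as text so pandas does not infer float or int.
left = pd.read_csv(csv_a, dtype={"SiteReference": str})
right = pd.read_csv(csv_b, dtype={"SiteReference": str})

# Drop a trailing ".0" left over from float-formatted exports.
# Missing values stay missing because .str methods skip them.
for frame in (left, right):
    frame["SiteReference"] = frame["SiteReference"].str.replace(
        r"\.0$", "", regex=True
    )

# Both key columns now hold the same text, so the join lines up.
merged = left.merge(right, on="SiteReference", how="inner")
```

Here `merged` should contain the single matching row for 380500145. The row with the empty key drops out of the inner join because the other frame has no missing key to pair it with.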