I have an HTML table that I want to convert to a pandas DataFrame. I'm using pandas.read_html(),
and it works fine except that it reads the numbers in my table as strings:
import pandas
table = html_table_here  # placeholder for the table's inner HTML
table = '<table border="1" class="dataframe">\n ' + table + ' \n</table>'
df_table = pandas.read_html(table, header=None, index_col=0)  # returns a list of DataFrames
df = pandas.concat(df_table)
And doing print df["TASK_ID"][0] returns "3" instead of 3.
Is there any way to preserve the type of the values in an HTML table when converting it to a pandas DataFrame?
Answer 0 (score: 0)
You have to explicitly convert those strings to numeric dtypes, because pandas can't tell what the dtypes are when the values arrive as str from the HTML. Note that read_html returns a list, so call convert_objects on the concatenated DataFrame rather than on df_table:
df = df.convert_objects(convert_numeric=True)
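convert_objects was deprecated in later pandas releases and removed in pandas 1.0, so on a current install the same idea can be sketched with pandas.to_numeric. This is only a minimal sketch: the DataFrame below is a made-up stand-in for the result of read_html (the NAME column and its values are hypothetical), and to_numeric_where_possible is a helper name invented here, not part of pandas.

```python
import pandas as pd

# Hypothetical stand-in for a frame whose numeric column was parsed as strings.
df = pd.DataFrame({"TASK_ID": ["3", "7"], "NAME": ["build", "test"]})


def to_numeric_where_possible(frame):
    """Return a copy in which every column that parses fully as numbers is numeric."""
    out = frame.copy()
    for col in out.columns:
        try:
            # Raises if any value in the column cannot be parsed as a number.
            out[col] = pd.to_numeric(out[col])
        except (ValueError, TypeError):
            pass  # leave non-numeric columns unchanged
    return out


converted = to_numeric_where_possible(df)
print(converted.dtypes)
```

Converting column by column keeps text columns untouched, which is roughly what convert_objects(convert_numeric=True) was being used for here. If only one column matters, df["TASK_ID"] = pd.to_numeric(df["TASK_ID"]) is enough.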