How can I force from_records to convert strings to numbers?

Time: 2019-05-07 08:26:37

Tags: pandas

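I have a list of dictionaries in which some numbers are stored as string values. Is it possible to convert those strings to numbers as part of a call like from_records?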

import pandas as pd

jdata = [{'a': 1, 'b': '1'}, {'a': 2, 'b': '3'}]
df1 = pd.DataFrame.from_records(jdata)
df1
Out[129]: 
   a  b
0  1  1
1  2  3
df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
a    2 non-null int64
b    2 non-null object
dtypes: int64(1), object(1)
memory usage: 112.0+ bytes

Now I have to do:

df1['b'] = df1['b'].apply(pd.to_numeric)

df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
a    2 non-null int64
b    2 non-null int64
dtypes: int64(2)
memory usage: 112.0 bytes

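However, if the data is large, the apply and the assignment together take more than twice the memory. Is there a way to have the conversion done while the DataFrame is being constructed?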

1 Answer:

Answer 0 (score: 0)

Use a list comprehension:

jdata1 = [{k: int(v) for k, v in x.items()} for x in jdata]
df = pd.DataFrame.from_records(jdata1)

print (df.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
a    2 non-null int64
b    2 non-null int64
dtypes: int64(2)
memory usage: 112.0 bytes
None
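
Note that the comprehension above casts every value with int, so it would raise on any column that holds non-numeric text, and it builds a second full list of dicts. A minimal sketch of one possible variation follows, assuming only some columns need casting: the `converters` mapping and the `convert_rows` helper are hypothetical names introduced here for illustration, not part of pandas. It relies on from_records accepting an iterator of dicts; if that does not hold in your pandas version, wrap the generator in list(...). pandas still collects the rows internally, so this avoids the intermediate object column rather than all temporary memory.

```python
import pandas as pd

jdata = [{'a': 1, 'b': '1'}, {'a': 2, 'b': '3'}]

# Hypothetical mapping of column name -> converter; only listed columns are cast.
converters = {'b': int}


def convert_rows(rows, converters):
    """Yield one converted dict at a time instead of building a second list."""
    for row in rows:
        yield {
            key: converters[key](value) if key in converters else value
            for key, value in row.items()
        }


# The generator is consumed during construction, so column 'b' never
# exists in the DataFrame as an object column of strings.
df = pd.DataFrame.from_records(convert_rows(jdata, converters))
print(df.dtypes)
```

Columns that are not listed in `converters` pass through unchanged, so free-text columns remain strings.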