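I have a list of dictionaries in which some numbers are stored as string values. Is it possible to convert those strings to numbers during a process like from_records?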
import pandas as pd
jdata = [{'a': 1, 'b': '1'}, {'a': 2, 'b': '3'}]
df1 = pd.DataFrame.from_records(jdata)
df1
Out[129]:
   a  b
0  1  1
1  2  3
df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
a 2 non-null int64
b 2 non-null object
dtypes: int64(1), object(1)
memory usage: 112.0+ bytes
Now I have to do:
df1['b'] = df1['b'].apply(pd.to_numeric)
df1.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
a 2 non-null int64
b 2 non-null int64
dtypes: int64(2)
memory usage: 112.0 bytes
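However, when the data is very large, the apply-and-assign step uses more than twice the memory. Is there a way to do the conversion while the DataFrame is being constructed?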
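For reference, the post-construction conversion can also be written without apply, because pd.to_numeric accepts a whole Series. This is only a minimal sketch on the sample data from the question; it still converts after the frame is built, so it does not by itself address the construction-time concern, and no memory measurements are claimed here.

```python
import pandas as pd

jdata = [{'a': 1, 'b': '1'}, {'a': 2, 'b': '3'}]
df1 = pd.DataFrame.from_records(jdata)

# pd.to_numeric works on the whole column at once,
# so the element-wise .apply(...) call is not required.
df1['b'] = pd.to_numeric(df1['b'])

print(df1.dtypes)
```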
Answer 0 (score: 0)
Use a list comprehension:
jdata1 = [{k: int(v) for k, v in x.items()} for x in jdata]
df = pd.DataFrame.from_records(jdata1)
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 2 columns):
a 2 non-null int64
b 2 non-null int64
dtypes: int64(2)
memory usage: 112.0 bytes
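The list comprehension above builds a second, fully converted list next to the original one. A possible variation, sketched here under assumptions, is to pass a generator to from_records so that converted rows are produced one at a time. The names convert_rows and numeric_keys are hypothetical helpers introduced for illustration, and the sketch assumes you know in advance which keys hold numeric strings. pandas still collects the rows internally, so a lower peak memory is not guaranteed; the only thing this avoids is holding your own extra converted list.

```python
import pandas as pd

jdata = [{'a': 1, 'b': '1'}, {'a': 2, 'b': '3'}]

# Hypothetical: the keys whose values are numeric strings.
numeric_keys = {'b'}

def convert_rows(rows, keys):
    """Yield one converted dict at a time instead of building a second list."""
    for row in rows:
        # Convert only the listed keys; leave every other value untouched.
        yield {k: int(v) if k in keys else v for k, v in row.items()}

# from_records accepts an iterator of records, so the generator can be passed directly.
df = pd.DataFrame.from_records(convert_rows(jdata, numeric_keys))
print(df.dtypes)
```

Converting only the listed keys also avoids calling int() on values that are already numeric or that are not meant to be numbers at all.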