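So I was trying to replace the np.nan values in my dataframe with None, and noticed that in the process the data type of the float columns changed to object, even when they do not contain any missing data.
For example: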
import pandas as pd
import numpy as np
data = pd.DataFrame({'A':np.nan,'B':1.096, 'C':1}, index=[0])
data.replace(to_replace={np.nan:None}, inplace=True)
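Calling data.dtypes before and after the call to replace shows that the dtype of column B changed from float to object, whereas that of C stayed int.
If I remove column A from the original data, this does not happen.
I would like to know why this change occurs and how to avoid it.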
Answer 0 (score: 0)
When you replace column by column, calling replace on a pd.Series(...) rather than on the whole pd.DataFrame(...), it works fine.
Besides, as noted in the comments, None (a NoneType value) cannot be cast to float (or int, or any numeric type; for those you would rather use NaN), so the column is automatically cast to object.
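The two points above can be sketched as follows. This is a minimal illustration rather than the answerer's original code: the names float_col, object_col and nan_cols are made up for the example, and the per-column loop is only one possible way to restrict the conversion to columns that actually contain NaN.

```python
import numpy as np
import pandas as pd

# A float64 column cannot store None: pandas turns it into NaN on construction.
float_col = pd.Series([1.096, None])
print(float_col.dtype)

# Only an object column can hold None as a real Python object.
object_col = pd.Series([1.096, None], dtype=object)
print(object_col.dtype, object_col.iloc[1] is None)

# Per-column replacement (illustrative): touch only the columns that contain NaN,
# so the other columns keep their numeric dtype.
data = pd.DataFrame({'A': np.nan, 'B': 1.096, 'C': 1}, index=[0])
nan_cols = [c for c in data.columns if data[c].isna().any()]
for c in nan_cols:
    # The converted column is necessarily object, because it has to hold None.
    data[c] = data[c].astype(object).where(data[c].notna(), None)
print(data.dtypes)
```

Columns B and C are never assigned to in the loop, so they keep their float and int dtypes; only column A is converted.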
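Answer 1 (score: 0)
I have run into this many times and have a workaround: call astype(object) before replace, and the dtypes are preserved. I have had to use this for merge problems and the like. I am not sure why the types are preserved when it is used this way, but they are, and it is useful once you have found it.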
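# Original frame, before any replacement
import pandas as pd
import numpy as np
data = pd.DataFrame({'A':np.nan,'B':1.096, 'C':1}, index=[0])
data.info()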
#<class 'pandas.core.frame.DataFrame'>
#Int64Index: 1 entries, 0 to 0
#Data columns (total 3 columns):
#A 0 non-null float64
#B 1 non-null float64
#C 1 non-null int64
#dtypes: float64(2), int64(1)
#memory usage: 32.0 bytes
import pandas as pd
import numpy as np
data = pd.DataFrame({'A':np.nan,'B':1.096, 'C':1}, index=[0])
data.replace(to_replace={np.nan:None}, inplace=True)
data.info()
#<class 'pandas.core.frame.DataFrame'>
#Int64Index: 1 entries, 0 to 0
#Data columns (total 3 columns):
#A 0 non-null object
#B 1 non-null object
#C 1 non-null int64
#dtypes: int64(1), object(2)
#memory usage: 32.0+ bytes
import pandas as pd
import numpy as np
data = pd.DataFrame({'A':np.nan,'B':1.096, 'C':1}, index=[0])
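# astype(object) returns a new DataFrame; with inplace=True, replace modifies that
# temporary copy and `data` itself is left unchanged, which is why info() below
# still reports the original dtypes
data.astype(object).replace(to_replace={np.nan:None}, inplace=True)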
data.info()
#<class 'pandas.core.frame.DataFrame'>
#Int64Index: 1 entries, 0 to 0
#Data columns (total 3 columns):
#A 0 non-null float64
#B 1 non-null float64
#C 1 non-null int64
#dtypes: float64(2), int64(1)
#memory usage: 32.0 bytes
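A likely explanation for the preserved dtypes in the last snippet is that data is never modified at all: astype(object) returns a new DataFrame, and the in-place replace acts on that copy. The sketch below checks this under that assumption; the names converted, dtypes_before and result are introduced only for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'A': np.nan, 'B': 1.096, 'C': 1}, index=[0])
dtypes_before = data.dtypes.copy()

# Same steps as the chained call, but with the intermediate frame kept in a variable.
converted = data.astype(object)
converted.replace({np.nan: None}, inplace=True)

# `data` is a different object from `converted` and still has its original dtypes.
unchanged = data.dtypes.equals(dtypes_before)
print(converted is data, unchanged)

# To actually use the replaced values, keep the returned frame instead of using inplace.
result = data.astype(object).replace({np.nan: None})
print(result.dtypes)
```

The dtypes of result may differ between pandas versions, so it is worth printing them rather than assuming they match the original frame.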