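Working with census data, I want to replace the NaNs in two columns ("workclass" and "native-country") with the respective mode of each column. I can get the modes easily: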
mode = df.filter(["workclass", "native-country"]).mode()
which returns a DataFrame:
  workclass native-country
0   Private  United-States
However,
df.filter(["workclass", "native-country"]).fillna(mode)
does not replace the NaNs in either column with anything, let alone with the mode of that column. Is there a smooth way to do this?
Answer 0 (score: 8)
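If you want to impute missing values with the mode in some columns of the DataFrame df, you can simply call fillna with a Series, created by selecting the first row of the mode result by position with iloc:
cols = ["workclass", "native-country"]
df[cols]=df[cols].fillna(df.mode().iloc[0])
Or, reusing the mode DataFrame you already computed:
df[cols]=df[cols].fillna(mode.iloc[0])
Your solution, corrected:
df[cols]=df.filter(cols).fillna(mode.iloc[0])
Sample: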
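import numpy as np
import pandas as pd
# Newer pandas versions keep the dict's key order, so columns may print in a different order than shown.
df = pd.DataFrame({'workclass':['Private','Private',np.nan, 'another', np.nan],
                   'native-country':['United-States',np.nan,'Canada',np.nan,'United-States'],
                   'col':[2,3,7,8,9]})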
print (df)
   col native-country workclass
0    2  United-States   Private
1    3            NaN   Private
2    7         Canada       NaN
3    8            NaN   another
4    9  United-States       NaN
mode = df.filter(["workclass", "native-country"]).mode()
print (mode)
  workclass native-country
0   Private  United-States
cols = ["workclass", "native-country"]
df[cols]=df[cols].fillna(df.mode().iloc[0])
print (df)
   col native-country workclass
0    2  United-States   Private
1    3  United-States   Private
2    7         Canada   Private
3    8  United-States   another
4    9  United-States   Private
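As for why the original fillna(mode) call appears to do nothing: as far as I understand it, when fillna is given a DataFrame it aligns on both the row index and the column labels, so a one-row mode frame can only fill NaNs that sit in row 0. Taking mode.iloc[0] yields a Series keyed by column name, which applies to every row. A minimal sketch on the sample data above (the names filled_by_frame and filled_by_series are mine, not from the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "workclass": ["Private", "Private", np.nan, "another", np.nan],
    "native-country": ["United-States", np.nan, "Canada", np.nan, "United-States"],
    "col": [2, 3, 7, 8, 9],
})
cols = ["workclass", "native-country"]
mode = df[cols].mode()

# DataFrame argument: aligned on index AND columns, so only row 0 could be filled.
filled_by_frame = df[cols].fillna(mode)

# Series argument: aligned on column names only, so every row is filled.
filled_by_series = df[cols].fillna(mode.iloc[0])

print(filled_by_frame.isna().sum().sum())   # NaNs left after the DataFrame fill
print(filled_by_series.isna().sum().sum())  # NaNs left after the Series fill
```

In this sample row 0 has no missing values, so the DataFrame-based fill leaves every NaN in place, which matches the behaviour described in the question.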
Answer 1 (score: 2)
You can do it like this:
df[["workclass", "native-country"]]=df[["workclass", "native-country"]].fillna(value=mode.iloc[0])
For example,
import pandas as pd
d={
'key3': [1,4,4,4,5],
'key2': [6,6,4],
'key1': [6,4,4],
}
df=pd.DataFrame.from_dict(d,orient='index').transpose()
Then df is:
  key3 key2 key1
0    1    6    6
1    4    6    4
2    4    4    4
3    4  NaN  NaN
4    5  NaN  NaN
Then do:
l=df.filter(["key1", "key2"]).mode()
df[["key1", "key2"]]=df[["key1", "key2"]].fillna(value=l.iloc[0])
and we get df:
  key3 key2 key1
0    1    6    6
1    4    6    4
2    4    4    4
3    4    6    4
4    5    6    4
Answer 2 (score: 0)
I think passing a dict as the "value" argument of fillna is the cleanest approach.
Ref: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html
Create a toy df based on @miriam-farber's answer:
import pandas as pd
d={
'key3': [1,4,4,4,5],
'key2': [6,6,4],
'key1': [6,4,4],
}
d_df=pd.DataFrame.from_dict(d,orient='index').transpose()
Create the dictionary:
mode_dict = d_df.loc[:,['key2','key1']].mode().to_dict('records')[0]
Use this dictionary in the fillna method:
d_df.fillna(mode_dict, inplace=True)
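A sketch of the same idea that builds the dict with a comprehension and returns a new frame instead of modifying in place. The frame toy and the names mode_by_col and filled are illustrative, not part of the answer above, and the toy data is written out with explicit NaNs rather than built via from_dict:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "key3": [1, 4, 4, 4, 5],
    "key2": [6, 6, 4, np.nan, np.nan],
    "key1": [6, 4, 4, np.nan, np.nan],
})

# Map each target column to its first mode (mode() can return several values on ties).
mode_by_col = {c: toy[c].mode().iloc[0] for c in ["key2", "key1"]}

# fillna with a dict fills only the listed columns and returns a new frame;
# the original toy frame keeps its NaNs.
filled = toy.fillna(mode_by_col)
print(filled)
```

Columns that are not keys of the dict (here key3) are left untouched, which is what makes the dict form convenient when only some columns should be imputed.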
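Answer 3 (score: 0)
This code imputes the mean for int columns and the mode for object columns: it builds a list of each type of column and fills the missing values according to which list the column belongs to.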
category_columns=df.select_dtypes(include=['object']).columns.tolist()
integer_columns=df.select_dtypes(include=['int64','float64']).columns.tolist()
for column in df:
    if df[column].isnull().any():
        if column in category_columns:
            # object column: fill with the most frequent value
            df[column]=df[column].fillna(df[column].mode()[0])
        else:
            # numeric column: fill with the mean (mean must be called, not passed as a method)
            df[column]=df[column].fillna(df[column].mean())
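The same rule can be wrapped in a small function. This is only a sketch under my own assumptions: the name impute_mean_or_mode and the sample frame are hypothetical, and it uses pd.api.types.is_numeric_dtype instead of the explicit dtype lists so that any numeric dtype takes the mean branch.

```python
import numpy as np
import pandas as pd

def impute_mean_or_mode(frame):
    """Return a copy with numeric NaNs filled by the mean and all others by the mode."""
    out = frame.copy()
    for column in out.columns:
        if not out[column].isna().any():
            continue  # nothing to impute in this column
        if pd.api.types.is_numeric_dtype(out[column]):
            out[column] = out[column].fillna(out[column].mean())
        else:
            # mode() may return several values on ties; take the first one
            out[column] = out[column].fillna(out[column].mode().iloc[0])
    return out

sample = pd.DataFrame({
    "age": [20.0, np.nan, 40.0],
    "workclass": ["Private", np.nan, "Private"],
})
result = impute_mean_or_mode(sample)
print(result)
```

Note that a column consisting entirely of NaN has an empty mode, so the .iloc[0] lookup would raise for it; such columns would need separate handling.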