Pandas fillna on multiple columns with the mode of each column

Time: 2017-03-18 04:42:03

Tags: python pandas numpy data-science

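Working with census data, I want to replace the NaNs in two columns ("workclass" and "native-country") with the respective modes of those two columns. I can get the modes easily: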

mode = df.filter(["workclass", "native-country"]).mode()

which returns a DataFrame:

  workclass native-country
0   Private  United-States

However,

df.filter(["workclass", "native-country"]).fillna(mode)

does not replace the NaNs in either column with anything, let alone with the mode of that column. Is there a clean way to do this?

4 Answers:

Answer 0: (score: 8)

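If you want to impute missing values with the mode in some columns of a DataFrame df, you can call fillna with a Series created by selecting the first row of the mode result by position with iloc:

cols = ["workclass", "native-country"]
df[cols]=df[cols].fillna(df.mode().iloc[0])

Or, reusing the mode DataFrame from the question:

df[cols]=df[cols].fillna(mode.iloc[0])

Your solution, adjusted:

df[cols]=df.filter(cols).fillna(mode.iloc[0])

Sample: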

import numpy as np
import pandas as pd

df = pd.DataFrame({'workclass':['Private','Private',np.nan, 'another', np.nan],
                   'native-country':['United-States',np.nan,'Canada',np.nan,'United-States'],
                   'col':[2,3,7,8,9]})

print (df)
   col native-country workclass
0    2  United-States   Private
1    3            NaN   Private
2    7         Canada       NaN
3    8            NaN   another
4    9  United-States       NaN

mode = df.filter(["workclass", "native-country"]).mode()
print (mode)
  workclass native-country
0   Private  United-States

cols = ["workclass", "native-country"]
df[cols]=df[cols].fillna(df.mode().iloc[0])
print (df)
   col native-country workclass
0    2  United-States   Private
1    3  United-States   Private
2    7         Canada   Private
3    8  United-States   another
4    9  United-States   Private
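
As for why passing the one-row mode DataFrame straight to fillna seems to do nothing: as far as I can tell, fillna aligns a DataFrame argument on index labels as well as column names, so only the row labelled 0 can ever be filled. A minimal sketch, with toy data that mirrors the sample above and with variable names of my own choosing:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "workclass": ["Private", "Private", np.nan, "another", np.nan],
    "native-country": ["United-States", np.nan, "Canada", np.nan, "United-States"],
})

# mode() returns a DataFrame; here it has a single row with index label 0
mode = df.mode()

# DataFrame argument: aligned on index and columns, so only row 0 is a candidate
filled_by_frame = df.fillna(mode)

# Series argument (first row of the mode): one fill value per column, used for every row
filled_by_series = df.fillna(mode.iloc[0])

print(filled_by_frame.isna().sum().sum())   # NaNs left when passing the DataFrame
print(filled_by_series.isna().sum().sum())  # NaNs left when passing the Series
```

Row 0 has no missing values in this data, so the first call leaves every NaN in place, while the second fills all of them.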


Answer 1: (score: 2)

You can do it like this:

df[["workclass", "native-country"]]=df[["workclass", "native-country"]].fillna(value=mode.iloc[0])

For example,

import pandas as pd
d={
    'key3': [1,4,4,4,5],
    'key2': [6,6,4],
    'key1': [6,4,4],
}

df=pd.DataFrame.from_dict(d,orient='index').transpose()

Then df is

  key3  key2    key1
0   1   6       6
1   4   6       4
2   4   4       4
3   4   NaN     NaN
4   5   NaN     NaN

Then do:

l=df.filter(["key1", "key2"]).mode()
df[["key1", "key2"]]=df[["key1", "key2"]].fillna(value=l.iloc[0])

and we get df:

  key3  key2    key1
0   1   6        6
1   4   6        4
2   4   4        4
3   4   6        4
4   5   6        4
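
One thing to watch with iloc[0]: when a column has tied modes, mode() returns more than one row and pads the other columns with NaN, so iloc[0] simply takes the first (smallest) mode of each column. A small sketch with made-up data (column names a and b are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1, 1, 2, 2, np.nan],   # two tied modes: 1 and 2
    "b": [5, 5, 6, 7, np.nan],   # a single mode: 5
})

modes = df.mode()
print(modes)  # two rows; the second row of "b" is NaN padding

# The first row holds the smallest mode of each column
filled = df.fillna(modes.iloc[0])
print(filled)
```

If a different tie-breaking rule matters for your data, you would need to pick the fill value for that column yourself.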

Answer 2: (score: 0)

I think passing a dict as the "value" argument of fillna is the cleanest approach.

ref: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html

Create a toy df based on @miriam-farber's answer:

import pandas as pd
d={
    'key3': [1,4,4,4,5],
    'key2': [6,6,4],
    'key1': [6,4,4],
}

d_df=pd.DataFrame.from_dict(d,orient='index').transpose()

Create the dict:

mode_dict = d_df.loc[:,['key2','key1']].mode().to_dict('records')[0]

Use it in the fillna method:

d_df.fillna(mode_dict, inplace=True)
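
To make the effect of the dict easier to see, here is a hedged sketch on a slightly different toy frame (the extra column "other" is my own addition). Only the columns named in the dict are filled; I also build the dict with iloc[0].to_dict(), which should be equivalent to to_dict('records')[0] here, and return a new frame instead of using inplace=True:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "key1": [6, 4, 4, np.nan, np.nan],
    "key2": [6, 6, 4, np.nan, np.nan],
    "other": [1.0, np.nan, 3.0, 4.0, 5.0],  # not in the dict, so left alone
})

# One fill value per column, taken from the first row of the mode result
mode_dict = df[["key1", "key2"]].mode().iloc[0].to_dict()
print(mode_dict)

# Columns missing from the dict keep their NaNs
filled = df.fillna(mode_dict)
print(filled)
```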

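Answer 3: (score: 0)

This code imputes the mean for the numeric (int/float) columns and the mode for the object columns. It builds a list of each type of column and fills the missing values according to which list the column belongs to.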

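# Split the columns by dtype: object columns get the mode, numeric ones the mean
category_columns=df.select_dtypes(include=['object']).columns.tolist()
integer_columns=df.select_dtypes(include=['int64','float64']).columns.tolist()

for column in df:
    if df[column].isnull().any():
        if(column in category_columns):
            # mode() returns a Series; take its first entry
            df[column]=df[column].fillna(df[column].mode()[0])
        elif(column in integer_columns):
            # fill numeric columns with the column mean
            df[column]=df[column].fillna(df[column].mean())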
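
The same idea can probably be written without the explicit loop. A rough sketch, assuming every non-numeric column should get its mode and every numeric column its mean (the toy columns "age" and "city" are made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [20.0, np.nan, 40.0],
    "city": ["a", np.nan, "a"],
})

# Column names split by dtype
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.select_dtypes(exclude="number").columns

# mean() gives one value per numeric column; fillna matches them by column name
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

# first row of mode() gives one value per non-numeric column
df[cat_cols] = df[cat_cols].fillna(df[cat_cols].mode().iloc[0])

print(df)
```

This relies on fillna matching a Series of fill values to columns by name, the same mechanism used in the other answers.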