Propagate values from multiple columns to a single column - Pandas

Time: 2019-05-03 08:27:30

Tags: pandas

 col1   col2    col3   combined
----------------------------
val1                   val1
val1                   val1
NaN                    val1
val1                   val1
       val2            val2
       NaN             val2
       val2            val2
              val3     val3
              NaN      val3
              val3     val3 

output:
-------
col1   col2    col3   combined
----------------------------
val1                   val1
val1                   val1
NaN                    NaN
val1                   val1
       val2            val2
       NaN             NaN
       val2            val2
              val3     val3
              NaN      NaN
              val3     val3

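I have these columns, and I need to check whether any of them contains a NaN value. If one does, the combined column must be set to NaN for that row using pandas, even if it already holds a value.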

I am using the following code:
cols = df[0:len(df.columns)-1]
for col in cols:
    print (col)
    df.combined = df.combined.fillna(value=df[col])

But the values do not change.

df.T.bfill().iloc[-1]

If I use a fill like this, it fills in values even where there is a NaN.

2 answers:

Answer 0: (score: 2)

Use np.where together with isna and sum

# Change 1 to 3 if the blank space is None or NaN thanks to @Mohit Motwani
df['combined'] = np.where(df.isna().sum(axis=1) >= 1, np.nan, df.combined)

df
Out[34]: 
   col1  col2  col3 combined
0  val1                 val1
1  val1                 val1
2   NaN                  NaN
3  val1                 val1
4        val2           val2
5         NaN            NaN
6        val2           val2
7              val3     val3
8               NaN      NaN
9              val3     val3
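
A self-contained sketch of the same idea, for readers who want to run it. It assumes the blank cells are empty strings, which the question does not state, and it swaps in `isna().any(axis=1)` with `Series.mask` as an equivalent way of writing the `sum(axis=1) >= 1` test. The name `source_cols` is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame; blank cells are ASSUMED to be empty strings.
df = pd.DataFrame({
    "col1": ["val1", "val1", np.nan, "val1", "", "", "", "", "", ""],
    "col2": ["", "", "", "", "val2", np.nan, "val2", "", "", ""],
    "col3": ["", "", "", "", "", "", "", "val3", np.nan, "val3"],
    "combined": ["val1"] * 4 + ["val2"] * 3 + ["val3"] * 3,
})

# Hypothetical helper name: the columns whose NaN should propagate.
source_cols = ["col1", "col2", "col3"]

# True for every row that has a NaN in at least one source column.
has_nan = df[source_cols].isna().any(axis=1)

# Replace `combined` with NaN on those rows and keep it everywhere else.
df["combined"] = df["combined"].mask(has_nan)

print(df)
```

Restricting the check to `source_cols` keeps a NaN that is already in `combined` from influencing the mask; if the blanks are actually NaN rather than empty strings, the row-wise count threshold mentioned in the comment above would be needed instead of `any`.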

Answer 1: (score: 0)

I iterate over the columns, use isna() to find the NaN values, and assign NaN to the "comb" column at the corresponding indices.

import pandas as pd
import numpy as np

### Generate sample data
arr = np.zeros((9,3))
comb = np.zeros(9)
for i in range(3):
    val = np.random.randint(-5,5)
    for ji in range(i*3,i*3+3):
        arr[ji,i] = val
    a_rand_row = np.random.randint(i*3,i*3+3)
    arr[a_rand_row,i] = np.nan

    comb[i*3:i*3+3] = val
    comb[a_rand_row] = val

init_cols = ["col1","col2","col3"]
df = pd.DataFrame(arr, columns=init_cols)
df["comb"] = comb

### iterate over columns and set comb to nan if column is nan
for col in init_cols:
    df.loc[df[col].isna(), "comb"] = np.nan
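
The per-column loop can probably be collapsed into a single boolean mask. The sketch below is one way to do that, not the answer author's method; it uses a small hand-written frame named `demo` (a hypothetical stand-in for the random sample data) so the result is reproducible.

```python
import numpy as np
import pandas as pd

init_cols = ["col1", "col2", "col3"]

# Hypothetical fixed data standing in for the randomly generated sample.
demo = pd.DataFrame(
    {
        "col1": [2.0, np.nan, 0.0, 0.0],
        "col2": [0.0, 0.0, -3.0, np.nan],
        "col3": [0.0, 0.0, 0.0, 0.0],
        "comb": [2.0, 2.0, -3.0, -3.0],
    }
)

# One mask covering all source columns: True where any of them is NaN.
nan_rows = demo[init_cols].isna().any(axis=1)

# A single .loc assignment writes into the frame itself, not a temporary copy.
demo.loc[nan_rows, "comb"] = np.nan

print(demo)
```

Using `.loc` with a row mask and a column label avoids chained indexing, which pandas warns about because the assignment may land on a copy.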