Pandas: select columns, with a default for those that do not exist

Time: 2018-03-16 21:01:16

Tags: python pandas dataframe

Suppose I have the following DataFrame:

>>> df
     val1 val2 val3
key
  1     1    1    1
  2     2    2    2
  3     3    3    3

Now I want to select the columns val1 and val2, and (here's the kicker) val4, which does not exist:

>>> df[["val1", "val2", "val4"]]
KeyError: "['val4'] not in index"

What I want:

>>> df.something(something)
     val1 val2 val4
key
  1     1    1  NaN
  2     2    2  NaN
  3     3    3  NaN

2 answers:

Answer 0: (score: 3)

IIUC, you want reindex:

df.reindex(columns=["val1", "val2", "val4"])
Out[431]: 
     val1  val2  val4
key                  
1       1     1   NaN
2       2     2   NaN
3       3     3   NaN
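
As a minimal sketch of the reindex approach above: the frame is rebuilt from the question's printed output (the exact construction is an assumption), and the variable names `wanted`, `with_nan` and `with_zero` are only illustrative. It also shows the `fill_value` parameter of `reindex`, which sets a default other than NaN for columns that have to be created.

```python
import pandas as pd

# Rebuild the frame from the question; the index name "key" matches the printed output.
df = pd.DataFrame(
    {"val1": [1, 2, 3], "val2": [1, 2, 3], "val3": [1, 2, 3]},
    index=pd.Index([1, 2, 3], name="key"),
)

wanted = ["val1", "val2", "val4"]

# Labels missing from df become new columns filled with NaN;
# existing columns keep their values, and val3 is dropped because it is not requested.
with_nan = df.reindex(columns=wanted)

# fill_value supplies the default used for newly created columns instead of NaN.
with_zero = df.reindex(columns=wanted, fill_value=0)

print(with_nan)
print(with_zero)
```

Neither call modifies `df`; each returns a new DataFrame.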

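Likewise, .loc can do this, but it emits a warning: passing list-likes to .loc or [] with any missing labels will raise KeyError in the future; you can use .reindex() as an alternative. (That change took effect in pandas 1.0, so on current versions the .loc call below raises KeyError.)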

df.loc[:,["val1", "val2", "val4"]]
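
If the same code has to run on both older and newer pandas, one possible pattern is to try .loc and fall back to .reindex() when it raises. This is only a sketch: the frame and the names `wanted` and `result` are illustrative, and it relies on the understanding that pandas 1.0 and later raise KeyError here while earlier versions only warned.

```python
import pandas as pd

df = pd.DataFrame({"val1": [1, 2, 3], "val2": [1, 2, 3], "val3": [1, 2, 3]})
wanted = ["val1", "val2", "val4"]

try:
    # Older pandas returns the selection with a NaN column (plus a FutureWarning).
    result = df.loc[:, wanted]
except KeyError:
    # Newer pandas rejects missing labels in .loc; reindex creates them as NaN instead.
    result = df.reindex(columns=wanted)

print(result)
```

In practice, calling `df.reindex(columns=wanted)` directly avoids the branch entirely.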

Answer 1: (score: 0)

Something like this should get you started:

import pandas as pd
import numpy as np

df = pd.DataFrame([[1, 1, 1], [2, 2, 2], [3, 3, 3]], columns=['val1', 'val2', 'val3'])

def check_columns(df, values):
    # Use df's index up front so every column, present or not, covers the same rows.
    temp = pd.DataFrame(index=df.index)
    for i in values:
        try:
            temp[i] = df[i]
        except KeyError:
            # Column is absent from df: create it filled with NaN.
            temp[i] = np.nan
    return temp

print(check_columns(df, ['val1', 'val2', 'val3']))
print(check_columns(df, ['val1', 'val2', 'val4']))

This gives:

   val1  val2  val3
0     1     1     1
1     2     2     2
2     3     3     3
   val1  val2  val4
0     1     1   NaN
1     2     2   NaN
2     3     3   NaN
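
A variant of the helper above that checks column membership explicitly instead of catching an exception might look like the following. The function name `select_with_default` and its `default` parameter are hypothetical, introduced only for illustration; this is a sketch, not part of the original answer.

```python
import numpy as np
import pandas as pd

def select_with_default(df, columns, default=np.nan):
    """Hypothetical helper: return `columns` from df, creating absent ones filled with `default`."""
    # Copy only the requested columns that actually exist, so df itself is never modified.
    present = [c for c in columns if c in df.columns]
    out = df[present].copy()
    # Add each missing column as a constant, broadcast over df's index.
    for c in columns:
        if c not in out.columns:
            out[c] = default
    # Restore the requested column order.
    return out[list(columns)]

df = pd.DataFrame([[1, 1, 1], [2, 2, 2], [3, 3, 3]], columns=["val1", "val2", "val3"])
selected = select_with_default(df, ["val1", "val2", "val4"])
selected_zero = select_with_default(df, ["val4", "val1"], default=0)
print(selected)
print(selected_zero)
```

For the plain NaN or single-default case, `df.reindex(columns=..., fill_value=...)` does the same job in one call.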