Fill a NaN column with string concatenation if the other columns are not NaN

Asked: 2017-09-15 17:00:25

Tags: pandas dataframe

I want to concatenate two columns when they are not NaN, like this:

if(df[pd.notnull([df["Col1"]])] and df[pd.notnull([df["Col2"]])]):
    df["Col3"] = df["Col1"] + df["Col2"]

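If neither of the two columns is NULL/NaN, concatenate the two strings and put the result in a third column.

How can I do this? pd.notnull does not behave the way I expect.

I want the result to look like this: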

"First Name" "Last Name" "Full Name"
 a            b           a b
 a1           b1          a1 b1
 a2           b2          a2 b2

Before formatting, the "Full Name" column contains NaN.

This is what the data looks like before formatting:

"First Name" "Last Name" "Full Name" 
a            b            NaN
a1           b1           NaN
a2           b2           NaN
NaN          NaN          a3 b3

2 answers:

Answer 0 (score: 3)

Use .loc to set Col3 only on the rows where both Col1 and Col2 are non-null.

Details

In [383]: df
Out[383]:
  Col1 Col2
0    a    h
1  NaN    i
2    c    j
3  NaN  NaN
4  NaN    l
5    f    m
6    g  NaN

In [384]: df.loc[df[['Col1', 'Col2']].notnull().all(1), 'Col3'] = df.Col1 + df.Col2

In [385]: df
Out[385]:
  Col1 Col2 Col3
0    a    h   ah
1  NaN    i  NaN
2    c    j   cj
3  NaN  NaN  NaN
4  NaN    l  NaN
5    f    m   fm
6    g  NaN  NaN
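
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt by hand from the output shown above, and the name `both_present` is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the frame shown above.
df = pd.DataFrame({
    "Col1": ["a", np.nan, "c", np.nan, np.nan, "f", "g"],
    "Col2": ["h", "i", "j", np.nan, "l", "m", np.nan],
})

# Boolean mask: True only for rows where both Col1 and Col2 are present.
# (both_present is an illustrative name, not part of the original answer.)
both_present = df[["Col1", "Col2"]].notnull().all(axis=1)

# Assign the concatenation only on the masked rows; the others stay NaN.
df.loc[both_present, "Col3"] = df["Col1"] + df["Col2"]

print(df)
```

The mask is a whole boolean Series, which is why it is passed to .loc rather than tested in a plain `if` statement.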

Answer 1 (score: 2)

Use fillna with str.cat:

df['Full Name'].fillna(df['First Name'].str.cat(df['Last Name'], sep=' '))

0      a b
1    a1 b1
2    a2 b2
3    a3 b3
Name: Full Name, dtype: object

Use pd.DataFrame.update to modify df in place instead of making a copy:

df.update(
    df['Full Name'].fillna(df['First Name'].str.cat(df['Last Name'], sep=' '))
)

df

  First Name Last Name Full Name
0          a         b       a b
1         a1        b1     a1 b1
2         a2        b2     a2 b2
3        NaN       NaN     a3 b3
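
A runnable sketch of the same fill, written as a plain column assignment instead of update. The frame is rebuilt by hand from the data in the question, and the name `combined` is an assumption made here for readability.

```python
import numpy as np
import pandas as pd

# Data rebuilt by hand from the "before formatting" table in the question.
df = pd.DataFrame({
    "First Name": ["a", "a1", "a2", np.nan],
    "Last Name": ["b", "b1", "b2", np.nan],
    "Full Name": [np.nan, np.nan, np.nan, "a3 b3"],
})

# Row-wise "first last"; the result is NaN wherever either part is missing.
# (combined is an illustrative name, not part of the original answer.)
combined = df["First Name"].str.cat(df["Last Name"], sep=" ")

# Fill only the missing full names; existing values such as "a3 b3" are kept.
df["Full Name"] = df["Full Name"].fillna(combined)

print(df)
```

Because fillna only touches missing entries, the last row keeps its existing full name even though its first and last names are NaN.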