Question

I want to concatenate two columns when neither is NaN, something like this:

if(df[pd.notnull([df["Col1"]])] and df[pd.notnull([df["Col2"]])]):
    df["Col3"] = df["Col1"] + df["Col2"]

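If neither column is NULL/NaN, join the two strings and put the result in the third column.

How can I do this? pd.notnull doesn't behave the way I expect.

I'd like the result to look like this: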

"First Name" "Last Name" "Full Name"
 a            b           a b
 a1           b1          a1 b1
 a2           b2          a2 b2

Before formatting, the "Full Name" column contains NaN.

This is what the data looks like before formatting:

"First Name" "Last Name" "Full Name" 
a            b            NaN
a1           b1           NaN
a2           b2           NaN
NaN          NaN          a3 b3

Answer 1

Use .loc with a boolean mask to set Col3 only on the rows where both Col1 and Col2 are non-null.

Details

In [383]: df
Out[383]:
  Col1 Col2
0    a    h
1  NaN    i
2    c    j
3  NaN  NaN
4  NaN    l
5    f    m
6    g  NaN

In [384]: df.loc[df[['Col1', 'Col2']].notnull().all(1), 'Col3'] = df.Col1 + df.Col2

In [385]: df
Out[385]:
  Col1 Col2 Col3
0    a    h   ah
1  NaN    i  NaN
2    c    j   cj
3  NaN  NaN  NaN
4  NaN    l  NaN
5    f    m   fm
6    g  NaN  NaN
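
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt by hand from the output above, and the variable name both_present is just an illustrative choice, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the output shown above
df = pd.DataFrame({
    "Col1": ["a", np.nan, "c", np.nan, np.nan, "f", "g"],
    "Col2": ["h", "i", "j", np.nan, "l", "m", np.nan],
})

# True only for rows where both columns are non-null
both_present = df[["Col1", "Col2"]].notnull().all(axis=1)

# Assign the concatenation only on those rows; the rest of Col3 stays NaN
df.loc[both_present, "Col3"] = df["Col1"] + df["Col2"]

print(df)
```

The row-wise mask replaces the `if ... and ...` test from the question: a Python `if` needs a single True/False, whereas the mask makes the decision per row.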

Answer 2

df['Full Name'].fillna(df['First Name'].str.cat(df['Last Name'], sep=' '))

0      a b
1    a1 b1
2    a2 b2
3    a3 b3
Name: Full Name, dtype: object

pd.DataFrame.update

This updates df in place, so make a copy first if you need the original:

df.update(
    df['Full Name'].fillna(df['First Name'].str.cat(df['Last Name'], sep=' '))
)

df

  First Name Last Name Full Name
0          a         b       a b
1         a1        b1     a1 b1
2         a2        b2     a2 b2
3        NaN       NaN     a3 b3
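
A self-contained sketch of the fillna + str.cat approach, in case it helps to try it end to end. The frame is rebuilt by hand from the question's "before" table, and it uses plain column assignment instead of df.update, which is a substitution on my part; the name combined is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the "before formatting" table in the question
df = pd.DataFrame({
    "First Name": ["a", "a1", "a2", np.nan],
    "Last Name": ["b", "b1", "b2", np.nan],
    "Full Name": [np.nan, np.nan, np.nan, "a3 b3"],
})

# "first last" per row; the result is NaN wherever either part is missing
combined = df["First Name"].str.cat(df["Last Name"], sep=" ")

# Keep full names that already exist and fill only the gaps
df["Full Name"] = df["Full Name"].fillna(combined)

print(df)
```

Because fillna only touches missing entries, the existing "a3 b3" in the last row is left alone even though its first and last names are NaN.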

Fill NaN column if other column is not NaN w/ string concatenation
