Remove entire rows containing zeros in two Pandas Series

Time: 2014-05-05 12:26:54

Tags: python-2.7 pandas

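I have a function that plots the logarithm of two columns of a Pandas DataFrame. Zeros therefore cause errors and need to be removed. At the moment, the inputs to the function are two columns of a DataFrame. Is there a way to remove any rows that contain a zero? For example, an equivalent of df = df[df.ColA != 0].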

def logscatfit(x,y,title):
    xvals2 = np.arange(-2,6,1)
    a = np.log(x) #These are what I want to remove the zeros from
    b = np.log(y)
    plt.scatter(a, b, c='g', marker='x', s=35)
    slope, intercept, r_value, p_value, std_err = stats.linregress(a,b)
    plt.plot(xvals2, (xvals2*slope + intercept), color='red')
    plt.title(title)
    plt.show()
    print "Slope is:",slope, ". Intercept is:",intercept,". R-value is:",r_value,". P-value is:",p_value,". Std_err is:",std_err

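I can't think of a way to remove the zeros from a and b while keeping them the same length, so that I can still plot a scatter graph. Is my only option to rewrite the function to take a DataFrame, and then remove the zeros with df1 = df[df.ColA != 0] followed by df2 = df1[df1.ColB != 0]?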

3 Answers:

Answer 0: (score: 2)

As I understand it, you need to remove the rows where x or y (or both) is zero.

A simple way is

keepThese = (x > 0) & (y > 0)
a = x[keepThese]
b = y[keepThese]

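and then continue with your code. Note that the > 0 test also drops negative values, for which the logarithm is undefined as well.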
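To see why the combined mask keeps both Series the same length, here is a minimal sketch on made-up data (the values below are hypothetical and not from the question). The zeros sit in different rows of x and y, and both rows are dropped from both Series:

```python
import pandas as pd

# Hypothetical sample data: x has a zero in row 1, y has a zero in row 2.
x = pd.Series([1.0, 0.0, 3.0, 4.0])
y = pd.Series([2.0, 5.0, 0.0, 8.0])

# One boolean mask built from both Series: True only where
# both values are strictly positive.
keepThese = (x > 0) & (y > 0)

# Applying the same mask to both Series removes the same rows
# from each, so a and b stay aligned and equally long.
a = x[keepThese]
b = y[keepThese]
```

Because a single mask is applied to both inputs, there is no need to pass the whole DataFrame into the function just to keep the columns in step.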

Answer 1: (score: 1)

I like FooBar's simple answer. A more general approach is to pass the DataFrame to your function and use the .any() method.

def logscatfit(df,x_col_name,y_col_name,title):
    two_cols = df[[x_col_name,y_col_name]]
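    # mask is True for every row in which either column contains a zero
    mask = two_cols.apply(lambda x: ( x==0 ).any(), axis = 1)
    df_to_use = df[~mask]  # invert the mask to keep only the rows without zeros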
    x = df_to_use[x_col_name]
    y = df_to_use[y_col_name]

    #your code
    a = np.log(x)
    # etc.
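The row-wise apply can also be written as a vectorized comparison, which is usually faster on large frames. The following is only a sketch: the helper name drop_zero_rows and the sample frame are made up for illustration and are not part of the answer above.

```python
import pandas as pd

def drop_zero_rows(df, cols):
    """Return the rows of df in which none of the columns in cols is zero.

    Hypothetical helper, named here for illustration only.
    """
    # Compare the selected columns to zero element-wise, then collapse
    # each row to a single boolean: True if any selected column is zero.
    has_zero = (df[cols] == 0).any(axis=1)
    # Invert the mask so that only zero-free rows are kept.
    return df[~has_zero]

# Hypothetical sample frame; ColC is ignored by the filter.
df = pd.DataFrame({"ColA": [1, 0, 3, 4],
                   "ColB": [2, 5, 0, 8],
                   "ColC": [0, 0, 0, 0]})
clean = drop_zero_rows(df, ["ColA", "ColB"])
```

Only the columns listed in cols are checked, so zeros elsewhere in the frame (ColC here) do not cause rows to be dropped.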

Answer 2: (score: 0)

Inserting FooBar's answer into your function gives:

def logscatfit(x,y,title):
    xvals2 = np.arange(-2,6,1)
    keepThese = (x > 0) & (y > 0)
    a = x[keepThese]
    b = y[keepThese]
    a = np.log(a)
    b = np.log(b)
    plt.scatter(a, b, c='g', marker='x', s=35)
    slope, intercept, r_value, p_value, std_err = stats.linregress(a,b)
    plt.plot(xvals2, (xvals2*slope + intercept), color='red')
    plt.title(title)
    plt.show()
    print "Slope is:",slope, ". Intercept is:",intercept,". R-value is:",r_value,". P-value is:",p_value,". Std_err is:",std_err