How to write a query in pandas

Time: 2018-05-03 00:53:46

Tags: sql python-2.7 pandas jupyter-notebook

Based on this: pandas get column average/mean

I can create a simple calculated field like this. My query:

df = pd.read_sql("select range_start, range_end  from "+table+" group by  range_start, range_end", conn)

This creates the following table:

Start   Stop
4385159 4499467
4175786 4352309
342426  354137
5591040 5600392

What I want to do is add a column holding the difference, which I can do like this:

df2['Diff'] = df2['Stop'] - df2['Start']

Now my table looks like this:

Start   Stop    Diff
4385159 4499467 114308
4175786 4352309 176523
342426  354137  11711

My question is how to write a query that returns a result like this:

df = pd.read_sql("select Diff  from "+table+" where Diff < Xnumber group by  Diff", conn)

I think I need to query the query result from within Jupyter (pandas). Something like this:

df = pd.read_sql("select (df2['Stop'] - df2['Start']) as df2['Diff'] where (df2['Stop'] - df2['Start']) < Xnumber group by (df2['Stop'] - df2['Start'])",conn)

^ This does not work, but you get the idea.

2 Answers:

Answer 0 (score: 0)

I may be missing something, but can't you just create the new column directly in pandas, without running any query?

df2['Diff'] = df2['Stop'] - df2['Start']
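
Once the column exists, the filtering the question asks for can also be done in pandas. Here is a minimal sketch, assuming the sample values from the question's table; `x_number` is a hypothetical threshold standing in for `Xnumber`, and `drop_duplicates` is used as a rough stand-in for `group by Diff`:

```python
import pandas as pd

# Sample data copied from the table in the question.
df2 = pd.DataFrame({
    "Start": [4385159, 4175786, 342426, 5591040],
    "Stop": [4499467, 4352309, 354137, 5600392],
})

# Computed column: difference between Stop and Start.
df2["Diff"] = df2["Stop"] - df2["Start"]

# Hypothetical threshold standing in for Xnumber.
x_number = 100000

# The boolean mask keeps rows whose Diff is below the threshold;
# drop_duplicates leaves one row per distinct Diff value.
result = df2.loc[df2["Diff"] < x_number, ["Diff"]].drop_duplicates()
print(result)
```

The same filter could also be written as `df2.query("Diff < @x_number")`, which may read closer to the SQL form.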

Answer 1 (score: 0)

Got it:

df6 = pd.read_sql("select (Stop - Start) as Diff from "+table+" where <condition>",conn)
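
To try this end to end, here is a sketch against an in-memory SQLite database. The table name `ranges` and the threshold `x_number` are hypothetical, the columns are assumed to be named `Start` and `Stop`, and the difference is taken as `Stop - Start` to match the table in the question. The expression is repeated in the `WHERE` clause because many databases do not accept a column alias there, and the threshold is passed through `params` rather than concatenated into the string:

```python
import sqlite3

import pandas as pd

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
pd.DataFrame({
    "Start": [4385159, 4175786, 342426, 5591040],
    "Stop": [4499467, 4352309, 354137, 5600392],
}).to_sql("ranges", conn, index=False)  # "ranges" is a hypothetical table name

# Hypothetical threshold standing in for Xnumber.
x_number = 100000

# Compute the difference in SQL and filter on the same expression.
query = (
    "SELECT (Stop - Start) AS Diff "
    "FROM ranges "
    "WHERE (Stop - Start) < ? "
    "GROUP BY (Stop - Start)"
)
df6 = pd.read_sql(query, conn, params=(x_number,))
print(df6)
```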