Summing columns and rows with pandas

Time: 2020-05-28 16:45:36

Tags: pandas

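I'm only just getting started with Python / Pandas. I get time to learn at work, so it's a lot of starting and stopping. I've been experimenting with Pandas and making some progress, but I've run into some trouble, and maybe I'm making something harder than it needs to be. I'm currently loading 4 dataframes from SQL Server. I then run a second set of SQL statements that pull from different tables, and append the results to the original 4 dataframes.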

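So I end up with 2 rows in each dataframe. I need to combine them, so I compute a total and add that row to each dataframe as well. So now I have 3 rows. The third row is the one I actually need to print to Excel, and I need to sum across that third row.
I also need to sum columns across the dataframes (i.e., column 1 of df1, df2, df3, df4).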

I've split them into multiple dataframes based on pipe groupings.

Below is the basis of what I'm doing.

import numpy as np
import pandas as pd

# conn is assumed to be an already-open SQL Server connection
sql = """select
sum(case When diameter is Null and Material = 2 then measuredlength end) as 'CSU2',
sum(case When diameter <= 2 and Material = 2 then measuredlength end) as 'CS2_2',
sum(case When (diameter > 2 and Diameter <= 4) and Material = 2 then measuredlength end) as 'CS24_2',
sum(case When (diameter > 4 and Diameter <= 8) and Material = 2 then measuredlength end) as 'CS48_2',
sum(case When (diameter > 8 and Diameter <= 12) and Material = 2 then measuredlength end) as 'CS812_2',
sum(case When diameter > 12 and Material = 2 then measuredlength end) as 'CS12_2'
FROM [GISOwner].[MAINS]
where maintype <> 'Transmission'
and (inservicedate < '2019-12-31' or inservicedate is Null) """


df1 = pd.read_sql(sql, conn)

sql2 = """select
sum(case When cast(Nominaldiameter as Decimal) <= 2
    and upper(Material) = 'COATED STEEL' then measuredlength end) as 'CS2_2',
sum(case When cast(Nominaldiameter as Decimal) > 2 and cast(NominalDiameter as Decimal) <= 4
    and upper(Material) = 'COATED STEEL' then measuredlength end) as 'CS24_2',
sum(case When cast(Nominaldiameter as Decimal) > 4 and cast(NominalDiameter as Decimal) <= 8
    and upper(Material) = 'COATED STEEL' then measuredlength end) as 'CS48_2',
sum(case When cast(Nominaldiameter as Decimal) > 8 and cast(NominalDiameter as Decimal) <= 12
    and upper(Material) = 'COATED STEEL' then measuredlength end) as 'CS812_2',
sum(case When cast(Nominaldiameter as Decimal) > 12
    and upper(Material) = 'COATED STEEL' then measuredlength end) as 'CS12_2'
FROM [GISOwner].[ABANDONEDGASPIPE_VW]
where legacyID is NULL
and retireddate > '2019-12-31' """


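# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
df1 = pd.concat([df1, pd.read_sql(sql2, conn)], ignore_index=True)
df1 = df1.fillna(0)
df1 = df1.div(5280)

# Add a "total" row holding the sum of each numeric column
sumsCS = df1.select_dtypes(np.number).sum().rename("total")
df1.loc["total"] = sumsCS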

The same block of code is repeated 3 more times for the different material types.
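Since the block above is repeated once per material, it might be wrapped in a small helper. The sketch below is only one possible approach: `combine_lengths` and `FEET_PER_MILE` are hypothetical names, the two small frames are made-up stand-ins for the `pd.read_sql` results, and it assumes `measuredlength` is in feet (which the division by 5280 suggests). It adds the two one-row results column by column with `DataFrame.add`, so each material ends up with a single row instead of two rows plus a total.

```python
import pandas as pd

FEET_PER_MILE = 5280  # assumption: measuredlength is stored in feet


def combine_lengths(active: pd.DataFrame, abandoned: pd.DataFrame) -> pd.DataFrame:
    """Add two one-row query results column by column and convert to miles."""
    # fill_value=0 treats a value missing on one side (a NULL sum, or a column
    # that only one query returns) as 0; fillna(0) covers values missing on both.
    combined = active.add(abandoned, fill_value=0).fillna(0)
    return combined.div(FEET_PER_MILE)


# Made-up stand-ins for pd.read_sql(sql, conn) and pd.read_sql(sql2, conn)
active = pd.DataFrame({"CSU2": [5280.0], "CS2_2": [10560.0], "CS24_2": [float("nan")]})
abandoned = pd.DataFrame({"CS2_2": [5280.0], "CS24_2": [2640.0]})

df1 = combine_lengths(active, abandoned)
print(df1)
```

With real data the call would look like `combine_lengths(pd.read_sql(sql, conn), pd.read_sql(sql2, conn))`, repeated with a different pair of queries for each material.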

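I'm struggling with how to sum columns across dataframes and how to sum a specific row of a dataframe. I'd also like to know whether there's a way to add the second query's results into the same row as the first query's, rather than creating a new row that then needs another calculation to get the total.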
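As a rough illustration of the two sums described here, the sketch below uses made-up numbers and hypothetical names (`with_total`, `frames`, `size_classes`, and the `PL…` column names are not from the real data). It assumes every dataframe has a "total" row and that the columns line up by position, so column 1 of one frame corresponds to column 1 of the others.

```python
import pandas as pd


def with_total(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with an extra "total" row of column sums."""
    out = df.copy()
    out.loc["total"] = out.sum(numeric_only=True)
    return out


# Made-up stand-ins for the per-material dataframes (df1, df2, ...)
frames = {
    "coated_steel": with_total(pd.DataFrame({"CS2_2": [1.0, 2.0], "CS24_2": [3.0, 4.0]})),
    "plastic": with_total(pd.DataFrame({"PL2_2": [10.0, 20.0], "PL24_2": [30.0, 40.0]})),
}

# Sum across one specific row of one dataframe
row_sum = frames["coated_steel"].loc["total"].sum()

# Collect each frame's "total" row by position, one row per material
size_classes = ["le_2", "2_to_4"]  # hypothetical shared labels for the columns
totals = pd.DataFrame(
    {name: df.loc["total"].to_numpy() for name, df in frames.items()},
    index=size_classes,
).T

totals["all_sizes"] = totals.sum(axis=1)      # sum across each material's row
totals.loc["all_materials"] = totals.sum()    # sum each column across materials
print(totals)
```

Gathering the "total" rows into one summary frame keeps both sums in a single table, which could then be written out with `totals.to_excel(...)`.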

I hope this makes sense.

Any guidance is appreciated.

Mike

0 answers:

No answers