Calculating new columns from existing ones

Time: 2019-10-28 16:58:20

Tags: python excel pandas calculation

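Depending on the company, I have Excel files with 30-35 columns and 50-1500 rows of data. The columns in question are USED, REMAIN and REFUND. All three are calculated from other columns.

USED is the running total of GAL over the rows, so the Excel calculation goes like this: it starts with =W2, then the next row is W2 + W3, then W3 + W4, and so on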

REMAIN is ASSIGNED - USED

REFUND is GAL * CREDIT

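Is something like this possible? Currently I do all the calculations in Excel, which is time-consuming, and after some research I figured that writing some code to automate it would be easier. Any help is appreciated, even if it is just the calculation for one column.

I have been looking online for ideas and think pandas is the best option, but I am open to other suggestions.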

import pandas as pd
filename = "home/itdept/Documents/BestWines.xlsx"
df = pd.read_excel(filename)
df['Refund'] = df['QUANTITY IN GAL']*df['CBMA Credit']
df.head(5)
df.to_excel("path to save")

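This is what I came up with for the first column, REFUND. I am not sure how, or whether, the other columns can be worked into the code as well.

2 Answers:

Answer 0: (score: 1)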

# Import the package used in this code
import pandas as pd


# Load the Excel content into the DataFrame df
# (the keyword is sheet_name; the older spelling sheetname is no longer accepted)
df = pd.read_excel('sflowone.xlsx', sheet_name='Sheet1')

# Use a list to build the updated USED column
newlist = []        # start with an empty list

x = 0               # running total of the "GAL" values seen so far
for value in df["GAL"]:  # visit every value in "GAL" in row order
    x = value + x        # add the current value to all earlier entries
    newlist.append(x)    # store the running total for this row
df.drop("USED", axis=1, inplace=True, errors="ignore")  # delete the "USED" column, if it exists, before updating
df.insert(3, "USED", newlist)   # insert the updated "USED" column at position 3


# Update REMAIN and REFUND
df["REMAIN"] = df["ASSIGNED"] - df["USED"]
df["REFUND"] = df["GAL"] * df["CREDIT"]

# Show the first 5 entries
df.head(5)
# Save to an Excel sheet
df.to_excel("sflowfinal.xlsx")


# Code is tested and running; please reply with any questions

Answer 1: (score: 1)

Consider Series.cumsum for the cumulative sum:

df['USED'] = df['GAL'].cumsum()
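
As a quick sanity check, here is a minimal sketch on made-up data (the four GAL values are assumptions for illustration, not taken from the asker's workbook). It shows that cumsum gives a running total, where each row holds the sum of GAL up to and including that row:

```python
import pandas as pd

# Hypothetical sample data; the real workbook has many more columns.
df = pd.DataFrame({"GAL": [10, 20, 5, 15]})

# Running total: each row is the sum of GAL up to and including that row.
df["USED"] = df["GAL"].cumsum()

print(df)
```

This replaces the explicit loop with a single vectorized call and keeps the row order of the sheet.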

From there, any basic arithmetic operation (such as subtraction and multiplication) can be run directly on the columns:

# SUBTRACTION
df['REMAIN'] = df['ASSIGNED'] - df['USED']

# MULTIPLICATION
df['REFUND'] = df['QUANTITY IN GAL'] * df['CBMA Credit']

Or use their functional forms, sub and mul (along with other similar operators):

# SUBTRACTION
df['REMAIN'] = df['ASSIGNED'].sub(df['USED'])

# MULTIPLICATION
df['REFUND'] = df['QUANTITY IN GAL'].mul(df['CBMA Credit'])
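
One practical difference is that the method forms accept a fill_value argument, which may help if the sheet has blank cells. A minimal sketch, assuming a blank credit cell should be treated as zero (that rule and the sample values are assumptions for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one blank credit cell (read in as NaN).
df = pd.DataFrame({
    "QUANTITY IN GAL": [10.0, 20.0, 5.0],
    "CBMA Credit": [0.5, np.nan, 0.25],
})

# Operator form: a missing value on either side gives NaN in the result.
refund_operator = df["QUANTITY IN GAL"] * df["CBMA Credit"]

# Method form: fill_value stands in for a value missing on one side only.
refund_method = df["QUANTITY IN GAL"].mul(df["CBMA Credit"], fill_value=0)

print(refund_operator.tolist())
print(refund_method.tolist())
```

If blanks should stay blank in the output, the plain operator form is the simpler choice.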

Altogether, consider assign for a single concise statement:

import pandas as pd

filename = "home/itdept/Documents/BestWines.xlsx"
df = (pd.read_excel(filename)
        .assign(USED = lambda x: x['GAL'].cumsum(),
                REMAIN = lambda x: x['ASSIGNED'].sub(x['USED']),
                REFUND = lambda x: x['QUANTITY IN GAL'].mul(x['CBMA Credit'])
               )
     )

df.head(5)
df.to_excel("path to save")
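
Note that REMAIN depends on USED, so the order of the keyword arguments to assign matters: in recent pandas versions, later lambdas can see columns created by earlier ones in the same call. Since the files differ by company, it may also help to wrap the logic in a function. The sketch below uses a hypothetical helper name, add_usage_columns, made-up sample values, and the shorter column names from the question's prose (GAL, ASSIGNED, CREDIT), so adjust the names to match the real headers:

```python
import pandas as pd


def add_usage_columns(df):
    """Return a copy of df with USED, REMAIN and REFUND added (hypothetical helper)."""
    return df.assign(
        USED=lambda x: x["GAL"].cumsum(),            # running total of GAL
        REMAIN=lambda x: x["ASSIGNED"] - x["USED"],  # uses USED created just above
        REFUND=lambda x: x["GAL"] * x["CREDIT"],
    )


# Made-up sample data standing in for one company's sheet.
sample = pd.DataFrame({
    "GAL": [10, 20, 5],
    "ASSIGNED": [100, 100, 100],
    "CREDIT": [0.5, 0.5, 0.25],
})

result = add_usage_columns(sample)
print(result)
```

The same function could then be applied to each DataFrame returned by pd.read_excel before saving with to_excel.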