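Depending on the company, I have Excel files with 30-35 columns and 50-1500 rows of data. The columns in question are USED, REMAIN, and REFUND. All three are calculated from other columns.
USED is the sum of GAL across the rows, so the Excel calculation goes: start with =W2, then the next row is W2 + W3, then W3 + W4, and so on.
REMAIN is ASSIGNED - USED
REFUND is GAL * CREDIT
Is something like this possible? At the moment I do all the calculations in Excel, which is time-consuming. After some research I concluded that writing some code to automate it would be easier. Any help is appreciated, even if it only covers one column.
I have been looking online for ideas and think pandas is the best option, but I am open to other suggestions.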
import pandas as pd
filename = "home/itdept/Documents/BestWines.xlsx"  # the path must be a quoted string
df = pd.read_excel(filename)
df['Refund'] = df['QUANTITY IN GAL']*df['CBMA Credit']
df.head(5)
df.to_excel("path to save")
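This is what I came up with for the first column, Refund. I am not sure how, or whether, the other columns can be incorporated into the code as well.
Answer 0 (score: 1)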
"""importing packages to be used in our code"""
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
"""importing excel content to df DataFrame"""
df = pd.read_excel('sflowone.xlsx', sheet_name='Sheet1')  # the keyword is sheet_name in current pandas
""" we will use a LIST for updating the (USED) column"""
newlist = []  # empty list that will hold the running totals
x = 0  # running total of "GAL"
for value in df["GAL"]:  # loop over every value in "GAL"
    x = value + x  # add the current value to all earlier entries of "GAL"
    newlist.append(x)  # append the new running total to the list
df.drop("USED", axis=1, inplace=True, errors="ignore")  # delete the "USED" column if it already exists
df.insert(3, "USED", newlist)  # insert the updated "USED" column at position 3
"""Updating (REMAIN) and (REFUND)"""
df["REMAIN"] = df["ASSIGNED"] - df["USED"]
df["REFUND"] = df["GAL"] * df["CREDIT"]
""" Visualising first 5 entries"""
df.head(5)
""" saving to Excel sheet """
df.to_excel("sflowfinal.xlsx")
"""CODE IS TESTED AND RUNNING, for query please reply"""
Answer 1 (score: 1)
Consider Series.cumsum for the running total:
df['USED'] = df['GAL'].cumsum()
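As a quick illustration of what cumsum does, here is a minimal sketch on made-up numbers (the GAL values below are invented for the example, not taken from the spreadsheet):

```python
import pandas as pd

# Toy data; these GAL values are invented purely for illustration.
df = pd.DataFrame({"GAL": [10, 20, 5, 15]})

# Each row of USED is the sum of GAL up to and including that row.
df["USED"] = df["GAL"].cumsum()

print(df["USED"].tolist())  # [10, 30, 35, 50]
```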
From there, any basic arithmetic operation (such as subtraction and multiplication) can be run directly on the columns:
# SUBTRACTION
df['REMAIN'] = df['ASSIGNED'] - df['USED']
# MULTIPLICATION
df['REFUND'] = df['QUANTITY IN GAL'] * df['CBMA Credit']
# SUBTRACTION (equivalent method form)
df['REMAIN'] = df['ASSIGNED'].sub(df['USED'])
# MULTIPLICATION (equivalent method form)
df['REFUND'] = df['QUANTITY IN GAL'].mul(df['CBMA Credit'])
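One practical difference between the two forms is that the methods accept extra arguments such as fill_value. The sketch below assumes that some credit cells might be blank and that treating a blank as 0 is acceptable; both are assumptions for illustration, and the column names GAL and CREDIT follow the question's description rather than the real sheet headers.

```python
import numpy as np
import pandas as pd

# Toy data with one missing credit; values are invented for illustration.
df = pd.DataFrame({
    "GAL": [10.0, 20.0, 5.0],
    "CREDIT": [0.5, np.nan, 1.0],
})

# Operator form: a missing credit propagates as NaN.
op_form = df["GAL"] * df["CREDIT"]

# Method form: fill_value=0 substitutes 0 for the missing credit first.
method_form = df["GAL"].mul(df["CREDIT"], fill_value=0)

print(op_form.isna().tolist())  # [False, True, False]
print(method_form.tolist())     # [5.0, 0.0, 5.0]
```

If blanks should stay blank in the output, the plain operator form is the simpler choice.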
Altogether, consider assign to do everything in one concise statement:
import pandas as pd
filename = "home/itdept/Documents/BestWines.xlsx"
df = (pd.read_excel(filename)
        .assign(USED = lambda x: x['GAL'].cumsum(),
                REMAIN = lambda x: x['ASSIGNED'].sub(x['USED']),
                REFUND = lambda x: x['QUANTITY IN GAL'].mul(x['CBMA Credit'])
               )
     )
df.head(5)
df.to_excel("path to save")
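Since the question mentions a separate file per company, the same assign chain could be wrapped in a small function and reused. This is only a sketch: the helper name add_computed_columns and its default column names (GAL, ASSIGNED, CREDIT) are hypothetical and would need to match the real headers in each workbook.

```python
import pandas as pd


def add_computed_columns(df, gal="GAL", assigned="ASSIGNED", credit="CREDIT"):
    """Return a copy of df with USED, REMAIN and REFUND (re)computed.

    The column-name defaults are assumptions; pass the real headers if they differ.
    """
    return df.assign(
        USED=lambda x: x[gal].cumsum(),            # running total of gallons
        REMAIN=lambda x: x[assigned] - x["USED"],  # uses the USED column set just above
        REFUND=lambda x: x[gal] * x[credit],
    )


# Toy data standing in for one company's sheet; values are invented.
sample = pd.DataFrame({
    "GAL": [10, 20, 5],
    "ASSIGNED": [100, 100, 100],
    "CREDIT": [0.5, 0.5, 1.0],
})
result = add_computed_columns(sample)
print(result)
```

For a real workbook the call would presumably look like add_computed_columns(pd.read_excel(filename)), followed by to_excel(..., index=False) if the extra index column is not wanted in the output.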