Some calculations on Excel files with Python code

Time: 2018-08-05 19:34:41

Tags: python excel pandas data-science

Today I was trying to solve a problem for my thesis project, but I got stuck. I need your help to solve it, if possible.

import pandas as pd
import numpy as np

excel_file = 'Test.xls'
patient = pd.read_excel(excel_file)

# global max/min over the numeric columns only (note the keyword: include=[...])
max_all = patient.select_dtypes(include=[np.number]).max().max()

min_all = patient.select_dtypes(include=[np.number]).min().min()

final = max_all - min_all

value = 0.9/final

# Apply the scaling factor to numeric cells only and leave alphanumeric cells unchanged.
# The alphanumeric cells in my Excel file are the row names that describe each row,
# so they have to stay exactly as they are.
def f(x):
    return round(x * value,5)  if type(x) in [int,float] else x
e=patient.applymap(f)

e.to_excel("Test2.xls")

excel_file2 = 'Training.xls' 
patient2 = pd.read_excel(excel_file2)

# same global max/min computation for the training file
max_all2 = patient2.select_dtypes(include=[np.number]).max().max()

min_all2 = patient2.select_dtypes(include=[np.number]).min().min()

final2 = max_all2 - min_all2

value2 = 0.9/final2

def f(x):
    return round(x * value2,5)  if type(x) in [int,float] else x
e1 = patient2.applymap(f)


e1.to_excel("Training02.xls")

df1 = pd.read_excel('Test2.xls') # get output excel files as input
df2 = pd.read_excel('Training02.xls') 

info = df2.shape
totalRow = info[0]

print(totalRow)

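m = []
for r in range(totalRow):
    lst = df2.iloc[r, :]  # row r of the scaled training data
    x = df1.iloc[r, 1]    # row r, column 1 of the scaled test data

    l = []
    for cnt, i in enumerate(lst):
        if cnt == 0:
            l.append(i)   # keep the first cell of the row unchanged
        elif (i - x) == 0:
            l.append(1)   # exact match: condition for my program to work on excel files
        elif abs(i - x) > 0.2:
            l.append(0)   # differs by more than 0.2
        else:
            l.append(i)   # within 0.2: keep the value
    m.append(l)

# building the frame from plain lists drops the original row and column labels
df3 = pd.DataFrame(m)
df3.to_excel("output.xls")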

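So what this code does is take two Excel files as input and compare the outputs derived from those input files according to a condition. Here is the input Excel data, which contains about 5K rows and 100 columns: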

[image: input1]

[image: input2]

[image: output1 or new input2 for comparison]

[image: output2 or new input1 for comparison]

[image: Output after comparing]
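For reference, the scaling and comparison steps in the code above can be sketched without the cell-by-cell loops. This is only a minimal sketch under assumptions that the question does not confirm: the labels already sit in the index rather than in a data column, and there is one numeric reference value per row. The names `scale_numeric`, `compare_to_reference`, `span` and `tol` are hypothetical and not part of the original code.

```python
import numpy as np
import pandas as pd


def scale_numeric(df: pd.DataFrame, span: float = 0.9) -> pd.DataFrame:
    """Multiply numeric columns by span / (global max - global min).

    Non-numeric columns (for example row names) are left untouched.
    """
    numeric = df.select_dtypes(include=[np.number])
    factor = span / (numeric.max().max() - numeric.min().min())
    out = df.copy()
    out[numeric.columns] = (numeric * factor).round(5)
    return out


def compare_to_reference(training: pd.DataFrame,
                         reference: pd.Series,
                         tol: float = 0.2) -> pd.DataFrame:
    """Apply the rule from the question to every cell of a numeric frame.

    Assumption: `training` is fully numeric and `reference` holds one
    value per row of `training`, in the same row order.
    """
    values = training.to_numpy(dtype=float)
    ref = reference.to_numpy(dtype=float)[:, None]  # one reference per row
    diff = values - ref
    result = np.where(
        diff == 0, 1.0,                               # exact match -> 1
        np.where(np.abs(diff) > tol, 0.0, values),    # too far -> 0, else keep
    )
    # Reusing index and columns keeps the labels in the result.
    return pd.DataFrame(result, index=training.index, columns=training.columns)
```

Because the result is rebuilt with the index and columns of `training`, whatever labels that frame carries survive the comparison, which a plain list of lists does not do.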

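Now the problem is that in our original input data the usernames are in GSMXXXXX format, but we exclude them when comparing. I want them to appear in the output, and I also don't want the index values to show up in the first row. So when I transpose the result, I would see the usernames only in the rows and not in the columns, and the features in the columns instead of the rows.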
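One possible way to get that layout is to treat the names as labels (index and column headers) instead of data, so they travel through the processing and the transpose. The sketch below uses a small made-up frame in place of the real sheet; the column name `feature`, the `GSM0001`/`GSM0002` identifiers and the values are all hypothetical, and the Excel calls are shown only as comments.

```python
import pandas as pd

# Hypothetical stand-in for the real sheet: the first column holds the
# feature names, the remaining column headers are the GSM identifiers.
raw = pd.DataFrame({
    "feature": ["f1", "f2", "f3"],
    "GSM0001": [0.10, 0.20, 0.30],
    "GSM0002": [0.15, 0.60, 0.30],
})

# Move the feature names into the index so they are labels, not data.
# With a real file the equivalent would be pd.read_excel(path, index_col=0).
table = raw.set_index("feature")

# Element-wise processing that returns a DataFrame keeps both label sets.
processed = table.round(2)

# Transpose: GSM identifiers become the rows, features become the columns.
by_patient = processed.T
by_patient.index.name = "patient"
by_patient.columns.name = None

# Writing with the default index=True would store the GSM identifiers as the
# first column; there is no separate 0..n-1 counter because the index now
# holds the labels themselves.
# by_patient.to_excel("output.xlsx")
```

When such a file is read back, passing `index_col=0` to `pd.read_excel` restores the first column as the index instead of producing an extra unnamed column.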

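Here is the data file on Google Drive. Because of the size of the actual file, I reduced it to a minimal number of rows and columns; the real data has roughly 5k rows of features and 55 columns of patients.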

0 answers:

No answers