Today I was trying to solve a problem for my thesis project, but I got stuck. If possible, I need your help to solve it.
import pandas as pd
import numpy as np

excel_file = 'Test.xls'
patient = pd.read_excel(excel_file)
# Global max and min over all numeric columns
max_all = patient.select_dtypes(include=[np.number]).max().max()
min_all = patient.select_dtypes(include=[np.number]).min().min()
final = max_all - min_all
value = 0.9 / final

# Apply the scaling only to numeric cells. The alphanumeric cells are the row
# names that describe the rows, so they must stay as they are.
def f(x):
    return round(x * value, 5) if type(x) in [int, float] else x

e = patient.applymap(f)
e.to_excel("Test2.xls")

# Same scaling for the second Excel file
excel_file2 = 'Training.xls'
patient2 = pd.read_excel(excel_file2)
max_all2 = patient2.select_dtypes(include=[np.number]).max().max()
min_all2 = patient2.select_dtypes(include=[np.number]).min().min()
final2 = max_all2 - min_all2
value2 = 0.9 / final2

def f(x):
    return round(x * value2, 5) if type(x) in [int, float] else x

e1 = patient2.applymap(f)
e1.to_excel("Training02.xls")

df1 = pd.read_excel('Test2.xls')  # read the output Excel files back in as input
df2 = pd.read_excel('Training02.xls')
info = df2.shape
totalRow = info[0]
print(totalRow)
m = []
tmp = 0
for r in range(0, totalRow):
    lst = df2.iloc[r, :]
    x = df1.iloc[r, 1]
    cnt = 0
    l = []
    for i in lst:
        if cnt == 0:
            # keep the first cell of the row unchanged
            l.append(i)
            cnt = 1
            continue
        elif (i - x) == 0:
            l.append(1)  # condition for my program to work on Excel files
        elif abs(i - x) > 0.2:
            l.append(0)
        else:
            l.append(i)
    m.append(l)
df3 = pd.DataFrame(m)
go = df3.to_excel("output.xls")
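So what this code does is take two Excel files as input and compare the outputs produced from those input files according to a condition. This is the input Excel data, with about 5K rows and 100 columns: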
input1
input2
output1, or the new input2 used for the comparison
output2, or the new input1 used for the comparison
Output after the comparison
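The per-cell rule inside the comparison loop can be isolated as a small function, which makes it easier to check on a few values by hand. This is only a sketch: the names `compare_cell` and `compare_row` and the `tol` parameter are made up for illustration (the script above hard-codes the 0.2 threshold).

```python
def compare_cell(i, x, tol=0.2):
    """Apply the rule from the loop to one scaled value i and reference x."""
    diff = i - x
    if diff == 0:
        return 1          # identical to the reference
    if abs(diff) > tol:
        return 0          # too far from the reference
    return i              # close enough: keep the value as it is


def compare_row(row, x, tol=0.2):
    """Keep the first cell of the row unchanged and compare the rest to x."""
    first, *values = row
    return [first] + [compare_cell(v, x, tol) for v in values]


print(compare_row(["feature_1", 0.5, 0.9, 0.6], 0.5))
```

Here the first cell is passed through untouched, the same way the `cnt == 0` branch does it in the loop.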
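Now the problem: in our raw input data the usernames are in the format GSMXXXXX, but we exclude them when comparing. I want them to appear in the output, and I also don't want the index values to show up in the first row. That way, when I transpose the result, I would see only the usernames in the rows rather than in the columns, and the features in the columns rather than in the rows.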
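One possible direction, sketched under assumptions rather than tested on the real files: if the feature names are used as the DataFrame index and the GSM IDs as the column labels, the labels survive the comparison and the transpose, and no numeric 0..N index is created. The function name `compare_keep_labels`, the argument names, and the idea of passing the reference values as a Series aligned on the feature names are all assumptions for illustration.

```python
import pandas as pd


def compare_keep_labels(train, ref, tol=0.2):
    """train: rows = features, columns = patient IDs (e.g. GSM...).
    ref: one reference value per feature, indexed by the same feature names.
    Returns the compared table transposed: patients in rows, features in columns."""
    diff = train.sub(ref, axis=0)            # row-wise difference to the reference
    out = train.mask(diff.abs() > tol, 0)    # too far from the reference -> 0
    out = out.mask(diff == 0, 1)             # identical to the reference -> 1
    return out.T


# Tiny made-up example with two features and two patients
train = pd.DataFrame({"GSM1": [0.5, 0.1], "GSM2": [0.9, 0.25]}, index=["f1", "f2"])
ref = pd.Series([0.5, 0.2], index=["f1", "f2"])
result = compare_keep_labels(train, ref)
print(result)
```

With the real files, `pd.read_excel(path, index_col=0)` would load the first column as the index instead of as data, and writing `result` with `to_excel` would then put the GSM IDs in the first column and the feature names in the header row.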
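Here are the data files on Google Drive. Because of the size of the actual files, I cut the total number of rows and columns down to a minimum; the real data has about 5k rows of features and 55 columns of patients.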