I have two dataframes. The Excel file I use has three sheets: SQL / Bio / Gen. In Bio and Gen, the IDs (Labornumber) 191****** and 192****** are two different types. Both frames have the layout from the SQL sheet.
SQL:
OrderID Creation Date User ID Days in Lab Gender Sample Date ID of Sample Card System Sample ID OrderStatus Sample Received Sample Rejected Genetics Closed Enzyme testing only Closed Enzyme + genetic testing Closed genetic testing only Closed biomarker testing only Closed Enzyme / Lyso pending Closed Enzyme + Lyso testing Department City Country Signed ICF ICF Agreement Pompe Status biochemical test Result biochemical test Status genetic test Result genetic test Gaucher Status biochemical test Result biochemical test Status genetic test Result genetic test Niemann-Pick Status biochemical test Result biochemical test Status genetic test Result genetic test Fabry Is a family mutation known Status biochemical test Result biochemical test Status biomarker test Result biomarker test Status genetic test Result genetic test MPS I Status biochemical test Result biochemical test Status genetic test Result genetic test MPS II Status biochemical test Result biochemical test MPS VI Status biochemical test Result biochemical test Lyso-GL-3: alpha-galactosidase Status biochemical test Result biochemical test
329819 2017-01-02 954569 4 Gender 2017-12-05 11112345 329819 REPORT_AVAILABLE 2017-01-09 NO 2017-01-23 2017-01-23 2017-01-23 2017-01-23 2017-01-23 2017-01-23 2017-01-23 Department City Country Y disease diagnostic 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Bio:
Barcode Eingangsdatum Befunddatum Biochemie Befunddatum Biochemie2 Befunddatum Lyso-GL-1 Biochemie Ergebnis Biochemie Ergebnis2 Ergebnis Lyso-GL-1 Biochemistry report Diagnosis Diagnosis_2 Labornummer Age Sex Country
91429483 2017-12-05 00:00:00 09.01.2017 3,191109194 normal MPS Panel 192000200 14 Sex Country
Gen:
Date of Genetic Report Barcode Date for starting genetic testing klin Diagnose klin Diagnose2 Alter Sex Labornummer Country Biochemie Ergebnis Biochemie Ergebnis2 Lyso-GL-1 Gene-ID Mutation AS 1 Mutation AS 2 Mutation Codon1 Mutation Codon2 Mutationsart Mutationsart 1 Mutationsart 2 Family
2017-01-25 00:00:00 51425126 2017-01-11 00:00:00 Fabry Lyso-GL3 30 Sex 192000250 Country 2,432291667 3,794916415 BAC normal
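data2 should contain only the 192 numbers, and data1 should have the 191 numbers. For data2, all 192 numbers are in the SQL sheet. In Gen there are both 191 and 192 numbers, so I filter it with:

    import re

    # Matches (with an empty match) any string that does NOT start with "191"
    regex = re.compile(r"(?!191\d*)")
    for index, row in self.data1.iterrows():
        if re.match(regex, str(row["Labornummer"])):
            # Drop every row whose Labornummer does not start with 191
            self.data1.drop(index, inplace=True)
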
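For comparison, the same filter can be sketched without a row loop. This is only a minimal illustration on made-up data: the frame and its values are hypothetical, and only the column name Labornummer comes from the post. It also shows that filtered rows keep their original index labels, so the index is no longer a contiguous range afterwards.

```python
import pandas as pd

# Hypothetical sample data; only the column name "Labornummer" is from the post.
data1 = pd.DataFrame({"Labornummer": [191000001, 192000200, 191000002, 192000250]})

# Boolean mask: True where the number, read as text, starts with "191".
mask = data1["Labornummer"].astype(str).str.startswith("191")
filtered = data1[mask]

# The surviving rows keep their original labels, leaving gaps in the index.
remaining_labels = filtered.index.tolist()
print(remaining_labels)
```

If contiguous labels are wanted after filtering, `filtered.reset_index(drop=True)` renumbers the rows from 0.
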
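This works fine, and I have all the 191 numbers in my data1 dataframe. After that, I merge on Labnumber to combine data1 with all the information I got from Gen and Bio. I do the same for data2, but I merge it on Barcode / System Sample ID.
That works, and each frame has all the data whose cells I need to change.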
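
A rough sketch of what such a merge looks like for data2, assuming the SQL sheet's System Sample ID is matched against the Barcode column of Bio. The frames and values below are hypothetical stand-ins, not the real sheets. One thing the sketch makes visible: a column-on-column merge ignores the input indexes and returns a frame with a fresh 0..n-1 index.

```python
import pandas as pd

# Hypothetical stand-ins for the SQL and Bio sheets (values are made up).
sql = pd.DataFrame(
    {"System Sample ID": [329819, 329820, 329821],
     "OrderStatus": ["REPORT_AVAILABLE"] * 3},
    index=[10, 20, 30],  # deliberately non-default labels
)
bio = pd.DataFrame(
    {"Barcode": [329819, 329821],
     "Diagnosis": ["MPS Panel", "Fabry"]}
)

# Merge on differently named key columns; keep only rows present in both.
merged = sql.merge(bio, left_on="System Sample ID", right_on="Barcode", how="inner")

# The labels 10/20/30 are gone: the result carries a new default index.
print(merged.index.tolist())
```

So index labels collected before a merge do not necessarily refer to the same rows (or to any row) in the merged frame.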
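data1, the frame with the problem, has 7387 rows and 96 columns. I iterate over every row and save the indexes I want to change later. This works fine, but when I use the indexes saved in the list to change the value in a specific cell, I get this error:

    KeyError: 6817

The function that saves the indexes I want to change later:

    for index, row in self.data1.iterrows():
        if row["Diagnosis"] == "Disease":
            if self.getCutOffResult(row["Result"], row["Diagnosis"]):
                b_Disease_pos_1.append(index)

Both functions are used for data1 and data2, but only data1 fails; data2 works without problems. From my point of view data1 should not fail either, because the index exists: I can see it both in the dataframe and in the Excel file I exported with pandas. I iterate over every row, so it must exist.

The function that changes the value in a specific cell: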

    for index in container:
        if self.isSecondRequest(index):
            self.data1.at[index, row] = 2
        else:
            self.data1.at[index, row] = 1
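
To narrow this down, here is a small sketch of one way a KeyError like this can arise. It is not necessarily the cause here: it assumes the frame changes (rows dropped, or the frame replaced) between the moment the labels are saved and the moment they are used. Reading a label that is no longer in the index with `.at` raises KeyError, and `label in df.index` is a cheap check to run first. The frame and the Flag column are hypothetical.

```python
import pandas as pd

# Hypothetical frame; "Flag" is an invented column for illustration.
df = pd.DataFrame({"Diagnosis": ["Disease", "normal", "Disease"],
                   "Flag": [0, 0, 0]})

# Save the labels of interesting rows, as in the loop above.
saved = [idx for idx, row in df.iterrows() if row["Diagnosis"] == "Disease"]

# Assumption for this sketch: a row disappears after the labels were saved.
df = df.drop(2)

# Labels that were saved but no longer exist in the frame.
missing = [idx for idx in saved if idx not in df.index]

# Reading a missing label raises KeyError.
try:
    df.at[2, "Flag"]
    raised = False
except KeyError:
    raised = True

print(saved, missing, raised)
```

Running the same membership check against self.data1.index, and against whatever frame isSecondRequest reads from, would show whether 6817 is really present at the point where the error is raised.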