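Pandas DataFrame df.at KeyError

Time: 2019-06-12 10:12:15

Tags: python pandas

I have two data frames. The Excel file I use has three sheets: SQL / Bio / Gen. In Bio and Gen, the IDs (Labornumber) 191****** and 192****** are two different types. Both frames have the layout of the SQL sheet.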

SQL:

OrderID     Creation Date   User ID Days in Lab Gender  Sample Date ID of Sample Card   System Sample ID    OrderStatus         Sample Received Sample Rejected Genetics    Closed Enzyme testing only  Closed Enzyme + genetic testing Closed genetic testing only Closed biomarker testing only   Closed Enzyme / Lyso pending    Closed Enzyme + Lyso testing    Department  City    Country     Signed ICF  ICF Agreement       Pompe   Status biochemical test Result biochemical test Status genetic test Result genetic test Gaucher Status biochemical test Result biochemical test Status genetic test Result genetic test Niemann-Pick    Status biochemical test Result biochemical test Status genetic test Result genetic test Fabry   Is a family mutation known  Status biochemical test Result biochemical test Status biomarker test   Result biomarker test   Status genetic test Result genetic test MPS I   Status biochemical test Result biochemical test Status genetic test Result genetic test MPS II  Status biochemical test Result biochemical test MPS VI  Status biochemical test Result biochemical test Lyso-GL-3: alpha-galactosidase  Status biochemical test Result biochemical test
329819      2017-01-02      954569  4           Gender  2017-12-05  11112345            329819              REPORT_AVAILABLE    2017-01-09      NO              2017-01-23  2017-01-23                  2017-01-23                      2017-01-23                  2017-01-23                      2017-01-23                      2017-01-23                      Department  City    Country     Y           disease diagnostic  0       0                       0                       0                   0                   0       0                       0                       0                   0                   0               0                       0                       0                   0                   0       0                           0                       0                       0                       0                       0                   0                   0       0                       0                       0                   0                   0       0                       0                       0       0                       0                       0                               0                       0

Bio:

Barcode     Eingangsdatum       Befunddatum Biochemie   Befunddatum Biochemie2  Befunddatum Lyso-GL-1   Biochemie Ergebnis  Biochemie Ergebnis2 Ergebnis Lyso-GL-1  Biochemistry report Diagnosis   Diagnosis_2 Labornummer Age     Sex Country
91429483    2017-12-05 00:00:00 09.01.2017                                                              3,191109194                                                 normal              MPS Panel               192000200   14      Sex Country

Gen:

Date of Genetic Report  Barcode     Date for starting genetic testing   klin Diagnose   klin Diagnose2  Alter   Sex Labornummer Country     Biochemie Ergebnis  Biochemie Ergebnis2 Lyso-GL-1   Gene-ID Mutation AS 1   Mutation AS 2   Mutation Codon1 Mutation Codon2 Mutationsart    Mutationsart 1  Mutationsart 2  Family
2017-01-25 00:00:00     51425126    2017-01-11 00:00:00                 Fabry           Lyso-GL3        30      Sex 192000250   Country     2,432291667         3,794916415                     BAC                                                                     normal

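data2 should contain only the 192 numbers, and data1 should contain the 191 numbers. For data2, all 192 numbers are in the SQL sheet. Gen contains both 191 and 192 numbers, so I filter it with: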
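import re

# Negative lookahead: matches only when the value does NOT start with 191.
regex = re.compile(r"(?!191\d*)")

for index, row in self.data1.iterrows():
    # Drop every row whose Labornummer is not a 191 number.
    if re.match(regex, str(row["Labornummer"])):
        self.data1.drop(index, inplace=True)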
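The same filter can also be written without a loop. The following is only a minimal sketch: the small frame is a made-up stand-in for self.data1, and it assumes Labornummer can be converted to strings. Like the loop, boolean filtering keeps the original index labels, so the filtered frame has gaps in its index.

```python
import pandas as pd

# Hypothetical stand-in for self.data1; only the Labornummer column matters here.
data1 = pd.DataFrame({"Labornummer": [191000100, 192000200, 191000300, 192000250]})

# Keep only the rows whose Labornummer starts with 191.
mask = data1["Labornummer"].astype(str).str.startswith("191")
data1 = data1[mask]

print(data1.index.tolist())  # [0, 2]: the surviving labels keep their old values
```

Calling reset_index(drop=True) afterwards would renumber the rows from 0, if contiguous labels are wanted.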

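This works fine, and my data1 data frame contains all the 191 numbers. After that, I merge on Labnumber with data1 and all the information I get from Gen and Bio.

The same goes for data2, but I merge it on Barcode / System Sample ID.

That works, and each frame has all the data whose cells I need to change.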
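For reference, a minimal sketch of a column-based merge. It is an assumption that the merges described above use DataFrame.merge with on=...; the frames and most column names below are placeholders. The point it illustrates is that a merge on columns ignores the index labels of both inputs and returns a frame with a fresh 0..n-1 index, so labels saved before a merge do not necessarily exist afterwards.

```python
import pandas as pd

# Hypothetical frames; the index labels are chosen to make the effect visible.
sql = pd.DataFrame(
    {"Labnumber": [191000100, 191000300], "OrderStatus": ["A", "B"]},
    index=[5, 9],
)
gen = pd.DataFrame(
    {"Labnumber": [191000300, 191000100], "Gene-ID": ["BAC", "XYZ"]},
    index=[6817, 42],
)

# Merging on a column discards both input indexes.
merged = sql.merge(gen, on="Labnumber", how="left")

print(merged.index.tolist())  # [0, 1]: the labels 5, 9, 6817 and 42 are gone
```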

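The problematic data1 has 7387 rows and 96 columns. I iterate over every row and save the indexes I want to change later. That works fine, but when I use the indexes saved in the list to change the value in a specific cell, I get the error KeyError: 6817

The function that saves the indexes I want to change later:

for index, row in self.data1.iterrows():
    if row["Diagnosis"] == "Disease":
        if self.getCutOffResult(row["Result"], row["Diagnosis"]):
            # Remember the index label of every matching row.
            b_Disease_pos_1.append(index)

The function that changes the values in specific cells:

for index in container:
    if self.isSecondRequest(index):
        # .at takes (row label, column label); `row` is used as the column label here.
        self.data1.at[index, row] = 2
    else:
        self.data1.at[index, row] = 1

Both functions are applied to data1 and data2, but the error occurs only with data1; data2 works without problems. From my point of view data1 should work as well, because the index exists: I can see it both in the data frame and in the Excel file I exported with pandas. I iterate over every row, so it must exist.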
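One way to narrow down a KeyError like this, as a hedged sketch: check whether every saved label is still present in the frame's index before writing. The frame, the saved list and the column name below are hypothetical. The question does not show where the container is filled or what isSecondRequest looks up, so this is a diagnostic aid, not a confirmed cause.

```python
import pandas as pd

# Hypothetical frame whose index has gaps, as it would after dropping rows.
data1 = pd.DataFrame({"Result": [1.0, 3.5, 7.0]}, index=[0, 2, 5])
saved = [2, 4]  # hypothetical saved labels; 4 is not in the index

# Labels that are absent from the index and therefore cannot be looked up.
missing = [label for label in saved if label not in data1.index]
print(missing)  # [4]

# Write only to labels that exist; "Result" stands in for the target column.
for label in saved:
    if label in data1.index:
        data1.at[label, "Result"] = 2
```

If the missing list is not empty, the labels were probably collected from a different frame or before an operation that changed the index.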

0 answers:

No answers