What is causing this KeyError on a simple function call?

Time: 2018-07-19 04:15:24

Tags: python pandas function numpy dataframe

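The error below is raised by the code that follows it. I have read that KeyError: 0 is due to a dictionary file lacking an entry, but I still don't really understand what a dictionary file is or how my code accesses one: I am only trying to read data from a DataFrame. Apparently the problem is that VolValues, a subset of the DataFrame, uses an index that starts at 23000, while I am trying to access it with the index 0 because I thought that was Python's "first element" syntax.

Can you tell me what is wrong with the code and how to fix it?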

runfile('/Users/daniel/Documents/programming/RectumD2Metrics.py', wdir='/Users/daniel/Documents/programming')
Traceback (most recent call last):

  File "<ipython-input-2-d170dca123d7>", line 1, in <module>
    runfile('/Users/daniel/Documents/programming/RectumD2Metrics.py', wdir='/Users/daniel/Documents/programming')

  File "/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "/anaconda3/lib/python3.6/site-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/Users/daniel/Documents/programming/RectumD2Metrics.py", line 37, in <module>
    D2Planned = interpD2('planned',df)

  File "/Users/daniel/Documents/programming/RectumD2Metrics.py", line 30, in interpD2
    if (VolValues[loop] > 2) and (VolValues[loop+1] < 2):

  File "/anaconda3/lib/python3.6/site-packages/pandas/core/series.py", line 766, in __getitem__
    result = self.index.get_value(self, key)

  File "/anaconda3/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 3103, in get_value
    tz=getattr(series.dtype, 'tz', None))

  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.get_value

  File "pandas/_libs/index.pyx", line 114, in pandas._libs.index.IndexEngine.get_value

  File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc

  File "pandas/_libs/hashtable_class_helper.pxi", line 958, in pandas._libs.hashtable.Int64HashTable.get_item

  File "pandas/_libs/hashtable_class_helper.pxi", line 964, in pandas._libs.hashtable.Int64HashTable.get_item

KeyError: 0

Code:

# Import Rectum DVH
import pandas
import numpy
df = pandas.read_csv('/Users/daniel/Documents/data/DVH/RectumData.csv',
                 delimiter=',',header=0)
# Calculate D_2%, defined by ICRU 78 as "the greatest dose which all but
# 2 percent of a [volume of interest] receives." aka D_{near-max}

def interpD2(disttype,df):
# Loop through all patients' plans.
    Dose2Results = numpy.zeros(40)
    for num in range(0,40): 
# We know a priori that there is no DVH data with Volume = 2. Hence we look for
# the two columns less than and greater than Volume = 2.
        if disttype == 'planned':
            DoseValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == 'planned')].Dose
            VolValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == 'planned')].Volume
        else:
            DoseValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == 'blurred')].Dose
            VolValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == 'blurred')].Volume
        for loop in range(0,len(VolValues)):
            if (VolValues[loop] > 2) and (VolValues[loop+1] < 2):
                LowerVolumeIndex,UpperVolumeIndex = loop,loop+1
                x0,x1,x2 = 2,VolValues[LowerVolumeIndex],VolValues[UpperVolumeIndex]
                y1,y2 = DoseValues[LowerVolumeIndex],DoseValues[UpperVolumeIndex]
                Dose2Results[num] = y1 - ((x1-x0)/(x2 - x1))*(y2 - y1)
    return Dose2Results
D2Planned,D2Blurred = numpy.zeros(40),numpy.zeros(40)
D2Planned = interpD2('planned',df)
D2Blurred = interpD2('blurred',df)

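An excerpt of the imported CSV file is at the end of this post.

Attempts to fix it:

  1. Removing df from the function's arguments gives the same error message. (I originally did this because I thought the function could simply access the "global" variable.)

  2. I tried "initializing" the variables with zeros.

  3. I used an if block to parse the string explicitly to try to get around the error message, but nothing changed.

  4. I checked the pandas page and found that an update was available. I installed it with conda install pandas, but the error message is still unchanged. (Details of this update are below.)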

    The following packages will be downloaded:
    
    package                    |            build
    ---------------------------|-----------------
    certifi-2018.4.16          |           py36_0         142 KB
    conda-4.5.8                |           py36_0         1.0 MB
    ------------------------------------------------------------
                                           Total:         1.2 MB
    
    The following packages will be UPDATED:
    
    certifi: 2018.4.16-py36_0 conda-forge --> 2018.4.16-py36_0
    conda:   4.5.6-py36_0     conda-forge --> 4.5.8-py36_0
    

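Thank you for your help.

CSV data excerpt, including the header row. I have skipped rows to stay within the body character limit, but note that the rows are ordered first by DistributionType and then by StudyID. The StudyID values therefore run 1, 10, 11, ..., 19, 2, 20, 21, ..., and the "blurred" data come before the "planned" data.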

StudyID,DistributionType,Organ,Dose,Volume,DoseUnit,VolumeUnit
1,blurred,Rectum,0,100,Gy(RBE),%
1,blurred,Rectum,0.1,78.13818,Gy(RBE),%
1,blurred,Rectum,0.2,75.901,Gy(RBE),%
1,blurred,Rectum,0.3,75.01312,Gy(RBE),%
1,blurred,Rectum,0.4,73.38642,Gy(RBE),%
1,blurred,Rectum,0.5,72.36015,Gy(RBE),%
1,blurred,Rectum,0.6,70.81651,Gy(RBE),%
1,blurred,Rectum,7.3,22.60766,Gy(RBE),%
1,blurred,Rectum,7.4,22.4557,Gy(RBE),%
1,blurred,Rectum,7.5,22.31794,Gy(RBE),%
1,blurred,Rectum,7.6,22.19247,Gy(RBE),%
1,blurred,Rectum,7.7,22.09406,Gy(RBE),%
1,blurred,Rectum,32.2,6.99686,Gy(RBE),%
1,blurred,Rectum,32.3,6.96634,Gy(RBE),%
1,blurred,Rectum,32.4,6.94046,Gy(RBE),%
1,blurred,Rectum,32.5,6.89926,Gy(RBE),%
1,blurred,Rectum,32.6,6.85925,Gy(RBE),%
1,blurred,Rectum,32.7,6.83843,Gy(RBE),%
1,blurred,Rectum,32.8,6.8082,Gy(RBE),%
1,blurred,Rectum,32.9,6.76663,Gy(RBE),%
1,blurred,Rectum,33,6.72788,Gy(RBE),%
1,blurred,Rectum,33.1,6.6771,Gy(RBE),%
1,blurred,Rectum,33.2,6.62313,Gy(RBE),%
1,blurred,Rectum,33.3,6.57601,Gy(RBE),%
1,blurred,Rectum,42.5,2.96622,Gy(RBE),%
1,blurred,Rectum,42.6,2.9242,Gy(RBE),%
1,blurred,Rectum,42.7,2.87604,Gy(RBE),%
1,blurred,Rectum,42.8,2.83046,Gy(RBE),%
1,blurred,Rectum,42.9,2.78527,Gy(RBE),%
1,blurred,Rectum,43,2.73564,Gy(RBE),%
1,blurred,Rectum,43.1,2.7077,Gy(RBE),%
1,blurred,Rectum,43.2,2.69686,Gy(RBE),%
1,blurred,Rectum,43.3,2.6505,Gy(RBE),%
1,blurred,Rectum,43.4,2.62119,Gy(RBE),%
1,blurred,Rectum,43.5,2.59528,Gy(RBE),%
1,blurred,Rectum,43.6,2.55359,Gy(RBE),%
1,blurred,Rectum,43.7,2.50786,Gy(RBE),%
1,blurred,Rectum,43.8,2.46692,Gy(RBE),%
1,blurred,Rectum,43.9,2.40788,Gy(RBE),%
1,blurred,Rectum,44,2.37622,Gy(RBE),%
1,blurred,Rectum,44.1,2.34098,Gy(RBE),%
1,blurred,Rectum,44.2,2.30527,Gy(RBE),%
1,blurred,Rectum,44.3,2.26972,Gy(RBE),%
1,blurred,Rectum,44.4,2.2384,Gy(RBE),%
1,blurred,Rectum,44.5,2.20512,Gy(RBE),%
1,blurred,Rectum,44.6,2.14891,Gy(RBE),%
1,blurred,Rectum,44.7,2.12178,Gy(RBE),%
1,blurred,Rectum,44.8,2.06922,Gy(RBE),%
1,blurred,Rectum,44.9,2.02836,Gy(RBE),%
1,blurred,Rectum,45,1.99259,Gy(RBE),%
1,blurred,Rectum,45.1,1.98118,Gy(RBE),%
1,blurred,Rectum,45.2,1.92938,Gy(RBE),%
1,blurred,Rectum,45.3,1.88315,Gy(RBE),%
1,blurred,Rectum,45.4,1.85419,Gy(RBE),%
1,blurred,Rectum,45.5,1.81149,Gy(RBE),%
1,blurred,Rectum,45.6,1.77154,Gy(RBE),%
1,blurred,Rectum,45.7,1.73287,Gy(RBE),%
1,blurred,Rectum,45.8,1.68749,Gy(RBE),%
1,blurred,Rectum,45.9,1.65961,Gy(RBE),%
1,blurred,Rectum,46,1.62265,Gy(RBE),%
1,blurred,Rectum,46.1,1.61065,Gy(RBE),%
1,blurred,Rectum,46.2,1.56712,Gy(RBE),%
1,blurred,Rectum,46.3,1.50282,Gy(RBE),%
1,blurred,Rectum,46.4,1.45122,Gy(RBE),%
1,blurred,Rectum,46.5,1.42696,Gy(RBE),%
1,blurred,Rectum,46.6,1.38877,Gy(RBE),%
1,blurred,Rectum,46.7,1.35886,Gy(RBE),%
1,blurred,Rectum,46.8,1.34022,Gy(RBE),%
1,blurred,Rectum,46.9,1.29308,Gy(RBE),%
1,blurred,Rectum,56.5,NaN,Gy(RBE),%
1,blurred,Rectum,56.6,NaN,Gy(RBE),%
1,blurred,Rectum,56.7,NaN,Gy(RBE),%
1,blurred,Rectum,56.8,NaN,Gy(RBE),%
1,blurred,Rectum,56.9,NaN,Gy(RBE),%
1,blurred,Rectum,57,NaN,Gy(RBE),%
1,blurred,Rectum,57.1,NaN,Gy(RBE),%
1,blurred,Rectum,57.2,NaN,Gy(RBE),%
1,blurred,Rectum,57.3,NaN,Gy(RBE),%
1,blurred,Rectum,57.4,NaN,Gy(RBE),%
1,blurred,Rectum,57.5,NaN,Gy(RBE),%
1,blurred,Rectum,57.6,NaN,Gy(RBE),%
1,blurred,Rectum,57.7,NaN,Gy(RBE),%
1,blurred,Rectum,57.8,NaN,Gy(RBE),%
1,blurred,Rectum,57.9,NaN,Gy(RBE),%
1,blurred,Rectum,58,NaN,Gy(RBE),%
1,blurred,Rectum,58.1,NaN,Gy(RBE),%
1,blurred,Rectum,58.2,NaN,Gy(RBE),%
9,blurred,Rectum,58.2,NaN,Gy(RBE),%
1,planned,Rectum,0,100,Gy(RBE),%
1,planned,Rectum,0.1,78.01999,Gy(RBE),%
1,planned,Rectum,0.2,76.2245,Gy(RBE),%
1,planned,Rectum,14,19.50103,Gy(RBE),%
1,planned,Rectum,14.1,19.4464,Gy(RBE),%
1,planned,Rectum,14.2,19.39261,Gy(RBE),%
1,planned,Rectum,14.3,19.32695,Gy(RBE),%
1,planned,Rectum,14.4,19.25388,Gy(RBE),%
1,planned,Rectum,14.5,19.17049,Gy(RBE),%
1,planned,Rectum,14.6,19.09786,Gy(RBE),%
1,planned,Rectum,14.7,19.04909,Gy(RBE),%
1,planned,Rectum,14.8,18.98888,Gy(RBE),%
1,planned,Rectum,34,9.50553,Gy(RBE),%
1,planned,Rectum,34.1,9.45993,Gy(RBE),%
1,planned,Rectum,34.2,9.42654,Gy(RBE),%
1,planned,Rectum,34.3,9.39345,Gy(RBE),%
1,planned,Rectum,34.4,9.35196,Gy(RBE),%
1,planned,Rectum,34.5,9.30604,Gy(RBE),%
1,planned,Rectum,34.6,9.27235,Gy(RBE),%
1,planned,Rectum,34.7,9.22334,Gy(RBE),%
1,planned,Rectum,34.8,9.18734,Gy(RBE),%
1,planned,Rectum,34.9,9.14867,Gy(RBE),%
1,planned,Rectum,35,9.11402,Gy(RBE),%
1,planned,Rectum,35.1,9.07618,Gy(RBE),%
1,planned,Rectum,35.2,9.04251,Gy(RBE),%
1,planned,Rectum,35.3,9.00141,Gy(RBE),%
1,planned,Rectum,35.4,8.96289,Gy(RBE),%
1,planned,Rectum,35.5,8.92638,Gy(RBE),%
1,planned,Rectum,35.6,8.89506,Gy(RBE),%
1,planned,Rectum,35.7,8.85644,Gy(RBE),%
1,planned,Rectum,35.8,8.81237,Gy(RBE),%
1,planned,Rectum,35.9,8.76545,Gy(RBE),%
1,planned,Rectum,36,8.73692,Gy(RBE),%
1,planned,Rectum,36.1,8.70149,Gy(RBE),%
1,planned,Rectum,36.2,8.66073,Gy(RBE),%
1,planned,Rectum,36.3,8.61303,Gy(RBE),%
1,planned,Rectum,36.4,8.56549,Gy(RBE),%
1,planned,Rectum,36.5,8.51527,Gy(RBE),%
1,planned,Rectum,36.6,8.47214,Gy(RBE),%
1,planned,Rectum,36.7,8.41663,Gy(RBE),%
1,planned,Rectum,36.8,8.37863,Gy(RBE),%
1,planned,Rectum,36.9,8.35041,Gy(RBE),%
1,planned,Rectum,37,8.31595,Gy(RBE),%
1,planned,Rectum,37.1,8.288,Gy(RBE),%
1,planned,Rectum,37.2,8.26272,Gy(RBE),%
1,planned,Rectum,37.3,8.23171,Gy(RBE),%
1,planned,Rectum,37.4,8.19804,Gy(RBE),%
1,planned,Rectum,37.5,8.1594,Gy(RBE),%
1,planned,Rectum,37.6,8.11729,Gy(RBE),%
1,planned,Rectum,37.7,8.06844,Gy(RBE),%
1,planned,Rectum,37.8,8.02818,Gy(RBE),%
1,planned,Rectum,37.9,7.96257,Gy(RBE),%
1,planned,Rectum,38,7.90243,Gy(RBE),%
1,planned,Rectum,38.1,7.84717,Gy(RBE),%
1,planned,Rectum,38.2,7.80889,Gy(RBE),%
1,planned,Rectum,38.3,7.77623,Gy(RBE),%
1,planned,Rectum,38.4,7.74385,Gy(RBE),%
1,planned,Rectum,38.5,7.71867,Gy(RBE),%
1,planned,Rectum,38.6,7.70076,Gy(RBE),%
1,planned,Rectum,38.7,7.6754,Gy(RBE),%
1,planned,Rectum,38.8,7.64753,Gy(RBE),%
1,planned,Rectum,38.9,7.59392,Gy(RBE),%
1,planned,Rectum,39,7.53856,Gy(RBE),%
1,planned,Rectum,39.1,7.4879,Gy(RBE),%
1,planned,Rectum,39.2,7.4423,Gy(RBE),%
1,planned,Rectum,39.3,7.40429,Gy(RBE),%
1,planned,Rectum,39.4,7.35858,Gy(RBE),%
1,planned,Rectum,39.5,7.30843,Gy(RBE),%
1,planned,Rectum,39.6,7.25325,Gy(RBE),%
1,planned,Rectum,39.7,7.22353,Gy(RBE),%
1,planned,Rectum,39.8,7.19164,Gy(RBE),%
1,planned,Rectum,39.9,7.16789,Gy(RBE),%
1,planned,Rectum,40,7.13184,Gy(RBE),%
1,planned,Rectum,40.1,7.09953,Gy(RBE),%
1,planned,Rectum,40.2,7.04322,Gy(RBE),%
1,planned,Rectum,40.3,6.98051,Gy(RBE),%
1,planned,Rectum,40.4,6.93635,Gy(RBE),%
1,planned,Rectum,40.5,6.90025,Gy(RBE),%
1,planned,Rectum,40.6,6.87001,Gy(RBE),%
1,planned,Rectum,40.7,6.83943,Gy(RBE),%
1,planned,Rectum,40.8,6.81393,Gy(RBE),%
1,planned,Rectum,40.9,6.7731,Gy(RBE),%
1,planned,Rectum,41,6.74696,Gy(RBE),%
1,planned,Rectum,41.1,6.71209,Gy(RBE),%
1,planned,Rectum,41.2,6.64682,Gy(RBE),%
1,planned,Rectum,41.3,6.5857,Gy(RBE),%
1,planned,Rectum,41.4,6.53214,Gy(RBE),%
1,planned,Rectum,41.5,6.48609,Gy(RBE),%
1,planned,Rectum,41.6,6.44336,Gy(RBE),%
1,planned,Rectum,41.7,6.3864,Gy(RBE),%
1,planned,Rectum,41.8,6.33488,Gy(RBE),%
1,planned,Rectum,41.9,6.30537,Gy(RBE),%
1,planned,Rectum,42,6.28613,Gy(RBE),%
1,planned,Rectum,42.1,6.27749,Gy(RBE),%
1,planned,Rectum,42.2,6.26234,Gy(RBE),%
1,planned,Rectum,42.3,6.23083,Gy(RBE),%
1,planned,Rectum,42.4,6.18859,Gy(RBE),%
1,planned,Rectum,42.5,6.12637,Gy(RBE),%
1,planned,Rectum,50,2.63461,Gy(RBE),%
1,planned,Rectum,50.1,2.61684,Gy(RBE),%
1,planned,Rectum,50.2,2.55227,Gy(RBE),%
1,planned,Rectum,50.3,2.48541,Gy(RBE),%
1,planned,Rectum,50.4,2.46586,Gy(RBE),%
1,planned,Rectum,50.5,2.39354,Gy(RBE),%
1,planned,Rectum,50.6,2.33448,Gy(RBE),%
1,planned,Rectum,50.7,2.28168,Gy(RBE),%
1,planned,Rectum,50.8,2.25787,Gy(RBE),%
1,planned,Rectum,50.9,2.19108,Gy(RBE),%
1,planned,Rectum,51,2.12473,Gy(RBE),%
1,planned,Rectum,51.1,2.11024,Gy(RBE),%
1,planned,Rectum,51.2,2.03551,Gy(RBE),%
1,planned,Rectum,51.3,1.98004,Gy(RBE),%
1,planned,Rectum,51.4,1.92951,Gy(RBE),%
1,planned,Rectum,51.5,1.89144,Gy(RBE),%
1,planned,Rectum,51.6,1.82465,Gy(RBE),%
1,planned,Rectum,51.7,1.77709,Gy(RBE),%
1,planned,Rectum,51.8,1.71624,Gy(RBE),%
1,planned,Rectum,51.9,1.65075,Gy(RBE),%
1,planned,Rectum,52,1.61509,Gy(RBE),%
1,planned,Rectum,52.1,1.58169,Gy(RBE),%
1,planned,Rectum,52.2,1.52462,Gy(RBE),%
1,planned,Rectum,52.3,1.44352,Gy(RBE),%
1,planned,Rectum,52.4,1.39243,Gy(RBE),%
1,planned,Rectum,52.5,1.34659,Gy(RBE),%
1,planned,Rectum,52.6,1.33099,Gy(RBE),%
1,planned,Rectum,52.7,1.27496,Gy(RBE),%
1,planned,Rectum,52.8,1.23031,Gy(RBE),%
1,planned,Rectum,52.9,1.15298,Gy(RBE),%
1,planned,Rectum,53,1.0894,Gy(RBE),%
1,planned,Rectum,53.1,1.05667,Gy(RBE),%
1,planned,Rectum,53.2,1.03679,Gy(RBE),%
1,planned,Rectum,53.3,1.00334,Gy(RBE),%
1,planned,Rectum,53.4,0.92593,Gy(RBE),%
1,planned,Rectum,53.5,0.85545,Gy(RBE),%
1,planned,Rectum,53.6,0.81901,Gy(RBE),%
1,planned,Rectum,53.7,0.77809,Gy(RBE),%
1,planned,Rectum,53.8,0.75188,Gy(RBE),%
1,planned,Rectum,57.6,NaN,Gy(RBE),%
1,planned,Rectum,57.7,NaN,Gy(RBE),%
1,planned,Rectum,57.8,NaN,Gy(RBE),%
1,planned,Rectum,57.9,NaN,Gy(RBE),%
1,planned,Rectum,58,NaN,Gy(RBE),%
1,planned,Rectum,58.1,NaN,Gy(RBE),%
1,planned,Rectum,58.2,NaN,Gy(RBE),%
1,planned,Rectum,58.3,NaN,Gy(RBE),%

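3 answers:

Answer 0: (score: 1)

pandas raises KeyError: 0 when a Series has no row with the index label 0, or when a DataFrame has no column named 0.

In your case it is a Series. Consider this example: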

import pandas as pd

df = pd.DataFrame({'vl':[1,2,3,4],'bh':[5,6,4,7]},index=[10,11,12,13])

df['bh'][0]  # <-- raises KeyError: 0 because the index does not contain the label 0

So you can change it to df['bh'].iloc[0], which returns 5, or to df['bh'].values[0], which returns the same result.

In your case it should be VolValues.iloc[loop] or VolValues.values[loop].
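To connect this to VolValues, here is a minimal sketch on made-up data. The column names follow the question, but the values and the index starting at 23000 are assumptions for illustration only. It suggests that a boolean-filtered subset keeps its original row labels, so positional access has to go through .iloc, .values, or a reset index.

```python
import pandas as pd

# Toy stand-in for the question's data (hypothetical values);
# the index deliberately starts at 23000 instead of 0.
df = pd.DataFrame(
    {"StudyID": [1, 1, 1, 2],
     "Volume": [2.5, 2.1, 1.9, 3.0]},
    index=[23000, 23001, 23002, 23003],
)

# Filtering keeps the original labels 23000, 23001, 23002.
vol = df.loc[df["StudyID"] == 1, "Volume"]
print(list(vol.index))

first_by_position = vol.iloc[0]                     # positional access
first_from_array = vol.values[0]                    # plain NumPy array, 0-based
first_after_reset = vol.reset_index(drop=True)[0]   # fresh labels 0..n-1

try:
    vol[0]            # label lookup: there is no label 0
    raised = False
except KeyError:
    raised = True
print(raised)
```

Of the three options, .values drops the index entirely, which is convenient when only the numbers matter, while .iloc keeps the Series and its labels available for later steps.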

Answer 1: (score: 0)

VolValues[loop] may not accept zero as an index; it most likely needs to start from 1 (one). Something like:

for loop in range(1,len(VolValues)):
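Whether this helps depends on the labels the Series actually carries. A quick check on made-up data (the index starting at 23000 is an assumption taken from the question's description, and the values are hypothetical) suggests that starting the loop at 1 would still look up labels that are not present:

```python
import pandas as pd

# Hypothetical Series whose labels start at 23000, as described in the question.
vol = pd.Series([2.5, 2.1, 1.9], index=[23000, 23001, 23002])

# Collect every loop value that is not an existing index label.
missing = [loop for loop in range(1, len(vol)) if loop not in vol.index]
print(missing)
```

If the labels did start at 1, the shifted range would find them, so inspecting VolValues.index first is a cheap way to tell which case applies.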

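Answer 2: (score: 0)

Reddit user Nikota commented that if, as I was trying to do, you only want to extract the values without keeping the index information, you just need to append .values. That seems to have solved my problem.

So the following code seems to work: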

# Import Rectum DVH
import pandas
import numpy
df = pandas.read_csv('/Users/daniel/Documents/data/DVH/RectumData.csv',
                 delimiter=',',header=0)
# Calculate D_2%, defined by ICRU 78 as "the greatest dose which all but
# 2 percent of a [volume of interest] receives." aka D_{near-max}    
def interpD2(disttype):
# Loop through all patients' plans.
    Dose2Results = numpy.zeros(40)
    for num in range(0,40): 
# We know a priori that there is no DVH data with Volume = 2. Hence we look for
# the two columns less than and greater than Volume = 2.
        DoseValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == disttype)].Dose.values
        VolValues = df.loc[(df['StudyID'] == num+1) & (df['DistributionType'] == disttype)].Volume.values
        for loop in range(0,len(VolValues)):
            if (VolValues[loop] > 2) and (VolValues[loop+1] < 2):
                LowerVolumeIndex,UpperVolumeIndex = loop,loop+1
                x0,x1,x2 = 2,VolValues[LowerVolumeIndex],VolValues[UpperVolumeIndex]
                y1,y2 = DoseValues[LowerVolumeIndex],DoseValues[UpperVolumeIndex]
                Dose2Results[num] = y1 - ((x1-x0)/(x2 - x1))*(y2 - y1)
    return Dose2Results
D2Planned = interpD2('planned')
D2Blurred = interpD2('blurred')
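One caveat about the loop: it reads VolValues[loop+1] on the last iteration whenever the final volume is above 2, which would run past the end of the array. As an alternative sketch (not the method used above), numpy.interp can do the same linear interpolation without an explicit loop. The helper name interp_d2 is hypothetical, and the sketch assumes the volume column decreases as dose increases and that NaN rows can simply be dropped. The sample rows are taken from the planned data for StudyID 1 in the question's excerpt.

```python
import numpy as np

def interp_d2(dose, volume, target=2.0):
    """Linearly interpolate the dose at which the volume equals `target`.

    Assumes `volume` is decreasing in `dose` (a cumulative DVH).
    """
    dose = np.asarray(dose, dtype=float)
    volume = np.asarray(volume, dtype=float)
    keep = ~np.isnan(volume)              # drop rows with missing volume
    dose, volume = dose[keep], volume[keep]
    # np.interp expects increasing x values, so reverse the decreasing curve.
    return float(np.interp(target, volume[::-1], dose[::-1]))

# A few planned rows for StudyID 1 from the excerpt, plus one NaN row.
dose = [51.1, 51.2, 51.3, 51.4, 58.3]
volume = [2.11024, 2.03551, 1.98004, 1.92951, np.nan]

d2 = interp_d2(dose, volume)
print(d2)
```

For real data, the two arrays would come from the same filtered .Dose.values and .Volume.values used in interpD2.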