IndentationError makes no sense - Python

Asked: 2017-08-08 13:12:31

Tags: python python-2.7 pandas dataframe nested-loops

I keep getting an IndentationError when I try to nest this for loop, and I don't understand why. The code runs when I un-nest everything below the line `for l in searchlines[i-3:i+3]:`. I'm just learning, so I realize this may not be the most concise code. Thanks for your time.

    ne=pd.DataFrame()

    for word in keywordList:    
        for i, line in enumerate(searchlines):
          if word in line:
            for l in searchlines[i-3:i+3]: oFile.write(fileName + delim + word + delim +str(i) + delim +str(l)) ## prints Unused MidFile for Reference
    ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
            #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2]  + searchlines[i+3]).replace('\n','  ')
            normalCaseLine = searchlines[i-3].rstrip('\n') + searchlines[i-2].rstrip('\n') + searchlines[i-1].rstrip('\n') + searchlines[i].rstrip('\n')  + searchlines[i+1].rstrip('\n') + searchlines[i+2].rstrip('\n') + searchlines[i+3].rstrip('\n')
            lowerCaseLine = normalCaseLine.lower()
            result = dict((key, len(list(group))) for key, group in groupby(sorted(lowerCaseLine.split())))
    ### Get Detail Keywords
            cleanResult = {word: result[word] for word in result if word in detailKeywordList}
            cleanNormLine = normalCaseLine.replace('\x92s',' ').replace('\x92s',' ').replace('\x96',' ').replace('\x95',' ')
    ### Enter here if we need to seperate words ex. Tech Keywords

            ner_output = st.tag(str(cleanNormLine).split())
            chunked, pos = [], prev_tag=[], ""
    ERROR IS HERE!!
                    for i, word_pos in enumerate(ner_output):
                               word, pos = word_pos
                               if pos in ['PERSON', 'ORGANIZATION', 'LOCATION','DATE','MONEY'] and pos == prev_tag:
                                   chunked[-1]+=word_pos
                               else:
                                              chunked.append(word_pos)
                                              prev_tag = pos

            clean_chunkedd = [tuple([" ".join(wordpos[::2]), wordpos[-1]]) if len(wordpos)!=2 else wordpos for wordpos in chunked]

    ### Write Results to Combined file
            midFile.write(fileName+'-'+str(i)+delim+word+delim+str(cleanNormLine)+delim+str(str(cleanResult).split(','))+'\n')
    ### Create NER DataFramez

            nerTagDF =  DataFrame(clean_chunkedd, columns = ['Word','Classification'])
            nerTagDF['File_Name'] = fileName
            nerTagDF['Line Key'] = str(i)
            ne = ne.append(nerTagDF) 

    oFile.close()
    midFile.close()

5 Answers:

Answer 0 (score: 1)

There is an extra tab of indentation near the point the error message flags.
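One quick way to spot the stray tab is to print each source line's raw representation, where a tab shows up literally as \t (a sketch; script.py is a placeholder for your file's name):

# Print each line of the source file with its line number; tabs appear as \t
with open("script.py") as f:        # hypothetical file name
    for n, line in enumerate(f, 1):
        print("%d %r" % (n, line))

With that in mind, here is the corrected code: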
# -*- coding: utf-8 -*-

""" 创建于08年8月8日星期二18:48:40

@author:user """

ne = pd.DataFrame()

for word in keywordList:    
    for i, line in enumerate(searchlines):
      if word in line:
        for l in searchlines[i-3:i+3]: oFile.write(fileName + delim + word + delim +str(i) + delim +str(l)) ## prints Unused MidFile for Reference
### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
        #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2]  + searchlines[i+3]).replace('\n','  ')
        normalCaseLine = searchlines[i-3].rstrip('\n') + searchlines[i-2].rstrip('\n') + searchlines[i-1].rstrip('\n') + searchlines[i].rstrip('\n')  + searchlines[i+1].rstrip('\n') + searchlines[i+2].rstrip('\n') + searchlines[i+3].rstrip('\n')
        lowerCaseLine = normalCaseLine.lower()
        result = dict((key, len(list(group))) for key, group in groupby(sorted(lowerCaseLine.split())))
### Get Detail Keywords
        cleanResult = {word: result[word] for word in result if word in detailKeywordList}
        cleanNormLine = normalCaseLine.replace('\x92s',' ').replace('\x92s',' ').replace('\x96',' ').replace('\x95',' ')
### Enter here if we need to seperate words ex. Tech Keywords

        ner_output = st.tag(str(cleanNormLine).split())
        chunked, pos = [], prev_tag=[], ""
        # ERROR WAS HERE - the loop must sit at the same level as the lines above
        for i, word_pos in enumerate(ner_output):
            word, pos = word_pos
            if pos in ['PERSON', 'ORGANIZATION', 'LOCATION','DATE','MONEY'] and pos == prev_tag:
                chunked[-1]+=word_pos
            else:
                chunked.append(word_pos)
                prev_tag = pos

        clean_chunkedd = [tuple([" ".join(wordpos[::2]), wordpos[-1]]) if len(wordpos)!=2 else wordpos for wordpos in chunked]

### Write Results to Combined file
        midFile.write(fileName+'-'+str(i)+delim+word+delim+str(cleanNormLine)+delim+str(str(cleanResult).split(','))+'\n')
### Create NER DataFramez

        nerTagDF =  DataFrame(clean_chunkedd, columns = ['Word','Classification'])
        nerTagDF['File_Name'] = fileName
        nerTagDF['Line Key'] = str(i)
        ne = ne.append(nerTagDF) 

oFile.close()
midFile.close()

Please check the code... hope it solves your problem...

Answer 1 (score: 0)

Your for loop isn't nested inside anything. In Python, one place you use indentation is to define a block of code inside some scope, for example inside a `while` or `for` loop, inside an `if` statement, or when defining a function.

Your for loop is indented following `chunked, pos = [], prev_tag=[], ""`, which is just a standalone statement. So you need to un-indent the loop until it matches the indentation of the surrounding code. That is about the most general way I can put it.
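A minimal sketch of the same shape of mistake, with hypothetical stand-in data instead of your NER output:

ner_output = [('John', 'PERSON'), ('Smith', 'PERSON')]  # hypothetical stand-in data
chunked, prev_tag = [], ""  # a plain statement; it does not open a new block

# Indented deeper than the statement above, the loop raises
# "IndentationError: unexpected indent":
#         for i, word_pos in enumerate(ner_output):
#             chunked.append(word_pos)

# Written at the same level as the statements above it, it runs fine:
for i, word_pos in enumerate(ner_output):
    chunked.append(word_pos)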

Answer 2 (score: 0)

Please clean up your code. Going by the error message, the error looks like it is on this line of yours:

...
chunked, pos = [], prev_tag=[], ""
ERROR IS HERE!!
                for i, word_pos in enumerate(ner_output):
...

Why is the loop line so heavily over-indented? It should look like this:

chunked, pos = [], prev_tag=[], ""
# ERROR IS HERE!!
for i, word_pos in enumerate(ner_output):
    # and then pick it up at this indentation

But your indentation is inconsistent. Look here, at the very beginning:

1    for word in keywordList:    
2        for i, line in enumerate(searchlines):
3          if word in line:
4            for l in searchlines[i-3:i+3]: oFile.write(fileName + delim + word + delim +str(i) + delim +str(l)) ## prints Unused MidFile for Reference
5    ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
6            #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2]  + searchlines[i+3]).replace('\n','  ')
7            normalCaseLine = searchlines[

Let's go through it line by line. Line 1 is indented four spaces, which means every indent from here on should be a multiple of four. Line 2: four more spaces, so far so good. Line 3: only two spaces deeper than line 2, which is bad; it should be four spaces deeper than line 2. Line 4: again, only two spaces. You should also split this into two or even three lines, with the comment on its own line: the `for ...:` header first, then `oFile.write(...` on the next line, indented four more spaces. Lines 6 and 7 only look like they should be indented that way because line 4 is confusing. And line 5, even though it is a comment, should be indented further to match the line before it, unless the preceding line is a `for ...:` line, in which case it should be indented four spaces more.
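For example, that line-4 one-liner could be split like this (a sketch with hypothetical stand-in values so it runs on its own):

searchlines = ["a\n", "b\n", "c\n", "d\n", "e\n", "f\n", "g\n"]  # hypothetical data
fileName, delim, word, i = "sample.txt", "|", "d", 3             # hypothetical values

oFile = open("out.txt", "w")
for l in searchlines[i-3:i+3]:
    # the comment sits on its own line; the body is indented four spaces
    oFile.write(fileName + delim + word + delim + str(i) + delim + str(l))
oFile.close()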

Answer 3 (score: 0)

Indentation errors are common when you are new to Python.

I suggest you indent your code using only tabs or only spaces, because the Python interpreter cannot handle a mix of the two.

I haven't checked your code in depth, but as posted it is unreadable. You may want to use an editor with PEP8 auto-formatting, or reformat the code yourself.
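As a quick way to locate mixed tabs and spaces (the question is tagged python-2.7, whose interpreter accepts a -tt switch that turns inconsistent tab/space indentation into an error), you could run the script like this; yourscript.py is a placeholder name:

python -tt yourscript.py   # -tt: Python 2 rejects inconsistent tab/space indentation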

Here is a compilable version of your code:

ne=pd.DataFrame()

for word in keywordList:    
    for i, line in enumerate(searchlines):
        if word in line:
            for l in searchlines[i-3:i+3]: 
                oFile.write(fileName + delim + word + delim +str(i) + delim +str(l)) 
                ## prints Unused MidFile for Reference
                ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
                #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2]  + searchlines[i+3]).replace('\n','  ')
                normalCaseLine = searchlines[i-3].rstrip('\n') + searchlines[i-2].rstrip('\n') + searchlines[i-1].rstrip('\n') + \
                    searchlines[i].rstrip('\n')  + searchlines[i+1].rstrip('\n') + searchlines[i+2].rstrip('\n') + \
                    searchlines[i+3].rstrip('\n')
                lowerCaseLine = normalCaseLine.lower()
                result = dict((key, len(list(group))) for key, group in groupby(sorted(lowerCaseLine.split())))
                ### Get Detail Keywords
                cleanResult = {word: result[word] for word in result if word in detailKeywordList}
                cleanNormLine = normalCaseLine.replace('\x92s',' ').replace('\x92s',' ').replace('\x96',' ').replace('\x95',' ')
                ### Enter here if we need to seperate words ex. Tech Keywords

                ner_output = st.tag(str(cleanNormLine).split())
                chunked, pos = [], prev_tag=[], ""
                for i, word_pos in enumerate(ner_output):
                    word, pos = word_pos
                    if pos in ['PERSON', 'ORGANIZATION', 'LOCATION','DATE','MONEY'] and pos == prev_tag:
                        chunked[-1]+=word_pos
                    else:
                        chunked.append(word_pos)
                        prev_tag = pos

                clean_chunkedd = [tuple([" ".join(wordpos[::2]), wordpos[-1]]) if len(wordpos)!=2 else wordpos for wordpos in chunked]

        ### Write Results to Combined file
        midFile.write(fileName+'-'+str(i)+delim+word+delim+str(cleanNormLine)+delim+str(str(cleanResult).split(','))+'\n')
        ### Create NER DataFramez

        nerTagDF =  DataFrame(clean_chunkedd, columns = ['Word','Classification'])
        nerTagDF['File_Name'] = fileName
        nerTagDF['Line Key'] = str(i)
        ne = ne.append(nerTagDF) 

oFile.close()
midFile.close()

Answer 4 (score: 0)

Here is the code with the indentation fixed. Make sure the indentation in your code is always consistent, never a mix of spaces and tabs.

ne=pd.DataFrame()

for word in keywordList:    
    for i, line in enumerate(searchlines):
        if word in line:
            for l in searchlines[i-3:i+3]: oFile.write(fileName + delim + word + delim +str(i) + delim +str(l)) ## prints Unused MidFile for Reference
            ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
            #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2]  + searchlines[i+3]).replace('\n','  ')
            normalCaseLine = searchlines[i-3].rstrip('\n') + searchlines[i-2].rstrip('\n') + searchlines[i-1].rstrip('\n') + searchlines[i].rstrip('\n')  + searchlines[i+1].rstrip('\n') + searchlines[i+2].rstrip('\n') + searchlines[i+3].rstrip('\n')
            lowerCaseLine = normalCaseLine.lower()
            result = dict((key, len(list(group))) for key, group in groupby(sorted(lowerCaseLine.split())))
            ### Get Detail Keywords
            cleanResult = {word: result[word] for word in result if word in detailKeywordList}
            cleanNormLine = normalCaseLine.replace('\x92s',' ').replace('\x92s',' ').replace('\x96',' ').replace('\x95',' ')
            ### Enter here if we need to seperate words ex. Tech Keywords

            ner_output = st.tag(str(cleanNormLine).split())
            chunked, pos = [], prev_tag=[], ""

            for i, word_pos in enumerate(ner_output):
                word, pos = word_pos
                if pos in ['PERSON', 'ORGANIZATION', 'LOCATION','DATE','MONEY'] and pos == prev_tag:
                    chunked[-1]+=word_pos
                else:
                    chunked.append(word_pos)
                    prev_tag = pos

            clean_chunkedd = [tuple([" ".join(wordpos[::2]), wordpos[-1]]) if len(wordpos)!=2 else wordpos for wordpos in chunked]

            ### Write Results to Combined file
            midFile.write(fileName+'-'+str(i)+delim+word+delim+str(cleanNormLine)+delim+str(str(cleanResult).split(','))+'\n')
            ### Create NER DataFramez

            nerTagDF =  DataFrame(clean_chunkedd, columns = ['Word','Classification'])
            nerTagDF['File_Name'] = fileName
            nerTagDF['Line Key'] = str(i)
            ne = ne.append(nerTagDF) 

oFile.close()
midFile.close()