I keep getting an indentation error when I try to nest this for loop, and I don't understand why. It runs when I un-nest everything after the "for l in searchlines[i-3:i+3]:" line. I'm just learning, so I realize it's probably not the most concise code. Thanks for your time.
ne = pd.DataFrame()
for word in keywordList:
    for i, line in enumerate(searchlines):
        if word in line:
            for l in searchlines[i-3:i+3]: oFile.write(fileName + delim + word + delim + str(i) + delim + str(l))  ## prints Unused MidFile for Reference
            ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
            #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2] + searchlines[i+3]).replace('\n',' ')
            normalCaseLine = searchlines[i-3].rstrip('\n') + searchlines[i-2].rstrip('\n') + searchlines[i-1].rstrip('\n') + searchlines[i].rstrip('\n') + searchlines[i+1].rstrip('\n') + searchlines[i+2].rstrip('\n') + searchlines[i+3].rstrip('\n')
            lowerCaseLine = normalCaseLine.lower()
            result = dict((key, len(list(group))) for key, group in groupby(sorted(lowerCaseLine.split())))
            ### Get Detail Keywords
            cleanResult = {word: result[word] for word in result if word in detailKeywordList}
            cleanNormLine = normalCaseLine.replace('\x92s',' ').replace('\x92s',' ').replace('\x96',' ').replace('\x95',' ')
            ### Enter here if we need to separate words ex. Tech Keywords
            ner_output = st.tag(str(cleanNormLine).split())
            chunked, pos = [], prev_tag=[], ""
            ERROR IS HERE!!
                    for i, word_pos in enumerate(ner_output):
                        word, pos = word_pos
                        if pos in ['PERSON', 'ORGANIZATION', 'LOCATION','DATE','MONEY'] and pos == prev_tag:
                            chunked[-1] += word_pos
                        else:
                            chunked.append(word_pos)
                        prev_tag = pos
            clean_chunkedd = [tuple([" ".join(wordpos[::2]), wordpos[-1]]) if len(wordpos)!=2 else wordpos for wordpos in chunked]
            ### Write Results to Combined file
            midFile.write(fileName+'-'+str(i)+delim+word+delim+str(cleanNormLine)+delim+str(str(cleanResult).split(','))+'\n')
            ### Create NER DataFramez
            nerTagDF = DataFrame(clean_chunkedd, columns = ['Word','Classification'])
            nerTagDF['File_Name'] = fileName
            nerTagDF['Line Key'] = str(i)
            ne = ne.append(nerTagDF)
oFile.close()
midFile.close()
Answer 0 (score: 1)
There is an extra bit of tab/space near the line the error message points at.

# -*- coding: utf-8 -*-
""" 创建于08年8月8日星期二18:48:40
@author:user """
ne = pd.DataFrame()
for word in keywordList:
    for i, line in enumerate(searchlines):
        if word in line:
            for l in searchlines[i-3:i+3]: oFile.write(fileName + delim + word + delim + str(i) + delim + str(l))  ## prints Unused MidFile for Reference
            ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
            #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2] + searchlines[i+3]).replace('\n',' ')
            normalCaseLine = searchlines[i-3].rstrip('\n') + searchlines[i-2].rstrip('\n') + searchlines[i-1].rstrip('\n') + searchlines[i].rstrip('\n') + searchlines[i+1].rstrip('\n') + searchlines[i+2].rstrip('\n') + searchlines[i+3].rstrip('\n')
            lowerCaseLine = normalCaseLine.lower()
            result = dict((key, len(list(group))) for key, group in groupby(sorted(lowerCaseLine.split())))
            ### Get Detail Keywords
            cleanResult = {word: result[word] for word in result if word in detailKeywordList}
            cleanNormLine = normalCaseLine.replace('\x92s',' ').replace('\x92s',' ').replace('\x96',' ').replace('\x95',' ')
            ### Enter here if we need to separate words ex. Tech Keywords
            ner_output = st.tag(str(cleanNormLine).split())
            chunked, pos = [], prev_tag=[], ""
            #ERROR IS HERE!!
            for i, word_pos in enumerate(ner_output):
                word, pos = word_pos
                if pos in ['PERSON', 'ORGANIZATION', 'LOCATION','DATE','MONEY'] and pos == prev_tag:
                    chunked[-1] += word_pos
                else:
                    chunked.append(word_pos)
                prev_tag = pos
            clean_chunkedd = [tuple([" ".join(wordpos[::2]), wordpos[-1]]) if len(wordpos)!=2 else wordpos for wordpos in chunked]
            ### Write Results to Combined file
            midFile.write(fileName+'-'+str(i)+delim+word+delim+str(cleanNormLine)+delim+str(str(cleanResult).split(','))+'\n')
            ### Create NER DataFramez
            nerTagDF = DataFrame(clean_chunkedd, columns = ['Word','Classification'])
            nerTagDF['File_Name'] = fileName
            nerTagDF['Line Key'] = str(i)
            ne = ne.append(nerTagDF)
oFile.close()
midFile.close()
Please check the code... hope it solves your problem.
Answer 1 (score: 0)
Your for loop isn't nested inside anything. In Python, one of the places you use indentation is to define a block of code within some scope, for example inside a while or for loop, within an if statement, or when defining a function.

Your for loop is indented as though it belonged under chunked, pos = [], prev_tag=[], "", but that is just a standalone statement, so there is nothing there to nest under. You need to un-indent the loop until it matches the indentation of the surrounding code. That is about the most general way I can put it.
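To illustrate the rule with a minimal sketch (not the asker's code):

items = ["alpha", "beta"]
if items:
    found = True            # indented once: inside the if block
    for item in items:      # same level as the line above: a sibling statement, also inside the if
        print(item)         # indented again: inside the for block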
Answer 2 (score: 0)
Please clean up your code. From the error message, it looks like the error is at this line of yours:
...
chunked, pos = [], prev_tag=[], ""
ERROR IS HERE!!
        for i, word_pos in enumerate(ner_output):
...
Why is that loop line so heavily over-indented? It should look like this:
chunked, pos = [], prev_tag=[], ""
# ERROR IS HERE!!
for i, word_pos in enumerate(ner_output):
    # and then pick it up at this indentation
But your indentation is inconsistent throughout. Look here, at the very beginning:
1    for word in keywordList:
2        for i, line in enumerate(searchlines):
3          if word in line:
4            for l in searchlines[i-3:i+3]: oFile.write(fileName + delim + word + delim +str(i) + delim +str(l)) ## prints Unused MidFile for Reference
5          ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
6            #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2] + searchlines[i+3]).replace('\n',' ')
7            normalCaseLine = searchlines[
Let's go through it line by line. Line 1 is indented four spaces, which means every indent from here on should be a multiple of four. Line 2: four more spaces, so far so good. Line 3: only two spaces deeper than line 2: bad. It should be four spaces deeper than line 2. Line 4: again, only two more. You should split that line into two, or even three lines, with the comment on its own line: for(...): on one line, then oFile.write(... on another line, indented four more spaces. Lines 6 and 7 are indented the way they are, it seems, because of the confusing line 4. Line 5, even though it is a comment, should be indented further to match the line that follows it, unless that line is a for(...): line, in which case it should be four spaces shallower.
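The split suggested above would look something like this (a sketch reusing the asker's names):

for l in searchlines[i-3:i+3]:
    ## prints Unused MidFile for Reference
    oFile.write(fileName + delim + word + delim + str(i) + delim + str(l))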
Answer 3 (score: 0)
Indentation errors are common when you are first using Python.

I suggest you indent your code using only tabs or only spaces, because the Python parser cannot reliably interpret indentation that mixes the two.
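If you want to find where the two got mixed, a short sketch like this one can flag the offending lines (a hypothetical helper, not part of the original answer; "script.py" stands in for the real file name):

# Report any line whose leading whitespace mixes tabs and spaces.
with open("script.py") as f:
    for lineno, text in enumerate(f, start=1):
        indent = text[:len(text) - len(text.lstrip())]
        if "\t" in indent and " " in indent:
            print("line %d mixes tabs and spaces" % lineno)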
I have not checked your code in detail, but as posted it is unreadable. You may want to use an editor with PEP8 auto-format rules, or reformat the code yourself.

Here is a version of your code that compiles:
ne = pd.DataFrame()
for word in keywordList:
    for i, line in enumerate(searchlines):
        if word in line:
            for l in searchlines[i-3:i+3]:
                oFile.write(fileName + delim + word + delim + str(i) + delim + str(l))
            ## prints Unused MidFile for Reference
            ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
            #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2] + searchlines[i+3]).replace('\n',' ')
            normalCaseLine = searchlines[i-3].rstrip('\n') + searchlines[i-2].rstrip('\n') + searchlines[i-1].rstrip('\n') + \
                searchlines[i].rstrip('\n') + searchlines[i+1].rstrip('\n') + searchlines[i+2].rstrip('\n') + \
                searchlines[i+3].rstrip('\n')
            lowerCaseLine = normalCaseLine.lower()
            result = dict((key, len(list(group))) for key, group in groupby(sorted(lowerCaseLine.split())))
            ### Get Detail Keywords
            cleanResult = {word: result[word] for word in result if word in detailKeywordList}
            cleanNormLine = normalCaseLine.replace('\x92s',' ').replace('\x92s',' ').replace('\x96',' ').replace('\x95',' ')
            ### Enter here if we need to separate words ex. Tech Keywords
            ner_output = st.tag(str(cleanNormLine).split())
            # Confusing but valid chained assignment: ends up with chunked=[], pos='', prev_tag=''
            chunked, pos = [], prev_tag=[], ""
            for i, word_pos in enumerate(ner_output):
                word, pos = word_pos
                if pos in ['PERSON', 'ORGANIZATION', 'LOCATION','DATE','MONEY'] and pos == prev_tag:
                    chunked[-1] += word_pos
                else:
                    chunked.append(word_pos)
                prev_tag = pos
            clean_chunkedd = [tuple([" ".join(wordpos[::2]), wordpos[-1]]) if len(wordpos)!=2 else wordpos for wordpos in chunked]
            ### Write Results to Combined file
            midFile.write(fileName+'-'+str(i)+delim+word+delim+str(cleanNormLine)+delim+str(str(cleanResult).split(','))+'\n')
            ### Create NER DataFramez
            nerTagDF = DataFrame(clean_chunkedd, columns = ['Word','Classification'])
            nerTagDF['File_Name'] = fileName
            nerTagDF['Line Key'] = str(i)
            ne = ne.append(nerTagDF)
oFile.close()
midFile.close()
Answer 4 (score: 0)
Here is the code with the indentation fixed. You should make sure the indentation in your code is always consistent, not a mix of spaces and tabs.
ne = pd.DataFrame()
for word in keywordList:
    for i, line in enumerate(searchlines):
        if word in line:
            for l in searchlines[i-3:i+3]: oFile.write(fileName + delim + word + delim + str(i) + delim + str(l))  ## prints Unused MidFile for Reference
            ### Creates normal case line for Named Entity recognition & all Lower Case line for flagging keywords
            #normalCaseLine = str(searchlines[i-3] + searchlines[i-2] + searchlines[i-1] + searchlines[i] + searchlines[i+1] + searchlines[i+2] + searchlines[i+3]).replace('\n',' ')
            normalCaseLine = searchlines[i-3].rstrip('\n') + searchlines[i-2].rstrip('\n') + searchlines[i-1].rstrip('\n') + searchlines[i].rstrip('\n') + searchlines[i+1].rstrip('\n') + searchlines[i+2].rstrip('\n') + searchlines[i+3].rstrip('\n')
            lowerCaseLine = normalCaseLine.lower()
            result = dict((key, len(list(group))) for key, group in groupby(sorted(lowerCaseLine.split())))
            ### Get Detail Keywords
            cleanResult = {word: result[word] for word in result if word in detailKeywordList}
            cleanNormLine = normalCaseLine.replace('\x92s',' ').replace('\x92s',' ').replace('\x96',' ').replace('\x95',' ')
            ### Enter here if we need to separate words ex. Tech Keywords
            ner_output = st.tag(str(cleanNormLine).split())
            chunked, pos = [], prev_tag=[], ""
            for i, word_pos in enumerate(ner_output):
                word, pos = word_pos
                if pos in ['PERSON', 'ORGANIZATION', 'LOCATION','DATE','MONEY'] and pos == prev_tag:
                    chunked[-1] += word_pos
                else:
                    chunked.append(word_pos)
                prev_tag = pos
            clean_chunkedd = [tuple([" ".join(wordpos[::2]), wordpos[-1]]) if len(wordpos)!=2 else wordpos for wordpos in chunked]
            ### Write Results to Combined file
            midFile.write(fileName+'-'+str(i)+delim+word+delim+str(cleanNormLine)+delim+str(str(cleanResult).split(','))+'\n')
            ### Create NER DataFramez
            nerTagDF = DataFrame(clean_chunkedd, columns = ['Word','Classification'])
            nerTagDF['File_Name'] = fileName
            nerTagDF['Line Key'] = str(i)
            ne = ne.append(nerTagDF)
oFile.close()
midFile.close()