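I've been trying to get this Python code to open a directory on my computer and read its contents so I can generate output for an assignment, but I keep getting "invalid \x escape". Either my syntax is wrong, or I've mixed up my forward slashes and backslashes.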
import sys,os,re
import time
tokens = 0
documents = 0
terms = 0
termindex = 0
docindex = 0
#
alltokens = []
alldocs = []
#
#
t2 = time.localtime()
#
dirname = "C:\Users\xhenr\Documents\cs3308\cacm"
#
all = [f for f in os.listdir(dirname)]
for f in all:
    documents+=1
    with open('C:\Users\xhenr\Documents\cs3308\cacm/f', 'r') as myfile:
        alldocs.append(f)
        data=myfile.read().replace('\n', '')
        for token in data.split():
            alltokens.append(token)
            tokens+=1
#
documentfile = open('C:/Users/xhenr/Documents/cs3308/cacm/documents.dat', 'w')
alldocs.sort()
for f in alldocs:
    docindex += 1
    documentfile.write(f+','+str(docindex)+os.linesep)
documentfile.close()
#
alltokens.sort()
#
g=[]
#
for i in alltokens:
    if i not in g:
        g.append(i)
        terms+=1
terms = len(g)
indexfile = open('C:/Users/xhenr/Documents/cs3308/cacm/index.dat', 'w')
for i in g:
    termindex += 1
    indexfile.write(i+','+str(termindex)+os.linesep)
indexfile.close()
#
print 'Processing Start Time: %.2d:%.2d' % (t2.tm_hour, t2.tm_min)
print "Documents %i" % documents
print "Tokens %i" % tokens
print "Terms %i" % terms
t2 = time.localtime()
print 'Processing End Time: %.2d:%.2d' % (t2.tm_hour, t2.tm_min)
Answer 0 (score: 0)
Here:
dirname = "C:\Users\xhenr\Documents\cs3308\cacm"
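Python treats a backslash inside an ordinary string literal as the start of an escape sequence, even though here it is just a path separator. In this path, \x begins a hexadecimal escape that must be followed by two hex digits, and \xhe is not one, hence the "invalid \x escape" error. You can fix this by escaping each backslash (writing \\), but there is an easier way: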
dirname = r"C:\Users\xhenr\Documents\cs3308\cacm"
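Putting r in front of the string tells Python to take it exactly as written, without processing any escape sequences (r stands for raw). That means this line has to change as well: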
with open('C:\Users\xhenr\Documents\cs3308\cacm/f', 'r') as myfile:
Change it to:
with open(r'C:\Users\xhenr\Documents\cs3308\cacm\f', 'r') as myfile:
(This also fixes the inconsistent mix of forward slashes and backslashes.)
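One more thing to watch for: an f written inside the quotes is just the letter f, not the loop variable, so that open() call looks for a file literally named "f" in the cacm folder. Below is a minimal sketch of one way to build each file's path from the variable instead, using os.path.join. The helper name read_tokens is hypothetical and not part of the original script, and skipping subdirectories is an assumption about what the assignment wants. It should run unchanged on Python 2 and Python 3.

```python
import os

# Two spellings of the same Windows path: a raw string and escaped backslashes.
raw_path = r"C:\Users\xhenr\Documents\cs3308\cacm"
escaped_path = "C:\\Users\\xhenr\\Documents\\cs3308\\cacm"
assert raw_path == escaped_path


def read_tokens(dirname):
    """Return (document names, tokens) for every regular file in dirname.

    Hypothetical helper for illustration; not part of the original script.
    """
    docs = []
    tokens = []
    for name in sorted(os.listdir(dirname)):
        # Join the directory with the *variable* holding the file name.
        # Writing the letter f inside the quoted path would instead look
        # for a file literally called "f".
        path = os.path.join(dirname, name)
        if not os.path.isfile(path):
            continue  # assumption: subdirectories are not documents
        with open(path, 'r') as myfile:
            data = myfile.read()
        docs.append(name)
        # split() already treats newlines as whitespace, so no replace() is
        # needed (removing '\n' first would glue words on adjacent lines).
        tokens.extend(data.split())
    return docs, tokens
```

In the original script this could be called as docs, tokens = read_tokens(dirname), with dirname defined as the raw string shown earlier.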