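WindowsError: [Error 3] The system cannot find the path specified: I am trying to assign the dataset directory to basepath so that the rest of the code can do its job.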
Also, let me know whether you think there should be a : at the end of this line of code:
for file in sorted(os.listdir(path))
I think it should be:
for file in sorted(os.listdir(path)):
The book doesn't have the : at the end.
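The colon question can be checked without running the whole script. A minimal sketch, assuming only the standard library: compile() parses source text without executing it and raises SyntaxError when the for header has no colon. The helper name is_valid_python and the variable names are made up for this example.

```python
def is_valid_python(source):
    """Return True if `source` parses as Python, False on a SyntaxError."""
    try:
        # compile() only parses; the undefined name `names` is never evaluated.
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False

without_colon = "for file in sorted(names)\n    pass\n"
with_colon = "for file in sorted(names):\n    pass\n"

print(is_valid_python(without_colon))
print(is_valid_python(with_colon))
```

A compound statement header such as for, if, or with always ends in a colon, so the version without it is most likely a typesetting slip in the book.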
import pyprind #INSTALLED IN ANACONDA TERMINAL
import pandas as pd
import os
# change the 'basepath' to the directory of unzipped movie dataset
#tried:
#basepath = 'C:\\Users\\zacka\\Downloads\\aclImdb_v1.tar.gz'
#basepath = 'C://Users//zacka//Downloads//aclImdb_v1.tar.gz'
#basepath = 'C:/Users/zacka/Downloads/aclImdb_v1.tar.gz'
#basepath = 'C:\Users\zacka\Downloads\aclImdb_v1.tar.gz'
# Not sure whether I'm using backslashes or forward slashes incorrectly, or whether I need to double them up.
labels = {'pos': 1, 'neg': 0}
pbar = pyprind.ProgBar(50000)
df = pd.DataFrame()
for s in ('test', 'train'):
    for l in ('pos', 'neg'):
        path = os.path.join(basepath, s, l)
        for file in sorted(os.listdir(path)):
            with open(os.path.join(path, file),
                      'r', encoding='utf-8') as infile:
                txt = infile.read()
            # labels[l] (the letter l), not labels[1]: the keys are 'pos' and 'neg'
            df = df.append([[txt, labels[l]]],
                           ignore_index=True)
            pbar.update()
df.columns = ['review', 'sentiment']
Answer 0 (score: 0)
basepath = 'C:\\Users\\zacka\\Downloads\\aclImdb'
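The double backslash is needed in particular before aclImdb ('...\\aclImdb'). When I tried print(basepath) without the doubled backslashes, it produced 0x7 in place of the a in aclImdb, because \a in a normal string literal is the escape sequence for the bell character.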
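The effect is easy to reproduce on a shortened path. This is only a sketch: the 'Downloads\aclImdb' fragment is cut down from the real path so that the example runs on any Python version, and the variable names are made up here.

```python
# In a normal string literal, '\a' is the BEL control character (0x07),
# so the backslash and the 'a' collapse into a single character.
broken = 'Downloads\aclImdb'

# Two ways to keep a literal backslash followed by 'a':
escaped = 'Downloads\\aclImdb'   # doubled backslash
raw = r'Downloads\aclImdb'       # raw string, escapes are not processed

print(repr(broken))               # shows '\x07' where the backslash and 'a' were
print(escaped == raw)             # both spellings give the same string
print(len(broken), len(escaped))  # the broken string is one character shorter
```

A raw string is usually the least error-prone choice for Windows paths, since every backslash is taken literally.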
I had also set basepath to the compressed archive rather than to the extracted folder.
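If the archive has not been unpacked yet, the standard tarfile module can do it before basepath is set. A minimal sketch: extract_archive is a hypothetical helper, and the paths in the commented usage are placeholders based on the ones in the question.

```python
import os
import tarfile

def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return dest_dir."""
    with tarfile.open(archive_path, 'r:gz') as archive:
        archive.extractall(dest_dir)
    return dest_dir

# Hypothetical usage; adjust both paths to your own machine:
# extract_archive(r'C:\Users\zacka\Downloads\aclImdb_v1.tar.gz',
#                 r'C:\Users\zacka\Downloads')
# basepath = r'C:\Users\zacka\Downloads\aclImdb'
# print(os.path.isdir(basepath))  # should be True before the loop runs
```

Checking os.path.isdir(basepath) up front gives a clearer failure than waiting for os.listdir to raise.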
Now I need to figure out: TypeError: 'encoding' is an invalid keyword argument for this function, raised for encoding='utf-8'.
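That message is what the built-in open() raises under Python 2, where it has no encoding parameter. Assuming that is the situation here, one possible workaround is io.open, which accepts encoding on both Python 2 and Python 3. The helper name read_text and the temporary sample file are made up for this sketch.

```python
import io
import os
import tempfile

def read_text(path):
    """Read a UTF-8 text file; io.open takes `encoding` on Python 2 and 3."""
    with io.open(path, 'r', encoding='utf-8') as infile:
        return infile.read()

# Round-trip a small temporary file to show the call works.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, 'sample.txt')
with io.open(sample_path, 'w', encoding='utf-8') as outfile:
    outfile.write(u'caf\u00e9')

print(read_text(sample_path) == u'caf\u00e9')
```

In the loading loop, that would mean adding import io and swapping open(...) for io.open(...) with the same arguments. Running the script under Python 3 should also avoid the error, since open() there accepts encoding directly.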