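I'm trying to read a docx file into Google Colab because my main computer, which has Anaconda installed, is out for maintenance. I'm trying to use the python-docx module, but as far as I can tell I can't pip install python-docx in Google Colab.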
'''
import docx

def getText(filename):
    # Open the .docx file and join the text of every paragraph with newlines
    doc = docx.Document(filename)
    fullText = []
    for para in doc.paragraphs:
        fullText.append(para.text)
    return '\n'.join(fullText)

docxString = getText("week_8_document1.docx")
'''
Any ideas?
Answer 0 (score: 0)
Try the following; hopefully it works:
#Install python-docx
!pip install python-docx #<-- Yes you can directly install in Colab
#Import the tools
import docx
from google.colab import files
uploaded = files.upload() #<-- Select the file you want to upload
file_name = '[whatever your file is called here].docx' #<-- Change filename to your file
doc = docx.Document(file_name)
Once the document is loaded, you can access its text by paragraph, by table, and so on. Good luck!
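As a rough sketch of what that access might look like, the snippet below collects paragraph text and then table cell text from a loaded document. The helper names `document_text` and `read_docx` are made up for illustration and are not part of python-docx; the sketch relies only on the `paragraphs`, `tables`, `rows`, `cells`, and `text` attributes of a python-docx document, and it assumes tab-separated cells are an acceptable way to flatten a table row.

```python
def document_text(doc):
    """Return paragraph text followed by table text, one line per item.

    `doc` is assumed to be a python-docx Document (anything exposing
    .paragraphs and .tables in the same shape works).
    """
    # One line per paragraph, in document order
    lines = [para.text for para in doc.paragraphs]
    # One line per table row, with cell texts separated by tabs
    for table in doc.tables:
        for row in table.rows:
            lines.append("\t".join(cell.text for cell in row.cells))
    return "\n".join(lines)


def read_docx(path_or_stream):
    """Hypothetical convenience wrapper: load a .docx and return its text."""
    import docx  # provided by the python-docx package

    # docx.Document accepts a file path or a file-like object
    return document_text(docx.Document(path_or_stream))
```

In the Colab cell above, this could be called as `text = read_docx(file_name)` after the upload finishes. Note that paragraphs and tables are gathered separately here, so the output does not preserve their original interleaving in the document.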