Finding words in an Excel sheet: 'str' does not support the buffer interface

Asked: 2016-04-05 09:10:19

Tags: python excel unicode

I want to select the sentences that contain certain conjunctions, but I get this error:

Traceback (most recent call last):
  File "positive_process3.py", line 14, in <module>
    if word in text:
TypeError: 'str' does not support the buffer interface.

My code is:

import xlrd
from xlrd import open_workbook
import xlwt
wb = open_workbook("C:/Users/SA769740/Desktop/result2/pos.xlsx")
book = xlwt.Workbook(encoding="utf-8")
sheet1 = book.add_sheet("Sheet 1")
wordSet = [' for ', ' so ',' since ', ' Since ', ' because ', ' as ', ' As ', ' due to ', ' Due to ']
count=1
for sheet in wb.sheets():
    for row in range(sheet.nrows):
        text = ((sheet.cell(row,2).value).encode("utf-8"))
        l = ""
        for word in wordSet:
            if word in text:
                l += (word+" ")
        sheet1.write(row,0,sheet.cell(row, 0).value)
        sheet1.write(row,3, l)
        sheet1.write(row,4,count)
        sheet1.write(row,5,value)

        count += 1

book.save('C:/Users/SA769740/Desktop/result2/pos_reviews_process3.xls')

I am using Python 3.4.3.

1 answer:

Answer 0: (score: 2)

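You are running Python 3 and testing a str object (word) against a bytes object (text, which is the result of calling .encode("utf-8")). Python 3 does not allow mixing the two types in a containment test.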

The solution is either to switch to Python 2 or, better, not to call str.encode() on the text value:

text = sheet.cell(row, 2).value
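
To make the mismatch concrete, here is a minimal sketch that does not need xlrd or a workbook. The sample sentence and the helper name find_conjunctions are assumptions made for illustration, not part of the original code; the word list is copied from the question.

```python
# Word list taken from the question.
word_set = [' for ', ' so ', ' since ', ' Since ', ' because ',
            ' as ', ' As ', ' due to ', ' Due to ']


def find_conjunctions(text, words=word_set):
    """Return the entries of `words` that occur in `text` (a str)."""
    return [word for word in words if word in text]


# Hypothetical stand-in for sheet.cell(row, 2).value, which is already a str.
cell_value = "I stayed home because it rained, so I read."

# Reproduce the problem: a str needle tested against a bytes haystack.
try:
    " so " in cell_value.encode("utf-8")
    raised = False
except TypeError as exc:
    raised = True
    print("mixing str and bytes fails:", exc)

# The fix: keep the cell value as str and compare str with str.
matches = find_conjunctions(cell_value)
print(matches)
```

The exact wording of the TypeError message depends on the Python 3 release, but the exception type is the same.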

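Even if you fix the Python version and run this on Python 2, you should use Unicode values everywhere rather than encoding the text to UTF-8. When you compare text against UTF-8 encoded data, you can end up with matches on partial byte sequences.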
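
One way such a false match can arise is sketched below. It assumes the needle reaches the comparison as a byte fragment in a different encoding (Latin-1 here) while the haystack is UTF-8; the sample strings are made up for illustration.

```python
# "café" written with an escape so the example does not depend on the
# source file encoding. U+00E9 encodes to the two bytes C3 A9 in UTF-8.
haystack_text = "caf\u00e9"
haystack_bytes = haystack_text.encode("utf-8")

# The copyright sign U+00A9 is the single byte A9 in Latin-1.
needle_text = "\u00a9"
needle_bytes = needle_text.encode("latin-1")

# Byte-level search: A9 is found inside the two-byte sequence for U+00E9.
byte_match = needle_bytes in haystack_bytes

# Text-level search: the copyright sign does not occur in the word.
text_match = needle_text in haystack_text

print(byte_match, text_match)
```

Comparing decoded text avoids this class of problem because each character is matched as a whole.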