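Question:
I am following a tutorial and trying to run re.search on a CSV file of tweets (date, username, the tweet text, the tweet ID, and whether it is true or false).
Here is my original code: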
import pandas as pd
import re

filename = 'sample.csv'
data = pd.read_csv(filename, encoding='utf-8')
print(data.info())

def word_in_text(word,text):
    match = re.search(word,text)
    if match:
        return True
    return False

[kai, hatsu] = [0, 0]
for index, row in data.iterrows():
    kai += word_in_text('会', row['text'])
    hatsu += word_in_text('初', row['text'])
Here is the error it raises:
Traceback (most recent call last):
  File "C:\Python\enkousaiTF.py", line 28, in <module>
    kai += word_in_text('会', row['text'])
  File "C:\Python\enkousaiTF.py", line 19, in word_in_text
    match = re.search(word,text)
  File "C:\Python\Python36-32\lib\re.py", line 182, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object
What I have tried:
When I checked the column types of the DataFrame, I got:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1001 entries, 0 to 1000
Data columns (total 5 columns):
date        1000 non-null object
username    1000 non-null object
text        1000 non-null object
id          1000 non-null float64
enko        1000 non-null object
dtypes: float64(1), object(4)
memory usage: 23.5+ KB
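One detail in the output above may be relevant: the index has 1001 entries, but every column reports only 1000 non-null values, so at least one row appears to hold missing values. Below is a minimal sketch for locating such rows. The small DataFrame is made up for illustration and only stands in for sample.csv; the real file may look different.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sample.csv: the last row is empty,
# as a trailing blank record in the file would be.
data = pd.DataFrame({
    "text": ["会いたい", "初めまして", np.nan],
    "id": [1.0, 2.0, np.nan],
})

# Rows where 'text' is missing; pandas stores a missing cell as float NaN.
missing = data[data["text"].isna()]
print(missing.index.tolist())

# The Python types actually present in the column, with their counts.
types_seen = data["text"].map(type).value_counts()
print(types_seen)
```

If a float shows up among the types in the 'text' column, that value is what re.search would reject as "not a string or bytes-like object".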
So I thought the problem might be the float64 type, and I tried adding str here:
match = re.search(str(word,text))
But this only raises a different error:
TypeError: decoding str is not supported
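For context on this second error: str(word, text) is interpreted as str(object, encoding), and that form only decodes bytes-like objects, so the call fails before re.search receives anything. The sketch below contrasts it with converting only the text argument. The sample strings are assumptions for illustration, and the last lines show why the conversion hides a missing value rather than handling it.

```python
import re

# Hypothetical sample values, not taken from the real CSV.
word, text = "会", "会いたい"

# str(object, encoding) only accepts bytes-like objects as the first argument,
# so passing two str values raises a TypeError.
try:
    str(word, text)
except TypeError as exc:
    print(exc)

# Converting only the second argument keeps the two-argument re.search call intact.
match = re.search(word, str(text))
print(bool(match))

# Caveat: a float NaN converts to the literal string 'nan',
# so a missing value would be searched as ordinary text.
nan_as_text = str(float("nan"))
print(re.search("n", nan_as_text) is not None)
```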
Then I tried using
dtype_dic= {'date': str,
'username' : str,
'text': str,
'id': str,
'enko': str}
But even though I checked the data types, it still throws TypeError: expected string or bytes-like object
How can I fix this?
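The post does not show how dtype_dic was used; the sketch below assumes it was passed as read_csv(..., dtype=dtype_dic). Under that assumption, an empty cell is still read as a float NaN even when the column dtype is str, which would explain why the same TypeError persists. The in-memory CSV is made up for illustration. The vectorized count with Series.str.contains and na=False is one possible alternative to the loop, not necessarily what the tutorial intends.

```python
import io
import pandas as pd

# Hypothetical two-column CSV standing in for sample.csv; the last record has no text.
csv_text = "text,id\n会いたい,1\n初めまして,2\n,3\n"

dtype_dic = {"text": str, "id": str}
data = pd.read_csv(io.StringIO(csv_text), dtype=dtype_dic)

# The empty cell is still a missing value (float NaN), not a string.
last = data["text"].iloc[-1]
print(type(last), pd.isna(last))

# Series.str.contains with na=False treats missing cells as "no match",
# so no per-row type check is needed.
kai = int(data["text"].str.contains("会", na=False).sum())
hatsu = int(data["text"].str.contains("初", na=False).sum())
print(kai, hatsu)
```

Dropping the incomplete rows first with data.dropna(subset=['text']) would be another way to keep the original loop working.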