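I have a dataset in .CSV format. I read it with pandas' read_csv(), and then I had to drop the rows where a particular column is NaN. So far I have compared the length of the dataset variable before and after, and the rows do seem to be removed.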
import pandas as pd
dataset = pd.read_csv('/Users/ozercevikaslan/Desktop/MyMLWorkSpace/Womens Clothing E-Commerce Reviews.csv')
# Drop the columns that are not needed, in a single call
dataset.drop(['Unnamed: 0', 'Clothing ID', 'Department Name',
              'Class Name', 'Division Name', 'Title'],
             axis=1, inplace=True)
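# Drop rows whose 'Review Text' is NaN. With inplace=True, dropna returns None,
# so the result must not be assigned back to dataset.
dataset.dropna(subset=['Review Text'], inplace=True)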
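Later I had to use the dataset for processing, but when I try to access a row that was dropped earlier, it raises an error. That makes no sense to me: I deleted those rows, and the length of the dataset even changed. How is this possible?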
dataset['Review Text'][92]
Traceback (most recent call last):
File "/Users/ozercevikaslan/anaconda/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-326-b61838e48681>", line 1, in <module>
dataset['Review Text'][92]
File "/Users/ozercevikaslan/anaconda/lib/python3.5/site-packages/pandas/core/series.py", line 623, in __getitem__
result = self.index.get_value(self, key)
File "/Users/ozercevikaslan/anaconda/lib/python3.5/site-packages/pandas/core/indexes/base.py", line 2560, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas/_libs/index.pyx", line 83, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 91, in pandas._libs.index.IndexEngine.get_value
File "pandas/_libs/index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 811, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 817, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 92
Could you help me figure out how to delete these rows correctly?
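
For reference, here is a minimal sketch of what the traceback suggests is happening, using a small made-up frame (`df`, `cleaned`, and `renumbered` are hypothetical names, and the toy values stand in for the real CSV). It assumes the cause is that dropna removes rows but leaves the remaining index labels unchanged, so a dropped label such as 92 no longer exists even though the frame is shorter.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real CSV.
df = pd.DataFrame({"Review Text": ["good", np.nan, "bad", "ok"]})

# dropna returns a new frame; the surviving rows keep their original labels.
cleaned = df.dropna(subset=["Review Text"])
print(list(cleaned.index))  # [0, 2, 3]

# Label 1 was dropped, so cleaned["Review Text"][1] would raise KeyError.
label_missing = 1 not in cleaned.index

# Option 1: renumber the rows 0..n-1 after dropping.
renumbered = cleaned.reset_index(drop=True)
print(renumbered["Review Text"][1])  # bad

# Option 2: keep the labels and access by position instead.
print(cleaned["Review Text"].iloc[1])  # bad
```

If this assumption holds for the real data, either resetting the index after the drop or indexing with `.iloc` should avoid the KeyError.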