How do I fix KeyError: 'text' in a pandas DataFrame?

Asked: 2018-09-30 17:53:41

Tags: python pandas keras lstm

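I have a large dataset of tweets that I have already preprocessed. The cleaned CSV file has two columns, index and text. I have been trying to train a model on this data, but every time I try to use it I get a KeyError.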

Traceback (most recent call last):
  File "sentiment_classifier.py", line 17, in <module>
    tweet_text = twitter_data['text']
  File "C:\Users\Aeryes\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "C:\Users\Aeryes\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\frame.py", line 2695, in _getitem_column
    return self._get_item_cache(key)
  File "C:\Users\Aeryes\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\generic.py", line 2489, in _get_item_cache
    values = self._data.get(item)
  File "C:\Users\Aeryes\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\internals.py", line 4115, in get
    loc = self.items.get_loc(item)
  File "C:\Users\Aeryes\AppData\Local\Programs\Python\Python36\lib\site-packages\pandas\core\indexes\base.py", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 164, in pandas._libs.index.IndexEngine.get_loc
KeyError: 'text'

I have tried the following to get this working as expected:

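  • I noticed that about 3,200 of the 1.6 million entries in the text column are blank. I tried removing these blank fields with twitter_data.dropna(inplace=True) before feeding the data into the LSTM.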

  • I also tried tweet_text = twitter_data['text'] = [tweet.get('text','') for tweet in twitter_data if tweet.isalpha()] to filter out all int values in the text field, after getting this error:

sys:1: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
Traceback (most recent call last):
  File "sentiment_classifier.py", line 17, in <module>
    tweet_text = twitter_data['text'] = [tweet.get('text','') for tweet in twitter_data]
  File "sentiment_classifier.py", line 17, in <listcomp>
    tweet_text = twitter_data['text'] = [tweet.get('text','') for tweet in twitter_data]
AttributeError: 'int' object has no attribute 'get'
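
The AttributeError here is probably a side effect of how iteration works: looping over a DataFrame yields its column labels, not its rows. The sketch below uses a small hypothetical frame (not the real tweet file) with integer column labels to show that the comprehension is calling .get on each label.

```python
import pandas as pd

# Hypothetical two-row stand-in for the real tweet data, with integer
# column labels (0 and 1) like those reported by twitter_data.keys().
twitter_data = pd.DataFrame({0: ["index", "0"], 1: ["text", "need hug"]})

# Iterating over a DataFrame yields its column labels, not its rows.
labels = [label for label in twitter_data]
print(labels)

# Each label is an integer, so calling .get on it raises AttributeError.
error_message = None
try:
    [tweet.get("text", "") for tweet in twitter_data]
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

So the comprehension never reaches the tweet strings; it fails on the first column label.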

I do not know how to move forward and solve this problem. Please help.

I have added the output, including the warning, that I get when I print twitter_data.keys():

sys:1: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
Int64Index([0, 1], dtype='int64')
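
The output Int64Index([0, 1]) means the column labels are the integers 0 and 1, so there is no column named 'text' to look up. The question does not show how the file was loaded, so the following is only a sketch under the assumption that it was read with something like header=None. The CSV content here is a hypothetical miniature, not the real file.

```python
import io

import pandas as pd

# Hypothetical miniature of the cleaned CSV: a header line, then index,text rows.
csv_text = "index,text\n0,need hug\n1,about to file taxes\n"

# With header=None the header line is treated as data,
# and the columns are labelled with the integers 0 and 1.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)
print(list(no_header.columns))

# With header=0 (the default) the first line supplies the column names.
with_header = pd.read_csv(io.StringIO(csv_text), header=0)
print(list(with_header.columns))

# Now the lookup from the traceback works.
tweet_text = with_header["text"]
```

If this is the cause, it would also account for the DtypeWarning, since column 0 would then hold the string "index" alongside the numeric values.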

When I print twitter_data I get the following output:

sys:1: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
               0                                                  1
0          index                                               text
1              0  awww that bummer you shoulda got david carr of...
2              1  is upset that he can not update his facebook b...
3              2  dived many times for the ball managed to save ...
4              3     my whole body feels itchy and like its on fire
5              4  no it not behaving at all mad why am here beca...
6              5                                 not the whole crew
7              6                                           need hug
8              7  hey long time no see yes rains bit only bit lo...
9              8                          nope they did not have it
10             9                                       que me muera
11            10              spring break in plain city it snowing
12            11                            just re pierced my ears
13            12  could not bear to watch it and thought the ua ...
14            13  it it counts idk why did either you never talk...
15            14  would ve been the first but did not have gun n...
16            15  wish got to watch it with you miss you and how...
17            16  hollis death scene will hurt me severely to wa...
18            17                                about to file taxes
19            18  ahh ive always wanted to see rent love the sou...
20            19  oh dear were you drinking out of the forgotten...
21            20   was out most of the day so did not get much done
22            21  one of my friend called me and asked to meet w...
23            22                         baked you cake but ated it
24            23                this week is not going as had hoped
25            24                            blagh class at tomorrow
26            25          hate when have to call and wake people up
27            26  just going to cry myself to sleep after watchi...
28            27                              im sad now miss lilly
29            28  ooooh lol that leslie and ok will not do it ag...
...          ...                                                ...
1596012  1599969  you re the undisputed authority on the topic g...
1596013  1599970   thanks thanks that was just what was looking for
1596014  1599971  thanks martin not the most imaginative interfa...
1596015  1599972                            congrats mike way to go
1596016  1599973                    omg office space wanna steal it
1596017  1599974  ahaha nooo you were just away from everyone el...
1596018  1599975  hey baack and thanks so much for all those kin...
1596019  1599976     yeah my conscience would be clear in that case
1596020  1599977               thats my girl dishing out the advice
1596021  1599978                                        second that
1596022  1599979                                      in the garden
1596023  1599980         jo jen by nemuselo zrovna holce ael co nic
1596024  1599981                     another commenting contest yay
1596025  1599982  figured out how to see my tweets and facebook ...
1596026  1599983  theri tomorrow drinking coffee talking about o...
1596027  1599984  you heard it here first we re having girl hope...
1596028  1599985  if ur the lead singer in band beware falling p...
1596029  1599986                            too much ads on my blog
1596030  1599987  neveer think that you both will get on well wi...
1596031  1599988  ha good job that right we gotta throw that big...
1596032  1599989                              im glad ur doing well
1596033  1599990                                wooooo xbox is back
1596034  1599991  mmmm that sounds absolutely perfect but my sch...
1596035  1599992                   recovering from the long weekend
1596036  1599994  yeah that does work better than just waiting f...
1596037  1599995  just woke up having no school is the best feel...
1596038  1599996   thewdb com very cool to hear old walt interviews
1596039  1599997  are you ready for your mojo makeover ask me fo...
1596040  1599998  happy th birthday to my boo of alll time tupac...
1596041  1599999                               happy charitytuesday

[1596042 rows x 2 columns]
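
In the printed frame, row 0 contains the values "index" and "text", which suggests the header line ended up as a data row. If reloading the file is not convenient, one possible repair on the frame that is already in memory is to promote that row to column names. This is a minimal sketch on a hypothetical four-row stand-in; the final dropna step mirrors the attempt described earlier in the question.

```python
import pandas as pd

# Hypothetical stand-in for the printed frame: labels 0 and 1, header in row 0.
twitter_data = pd.DataFrame(
    {0: ["index", "0", "1", "2"], 1: ["text", "need hug", None, "in the garden"]}
)

# Use the first row as column names, then drop that row.
twitter_data.columns = twitter_data.iloc[0].tolist()
twitter_data = twitter_data.iloc[1:].reset_index(drop=True)

# Column 0 mixed the string "index" with numbers; convert it back to integers.
twitter_data["index"] = pd.to_numeric(twitter_data["index"])

# Drop rows whose text is missing before passing the data on.
twitter_data = twitter_data.dropna(subset=["text"]).reset_index(drop=True)

tweet_text = twitter_data["text"]
print(tweet_text.tolist())
```

Note that dropna only removes missing values (NaN/None); entries that are empty or whitespace-only strings would need a separate filter.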

0 answers:

No answers yet