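I am new to coding and still learning. That said, I have been following a tutorial on analyzing data from the Twitter API: http://adilmoujahid.com/posts/2014/07/twitter-analytics/
I believe the author uses Python 2.7 while I use Python 3.6.1, so I converted the code to my Python version. It worked until I reached the top 5 countries chart. Specifically, the top 5 countries code ran successfully only once, two days ago; now when I run it I get only the TypeError shown in the traceback below.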
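Has anyone else run into this problem, and what is the best way to solve it? I cannot figure out how to fix it. Thanks!
The full traceback, followed by the entire code (so far):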
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-47-601663476327> in <module>()
      7 ax.set_ylabel('Number of tweets' , fontsize=15)
      8 ax.set_title('Top 5 countries', fontsize=15, fontweight='bold')
----> 9 tweets_by_country[:5].plot(ax=ax, kind='bar', color='blue')
     10 plt.show()

~/Environments/Environments/my_env/lib/python3.6/site-packages/pandas/plotting/_core.py in __call__(self, kind, ax, figsize, use_index, title, grid, legend, style, logx, logy, loglog, xticks, yticks, xlim, ylim, rot, fontsize, colormap, table, yerr, xerr, label, secondary_y, **kwds)
   2441                            colormap=colormap, table=table, yerr=yerr,
   2442                            xerr=xerr, label=label, secondary_y=secondary_y,
-> 2443                            **kwds)
   2444     __call__.__doc__ = plot_series.__doc__
   2445

~/Environments/Environments/my_env/lib/python3.6/site-packages/pandas/plotting/_core.py in plot_series(data, kind, ax, figsize, use_index, title, grid, legend, style, logx, logy, loglog, xticks, yticks, xlim, ylim, rot, fontsize, colormap, table, yerr, xerr, label, secondary_y, **kwds)
   1882                  yerr=yerr, xerr=xerr,
   1883                  label=label, secondary_y=secondary_y,
-> 1884                  **kwds)
   1885
   1886

~/Environments/Environments/my_env/lib/python3.6/site-packages/pandas/plotting/_core.py in _plot(data, x, y, subplots, ax, kind, **kwds)
   1682         plot_obj = klass(data, subplots=subplots, ax=ax, kind=kind, **kwds)
   1683
-> 1684     plot_obj.generate()
   1685     plot_obj.draw()
   1686     return plot_obj.result

~/Environments/Environments/my_env/lib/python3.6/site-packages/pandas/plotting/_core.py in generate(self)
    236     def generate(self):
    237         self._args_adjust()
--> 238         self._compute_plot_data()
    239         self._setup_subplots()
    240         self._make_plot()

~/Environments/Environments/my_env/lib/python3.6/site-packages/pandas/plotting/_core.py in _compute_plot_data(self)
    345         if is_empty:
    346             raise TypeError('Empty {0!r}: no numeric data to '
--> 347                             'plot'.format(numeric_data.__class__.__name__))
    348
    349         self.data = numeric_data

TypeError: Empty 'DataFrame': no numeric data to plot
import json

import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt

tweets_data_path = '...twitter_data.txt'

# Read one JSON document per line, skipping lines that are not valid JSON.
tweets_data = []
with open(tweets_data_path, "r") as tweets_file:
    for line in tweets_file:
        try:
            tweet = json.loads(line)
            tweets_data.append(tweet)
        except ValueError:
            continue

print(len(tweets_data))

# Collect the fields of interest into a DataFrame.
tweets = pd.DataFrame()
tweets['text'] = [tweet['text'] for tweet in tweets_data]
tweets['lang'] = [tweet['lang'] for tweet in tweets_data]
tweets['country'] = [tweet['place']['country'] if tweet['place'] is not None else None
                     for tweet in tweets_data]

# Count tweets per language and plot the five most common ones.
tweets_by_lang = tweets['lang'].value_counts()

fig, ax = plt.subplots()
ax.tick_params(axis='x', labelsize=15)
ax.tick_params(axis='y', labelsize=10)
ax.set_xlabel('Languages', fontsize=15)
ax.set_ylabel('Number of tweets', fontsize=15)
ax.set_title('Top 5 languages', fontsize=15, fontweight='bold')
tweets_by_lang[:5].plot(ax=ax, kind='bar', color='red')
plt.show()
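Answer 0 (score: 1)
Is your data actually numeric? You can check with, for example:
print(type(tweets['country'][0]))
Given that you are using json.loads (which deserializes from a string), it is most likely not numeric, which is what the error refers to. Try converting the data type to float (or another type):
tweets = tweets.astype('float')
and see whether that fixes the problem. If needed, you can also apply this conversion to specific columns only. Good luck!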
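To see what is actually being handed to the plot call, it may help to inspect the counts themselves rather than the raw column. The sketch below is only an illustration under assumptions: `top_countries` and the sample lists are hypothetical names and data, not part of the question's code. It relies on `value_counts()` returning integer counts and dropping missing values by default, so a column that contains only `None` produces an empty result, which is one situation in which there is nothing numeric to plot.

```python
import pandas as pd


def top_countries(countries, n=5):
    """Return counts of the n most frequent non-missing values."""
    # value_counts() ignores None/NaN by default and returns integer counts.
    counts = pd.Series(countries, dtype="object").value_counts()
    return counts.head(n)


# Hypothetical sample: some tweets carry a country, some do not.
sample = ["US", None, "US", "FR", None]
top = top_countries(sample)
print(top)
print("dtype:", top.dtype, "| empty:", top.empty)

# Hypothetical sample where no tweet carries place information.
empty = top_countries([None, None])
print("empty:", empty.empty)
```

If the equivalent of `empty.empty` is true for `tweets_by_country`, the data itself is worth checking before changing any types.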
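Answer 1 (score: 0)
I think your file does not exist or there is a problem with the path. The first two steps of http://adilmoujahid.com/posts/2014/07/twitter-analytics/ retrieve the data and save it to a local file. Does that file exist at the path you specified?
tweets_data_path = '...twitter_data.txt'
What does the following print?
print(len(tweets_data))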
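Both checks can be combined into one small diagnostic. This is a rough sketch using only the standard library; `load_tweets` and `describe_tweet_file` are hypothetical helper names, and it assumes the file holds one JSON object per line, as in the question's loading loop.

```python
import json
import os


def load_tweets(path):
    """Parse one JSON object per line; return (tweets, skipped_line_count)."""
    tweets, skipped = [], 0
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # blank lines are ignored, not counted as errors
            try:
                tweets.append(json.loads(line))
            except json.JSONDecodeError:
                skipped += 1
    return tweets, skipped


def describe_tweet_file(path):
    """Summarize whether the file exists and how much usable data it holds."""
    if not os.path.isfile(path):
        return "missing file: " + path
    tweets, skipped = load_tweets(path)
    # Only tweets with a non-empty 'place' can contribute a country.
    with_place = sum(1 for t in tweets if isinstance(t, dict) and t.get("place"))
    return "{} tweets parsed, {} lines skipped, {} with place data".format(
        len(tweets), skipped, with_place)
```

Calling `print(describe_tweet_file(tweets_data_path))` would then show at a glance whether the file is missing, empty, or simply contains no tweets with place data.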