Question

Using Python 2.7.6 and pandas 0.17.0:

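I am making a time series plot of word count vs. time. The script reads Twitter data and counts how many times a word appears over time. Here is a sample pandas object, 'per_minute' (the first column is 'time', the second is 'word count'):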

 print type(per_minute)

 <class 'pandas.core.series.Series'>

 print per_minute

 2015-10-29 01:55:00    1
 2015-10-29 01:56:00    1
 2015-10-29 01:57:00    0
 Freq: T, dtype: float64

I am trying to select the time column (for example "2015-10-29 01:57:00", or "2015-10-29", or "01:57:00") so that I can dump that time information into the matplotlib.pyplot tick labels.

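The script will read hundreds of Twitter JSON files, so the first row will not be exactly "2015-10-29 01:55:00 1"; that is only an example. It will always follow the form "yyyy-mm-dd hh:mm:ss word_count", though.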

 plt.axes().set_xticklabels(dum_the_info_here,rotation='50', fontsize=8)

I have tried many approaches:

print per_minute.loc[idx[1:1]] #Print: Series([], dtype: float64)

print per_minute.ix[:,0]  #IndexingError: Too many indexers

print per_minute.loc[:,per_minute ] #IndexingError: Too many indexers

print per_minute.loc[1:1] # TypeError: cannot do slice indexing on <class 'pandas.tseries.index.DatetimeIndex'> with these indexers [1] of <type 'int'>

Can any guru enlighten me? Thanks!

Answer 1

By default, pandas uses the index of a Series as the x-axis, and .plot() returns a matplotlib.AxesSubplot, on which you can call .set_xticklabels, so you can do:

ax = df.plot()
ax.set_xticklabels(df.index, rotation=90)
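In other words, the "time column" in the question is the index of the Series rather than a second data column, which is likely why the two-axis indexers raise "Too many indexers". A minimal sketch of pulling the timestamps out as strings follows; the sample data is a hypothetical stand-in for per_minute, and the variable names are illustrative, not from the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the question's per_minute Series
idx = pd.date_range("2015-10-29 01:55:00", periods=3, freq="min")
per_minute = pd.Series([1.0, 1.0, 0.0], index=idx)

# The timestamps live in the index; format them as strings for tick labels
full_labels = per_minute.index.strftime("%Y-%m-%d %H:%M:%S").tolist()
date_labels = per_minute.index.strftime("%Y-%m-%d").tolist()
time_labels = per_minute.index.strftime("%H:%M:%S").tolist()

# Positional access: index[...] gives the timestamp, .iloc[...] gives the value
third_timestamp = per_minute.index[2]
third_count = per_minute.iloc[2]
```

Any of these lists could then be passed to set_xticklabels in place of df.index. Note that the number of labels has to match the tick positions matplotlib actually draws, so it may be necessary to set the ticks explicitly as well.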


Select date column by position: IndexingError: Too many indexers AND TypeError: cannot do slice indexing
