Question

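I'm following the examples in Wes McKinney's "Python for Data Analysis".

In Chapter 2 we are asked to count how many times each time zone appears in the 'tz' field; some entries have no 'tz'.

McKinney's count for "America/New_York" is 1251 (2 of them are in the first 10 of the 3440 lines, as you can see below), whereas mine comes out as 1. I'm trying to figure out why it shows '1'.

I'm using Python 2.7, installed from Enthought (epd-7.3-1-win-x86_64.msi) per McKinney's instructions in the text. The data comes from https://github.com/Canuckish/pydata-book/tree/master/ch02. In case you can't tell from the book's title, I'm new to Python, so please include instructions on how to get any information I haven't provided.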

import json

path = 'usagov_bitly_data2012-03-16-1331923249.txt'

open(path).readline()

records = [json.loads(line) for line in open(path)]
records[0]
records[1]
print records[0]['tz']

The last line here displays 'America/New_York'; the same call for records[1] displays 'America/Denver'.

#count unique time zones rating movies
#NOTE: NOT every JSON entry has a tz, so first line won't work
time_zones = [rec['tz'] for rec in records]

time_zones = [rec['tz'] for rec in records if 'tz' in rec]
time_zones[:10]

This displays the first ten time zone entries, of which entries 8 through 10 are blank...

#counting using a dict to store counts
def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
        return counts

counts = get_counts(time_zones)
counts['America/New_York']

This gives 1, but it should be 1251.

len(time_zones)

This gives 3440, as it should.

Answer 1

The 'America/New_York' time zone appears 1251 times in the input:

import json
from collections import Counter

with open(path) as file:
    c = Counter(json.loads(line).get('tz') for line in file)
print(c['America/New_York']) # -> 1251
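To make it easier to see what that one-liner tallies, here is a minimal sketch on a handful of made-up JSON lines (the sample records are hypothetical and are not taken from the data file). Because `.get('tz')` returns `None` when a record has no 'tz' key, such records are counted under `None` instead of raising `KeyError`, and blank time zones are counted under the empty string.

```python
import json
from collections import Counter

# Hypothetical sample lines standing in for the real data file.
sample_lines = [
    '{"tz": "America/New_York"}',
    '{"tz": "America/Denver"}',
    '{"tz": "America/New_York"}',
    '{"a": "record without a tz key"}',
    '{"tz": ""}',
]

# Same pattern as above: one Counter fed by a generator expression.
c = Counter(json.loads(line).get('tz') for line in sample_lines)

print(c['America/New_York'])  # occurrences of this time zone
print(c[None])                # records that have no 'tz' key
print(c[''])                  # records whose 'tz' is blank
```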

It is unclear why your code gives a count of 1. Perhaps the code is indented incorrectly:

def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
    else: #XXX wrong indentation
        counts[x] = 1 # it is run after the loop if there is no `break` 
    return counts

See "Why does python use 'else' after for and while loops?"

The correct indentation should be:
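A minimal sketch of what the mis-indented version does on a small made-up list (the name `get_counts_misindented` and the sample data are illustrative only). Nothing is ever added to `counts` inside the loop, so the `if` branch never fires; the `else` clause then runs once after the loop ends and records only the last item.

```python
def get_counts_misindented(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
    else:  # attached to the `for`, not to the `if`
        counts[x] = 1  # runs once, after the loop, with the last x
    return counts

# Hypothetical sample; the last item happens to be 'America/New_York'.
sample = ['America/New_York', 'America/Denver', 'America/New_York']
result = get_counts_misindented(sample)
print(result)
```

With this sample the result holds a single key with a count of 1, regardless of how often that value occurred.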

def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else: 
            counts[x] = 1 # it is run every iteration if x not in counts
    return counts
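Another indentation slip worth ruling out, offered as an assumption rather than a diagnosis: a `return counts` indented one level too far sits inside the loop body and exits after the first item. That would yield exactly 1 for whichever time zone comes first, and the question reports that records[0] is 'America/New_York'. A sketch with made-up data and a hypothetical function name for the faulty variant:

```python
def get_counts_early_return(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
        return counts  # inside the loop: returns after the first item

def get_counts(sequence):
    counts = {}
    for x in sequence:
        if x in counts:
            counts[x] += 1
        else:
            counts[x] = 1
    return counts  # after the loop: returns the full tally

# Hypothetical sample, not the real data.
sample = ['America/New_York', 'America/Denver', 'America/New_York', '']
early = get_counts_early_return(sample)
full = get_counts(sample)
print(early['America/New_York'])
print(full['America/New_York'])
```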

Check whether you are mixing spaces and tabs for indentation; run your script with python -tt to find out.

The example is from "Python for Data Analysis", Chapter 2.
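As a complement to `python -tt`, a rough sketch of a manual check: it scans source text and reports the lines whose indentation contains a tab character. The helper name `find_tab_indented_lines` and the sample source are hypothetical, and the check assumes the file is meant to be indented with spaces only.

```python
import re

def find_tab_indented_lines(source_text):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        indent = re.match(r'[ \t]*', line).group(0)  # leading whitespace only
        if '\t' in indent:
            flagged.append(lineno)
    return flagged

# Hypothetical source in which line 4 starts with a tab followed by spaces.
sample_source = (
    "def f(seq):\n"
    "    counts = {}\n"
    "    for x in seq:\n"
    "\t    counts[x] = 1\n"
    "    return counts\n"
)
flagged_lines = find_tab_indented_lines(sample_source)
print(flagged_lines)
```

To check a real script, read its contents with `open(...).read()` and pass the text to the function.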
