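Find the year with the most people alive in Python

Time: 2015-07-20 17:15:42

Tags: python algorithm

Given a list of people with their birth and end years (all between 1900 and 2000), find the year with the most people alive.

Here is my somewhat brute-force solution: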

def most_populated(population, single=True):
    years = dict()
    for person in population:
        for year in xrange(person[0], person[1]):
            if year in years:
                years[year] += 1
            else:
                years[year] = 0
    return max(years, key=years.get) if single else \
           [key for key, val in years.iteritems() if val == max(years.values())]

print most_populated([(1920, 1939), (1911, 1944),
                      (1920, 1955), (1938, 1939)])
print most_populated([(1920, 1939), (1911, 1944),
                      (1920, 1955), (1938, 1939), (1937, 1940)], False)

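I am trying to find a more efficient way to solve this problem in Python. Both readability and efficiency count. Also, for some reason my code does not print [1938, 1939], although it should.

Update

The input is a list of tuples, where the first element of each tuple is the year the person was born and the second element is the year of death.

Update 2

The end year (the second element of the tuple) also counts as a year in which the person was alive (so if the person died in Sept 1939 (we don't care about the month), they were actually alive in 1939, at least for part of it). That should fix the missing 1939 in the results.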
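
A minimal sketch of that fix, written for Python 3 (the question's code is Python 2). The only change to the counting loop is `death + 1` as the upper bound, so the end year is included. The function name `most_populated_inclusive` is made up for this illustration.

```python
def most_populated_inclusive(population, single=True):
    """Brute-force count that treats the end year as a year alive."""
    years = {}
    for birth, death in population:
        # range() excludes its upper bound, hence death + 1
        for year in range(birth, death + 1):
            years[year] = years.get(year, 0) + 1
    peak = max(years.values())
    if single:
        return max(years, key=years.get)
    return sorted(year for year, count in years.items() if count == peak)


people = [(1920, 1939), (1911, 1944), (1920, 1955), (1938, 1939), (1937, 1940)]
print(most_populated_inclusive(people, single=False))  # [1938, 1939]
```

With the original exclusive bound, nobody who dies in 1939 is counted for 1939, which is why that year dropped out of the result.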

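Best solution?

While readability counts in favor of @joran-beasley, for larger inputs the most efficient algorithm was provided by @njzk2. Thanks to @hannes-ovrén for providing the analysis in an IPython notebook on Gist.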

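9 answers:

Answer 0 (score: 5)

Another solution I just thought of:

  • Create 2 tables, birthdates and deathdates.
  • Accumulate the birth dates and death dates in those tables.
  • Walk through those tables to accumulate the number of people alive at each point in time.

The total complexity is O(n) in the number of people (plus the length of the year range that is scanned).

Implementation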

from collections import Counter

def most_populated(population, single=True):
    # Lists (not map objects), so min()/max() below also work on Python 3
    birth = [x[0] for x in population]
    death = [x[1] + 1 for x in population]  # first year the person is no longer alive
    b = Counter(birth)
    d = Counter(death)
    alive = 0
    years = {}
    for year in range(min(birth), max(death) + 1):
        alive = alive + b[year] - d[year]
        years[year] = alive
    return max(years, key=years.get) if single else \
           [key for key, val in years.items() if val == max(years.values())]

Better

from collections import Counter
from itertools import accumulate
import operator

def most_populated(population, single=True):
    delta = Counter(x[0] for x in population)
    delta.subtract(Counter(x[1]+1 for x in population))
    start, end = min(delta.keys()), max(delta.keys())
    years = list(accumulate(delta[year] for year in range(start, end)))
    return max(enumerate(years), key=operator.itemgetter(1))[0] + start if single else \
           [i + start for i, val in enumerate(years) if val == max(years)]
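
To see why the running sum works, here is a small worked sketch (Python 3) on the question's four-person input. It builds the same per-year deltas and accumulates them; the helper name `alive_per_year` is an assumption for illustration, not part of the answer.

```python
from collections import Counter
from itertools import accumulate


def alive_per_year(population):
    """Return (first_year, counts); counts[i] is the number alive in first_year + i."""
    delta = Counter(birth for birth, _ in population)
    # A death in year d removes the person from year d + 1 onwards
    delta.subtract(Counter(death + 1 for _, death in population))
    start, end = min(delta), max(delta)
    return start, list(accumulate(delta[year] for year in range(start, end)))


start, counts = alive_per_year([(1920, 1939), (1911, 1944), (1920, 1955), (1938, 1939)])
peak = max(counts)
print([start + i for i, c in enumerate(counts) if c == peak])  # [1938, 1939]
```

Stopping the range one short of `end` is safe because the last key only ever holds a decrement, so it cannot be the peak.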

Answer 1 (score: 3)

>>> from collections import Counter
>>> from itertools import chain
>>> def most_pop(pop):
...     pop_flat = chain.from_iterable(range(i,j+1) for i,j in pop)
...     return Counter(pop_flat).most_common()
...
>>> most_pop([(1920, 1939), (1911, 1944), (1920, 1955), (1938, 1939)])[0]

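Answer 2 (score: 3)

I would go about it like this:

  • Sort the people by birth year (the unborn list)
  • Starting from the first born
    • Put that person in the alive list
    • Use an insertion sort by date of death (the list stays sorted, so use a binary search)
    • Until you reach a person who was not born that year
  • Then, starting from the person in the alive list who dies first, remove the dead from the list.
  • Put the size of the alive list in a dict
  • Increment the year
  • Loop until the unborn and alive lists are empty

Complexity should be around O((m + n) * log(m)) (each year is considered only once and each person only twice, multiplied by the cost of insertion into the alive list)

Implementation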

from bisect import insort

def most_populated(population, single=True):
    years = dict()
    unborn = sorted(population, key=lambda x: -x[0])
    alive = []
    dead = []
    for year in range(unborn[-1][0], max(population, key=lambda x: x[1])[1] + 1):
        while unborn and unborn[-1][0] == year:
            insort(alive, -unborn.pop()[1])
        while alive and alive[-1] == -(year - 1):
            dead.append(-alive.pop())
        years[year] = len(alive)
    return max(years, key=years.get) if single else \
           [key for key, val in years.iteritems() if val == max(years.values())]
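
A variation on the same sweep, sketched here as an assumption rather than as this answer's method: keeping the death years of the living in a min-heap (`heapq`) instead of a sorted list avoids the element shifting that `insort` does on each insertion. The name `most_populated_heap` is hypothetical, and the code is Python 3.

```python
import heapq


def most_populated_heap(population):
    """Sweep the years once; return every year with the peak population."""
    unborn = sorted(population, reverse=True)  # earliest birth ends up last
    alive = []  # min-heap of death years of the people currently alive
    years = {}
    first = unborn[-1][0]
    last = max(death for _, death in population)
    for year in range(first, last + 1):
        # Everyone born this year becomes alive
        while unborn and unborn[-1][0] == year:
            heapq.heappush(alive, unborn.pop()[1])
        # Drop people who died before this year (the death year still counts)
        while alive and alive[0] < year:
            heapq.heappop(alive)
        years[year] = len(alive)
    peak = max(years.values())
    return [year for year in sorted(years) if years[year] == peak]


print(most_populated_heap([(1920, 1939), (1911, 1944), (1920, 1955), (1938, 1939)]))
```

Each person is pushed and popped at most once, so the heap work is O(n log n) on top of the pass over the years.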

Answer 3 (score: 2)

We can also use numpy slicing, which is quite neat and should also be quite efficient:

import numpy as np
from collections import namedtuple

Person = namedtuple('Person', ('birth', 'death'))
people = [Person(1900,2000), Person(1950,1960), Person(1955, 1959)]

START_YEAR = 1900
END_YEAR = 2000
people_alive = np.zeros(END_YEAR - START_YEAR + 1) # Alive each year

for p in people:
    a = p.birth - START_YEAR
    b = p.death - START_YEAR + 1 # include year of death
    people_alive[a:b] += 1

# Find indexes of maximum aliveness and convert to year
most_alive = np.flatnonzero(people_alive == people_alive.max()) + START_YEAR

Edit: It seems that namedtuple adds some overhead, so to speed things up a bit, remove the namedtuple and use for birth, death in people: instead.

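Answer 4 (score: 1)

  1. Just put the birth and death years into a dict. For a birth, increase the value for that year by 1; for a death, decrease it by 1.
  2. Sort the dictionary by key and iterate over it, keeping a running count of the people currently alive.
  3. Track "theYear" alongside "maxAlive" to get the first year with the highest count.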

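    years = {}

    for p in people:
        # +1 in the year of birth
        years[p.birth] = years.get(p.birth, 0) + 1
        # -1 in the year of death. Note: this stops counting the person in the
        # death year itself; use p.death + 1 as the key to count that year too.
        years[p.death] = years.get(p.death, 0) - 1

    alive = 0
    maxAlive = 0
    theYear = people[0].birth
    # Walk the years in order, keeping a running total of people alive
    for year in sorted(years):
        alive += years[year]
        if alive > maxAlive:
            maxAlive = alive
            theYear = year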

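Answer 5 (score: 1)

Here is my solution, which imports nothing and uses a class for readability. Let me know what you think! I also made a separate function for getMaxBirthYear, in case you are in an interview and someone wants you to code it out rather than use built-ins (I used them :) )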

class Person:
  def __init__(self, birth=None, death=None):
    self.birth=birth
    self.death=death

def getPopulationPeak(people):
  maxBirthYear = getMaxBirthYear(people)
  deltas = getDeltas(people, maxBirthYear)
  currentSum = 0 
  maxSum = 0
  maxYear = 0
  for year in sorted(deltas.keys()):
    currentSum += deltas[year]
    if currentSum > maxSum:
      maxSum = currentSum
      maxYear = year
  return maxYear, maxSum

def getMaxBirthYear(people):
  return max(people, key=lambda x: x.birth).birth

def getDeltas(people, maxBirthYear):
  deltas = dict()
  for person in people:
    if person.birth in deltas.keys():
      deltas[person.birth] += 1
    else:
      deltas[person.birth] = 1
    if person.death + 1 in deltas.keys():
      deltas[person.death + 1] -= 1
    elif person.death + 1 not in deltas.keys() and person.death <= maxBirthYear: # We can skip deaths after the last birth year
      deltas[person.death + 1] = -1
  return deltas

testPeople = [
  Person(1750,1802),
  Person(2000,2010),
  Person(1645,1760),
  Person(1985,2002),
  Person(2000,2050),
  Person(2005,2080),
]

print(getPopulationPeak(testPeople))

Answer 6 (score: 0)

How about this:

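from itertools import chain

def max_pop(pop):
    p = 0
    best = (0, 0)  # (year, population); renamed to avoid shadowing the built-in max
    # +1 at each birth year, -1 in the year after each death
    for y, i in sorted(chain.from_iterable([((b, 1), (d + 1, -1)) for b, d in pop])):
        p += i
        if p > best[1]:
            best = (y, p)
    return best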

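It is not affected by the span of years, but it is n log n in |pop| (unless you roll out a radix sort, which would cost ~10n for a thousand-year span and should be faster for |pop| > 1000). You can't have it both ways. A truly general solution would have to scan the input first and decide which algorithm to use based on the measured year span and |pop|.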

Answer 7 (score: 0)

My answer

import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class AlogrimVarsta {

    public static void main(String args[]) {

        int startYear = 1890;
        int stopYear = 2000;

        List<Person> listPerson = new LinkedList<>();

        listPerson.add(new Person(1910, 1940));
        listPerson.add(new Person(1920, 1935));
        listPerson.add(new Person(1900, 1950));
        listPerson.add(new Person(1890, 1920));
        listPerson.add(new Person(1890, 2000));
        listPerson.add(new Person(1945, 2000));

        Map<Integer, Integer> mapPersoaneCareAuTrait = new LinkedHashMap<>();

        for (int x = startYear; x <= stopYear; x++) {
            mapPersoaneCareAuTrait.put(x, 0);
        }

        for (int x = startYear; x <= stopYear; x++) {
            for (Person per : listPerson) {

                int value = mapPersoaneCareAuTrait.get(x);

                if (per.getBorn() == x) {
                    mapPersoaneCareAuTrait.put(x, value + 1);
                    continue;
                }

                if (per.getDie() == x) {
                    mapPersoaneCareAuTrait.put(x, value + 1);
                    continue;
                }

                if ((per.getDie() - per.getBorn() > per.getDie() - x) && (per.getDie() - x > 0)) {
                    mapPersoaneCareAuTrait.put(x, value + 1);
                    continue;
                }

            }
        }

        for (Map.Entry<Integer, Integer> mapEntry : mapPersoaneCareAuTrait.entrySet()) {
            System.out.println("an " + mapEntry.getKey() + " numar " + mapEntry.getValue());            
        }

    }

    static class Person {
        final private int born;
        final private int die;

        public Person(int pBorn, int pDie) {
            die = pDie;
            born = pBorn;
        }

        public int getBorn() {
            return born;
        }

        public int getDie() {
            return die;
        }
    }
}

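Answer 8 (score: -2)

I found the following code, which is exactly what you need.

Let's say the range of years is 1900 - 2000

Steps of the algorithm

  1. Construct an array X of 100 integers (all initialized to zero; 101 integers if the year 2000 is included).
  2. For each of the N people, increment X[birth year - 1900] by 1 and decrement X[death year - 1900] by 1.
  3. Iterate through X, keeping a running sum of the elements. The year with the largest population is 1900 plus the index at which the running sum is largest.
  4. Code (Python, as requested)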

    def year_with_max_population(people):
        # 101 slots so that the year 2000 (index 100) is a valid index
        population_changes = [0] * 101
        for person in people:
            population_changes[person.birth_year - 1900] += 1
            # As in step 2, the decrement is applied in the death year itself
            population_changes[person.death_year - 1900] -= 1
        max_population = 0
        max_population_index = 0
        population = 0
        # Running sum of the changes gives the population in each year
        for index, population_change in enumerate(population_changes):
            population += population_change
            if population > max_population:
                max_population = population
                max_population_index = index
        return 1900 + max_population_index

    Credit to 'Brian Schmitz' here