Question

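I am currently working through a small programming test that was given to me ahead of an actual interview.

I had to remove the details of the problem itself; you can easily find them at the link below.

So I tried several intuitive approaches, with more or less success. While researching, I found an example on GitHub (https://github.com/miracode/Machine-Works) that uses some kind of nodes. I decided to implement it in my own script to test it. It turns out to be much faster than mine, but it still cannot handle the whole input set. That input is a 25 MB txt file with 54 different cases, some of which have more than 10,000 machines per test case. I found the same solution (and only this one) in other Git repositories.

So I can understand that my own script crashes my PC before finishing the large input test, but it is quite surprising that a solution taken from Git cannot process the test input either.

I have 16 GB of RAM on my computer and I have never seen it crash like that, even when processing larger datasets.

Here is a copy of my implementation of that solution: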

"""Third version of the project.

Implements a decision object, inspired by a script found on Git.
"""
from load_input2 import load
import time

PATH = 'input2.txt'


class TestCase(object):
    def __init__(self, C, D, machines=()):
        self.budget = C
        self.days = D
        self.machines = sorted([Machine(i[0], i[1], i[2], i[3])
                         for i in machines], key = lambda x : x.day)

    def run(self):
        choice = Decision()
        (choice.machine, choice.budget, choice.day) = (None, self.budget, 0)

        choices = [choice, ]

        for machine in self.machines:

            next_choice = []
            for choice in choices:
                choice.to_buy, choice.not_buy = Decision(), Decision()
                choice.to_buy.day, choice.not_buy.day = machine.day, machine.day
                potential_budget = choice.budget + choice.machine.p_sell + choice.machine.daily_profit * \
                    (machine.day - choice.day -
                     1) if choice.machine else choice.budget

                if machine.p_buy <= potential_budget:

                    choice.to_buy.budget = potential_budget - machine.p_buy
                    choice.to_buy.machine = machine
                    next_choice.append(choice.to_buy)

                choice.not_buy.machine = choice.machine

                try:
                    choice.not_buy.budget = choice.budget + \
                        choice.machine.daily_profit * \
                        (machine.day - choice.day)
                except AttributeError:
                    choice.not_buy.budget = choice.budget
                next_choice.append(choice.not_buy)

            choices = next_choice


        results = []
        for choice in choices:
            try:
                results.append(choice.budget +
                               choice.machine.daily_profit *
                               (self.days -
                                choice.day) +
                               choice.machine.p_sell)
            except AttributeError:
                results.append(choice.budget)
        return(max(results))


class Machine(object):
    def __init__(self, day, p_buy, p_sell, daily_profit):
        self.p_buy, self.p_sell = p_buy, p_sell
        self.day, self.daily_profit = day, daily_profit


class Decision(object):
    def __init__(self):
        self.to_buy, self.not_buy = None, None
        self.machine, self.budget = None, None
        self.day = None


def main():
    start = time.time()
    testcases = load(PATH)
    for (case_data, data) in testcases:
        # case_data is (case number, N, C, D); data holds the machine rows
        dolls = TestCase(case_data[2], case_data[3], data).run()
        print(
            "Case {}: {}".format(case_data[0], dolls))
    # Elapsed time is "now minus start", otherwise the value is negative
    print("Done in", time.time() - start)


if __name__ == '__main__':
    main()

load_input2.py:

def load(path):
    with open(path) as fil:
        inp = fil.read().split('\n')  # Read the whole input file
    testcases = {}
    count = 1
    for line in inp:  # Group the machine lines under their test case header
        split = [int(i) for i in line.split()]
        if len(split) == 3:
            # Header line with three values: start a new test case
            case = tuple([count] + split)
            testcases[case] = []
            count += 1
        elif len(split) > 0:
            # Machine line: day, p_buy, p_sell, daily_profit
            testcases[case].append(split)
    # Return (case, machines) pairs ordered by case number
    return sorted(testcases.items(), key=lambda x: x[0][0])

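If you have any suggestions or hints, I would welcome them!

I don't really want a ready-to-paste solution, since this is an interview question and I want it to honestly reflect my abilities, even though I do count searching within this amazing community as one of those abilities ;)

Thanks for caring!

Edit: the whole input test set is available here: https://gitlab.com/InfoCode/Coding_Problems/raw/master/MachineWork/input.txt

Edit: here is the original script I used. It is certainly not optimal, but I believe it does far fewer computations on really large test cases. Its process is different and is explained at the top.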

"""First version of the project.

Uses a day-by-day approach to estimate the best behavior.
On each day, this algorithm will:
- Look at each machine that can be bought on this day and take the most
  profitable one in the long run.
- During the whole depreciation period (the time required for the machine
  to be cost-effective), check that the purchase of the machine won't
  interfere with some more profitable machine.
- Buy the machine and move along to the next day.
This algorithm allows faster execution for inputs with large sets of
machines to be sold.

I have not yet found how to prevent it from choosing machine 2 in case
(6,10,20), which leads to a decrease of 1 dollar in profits.
"""

from load_input2 import load

PATH = 'input2.txt'

# Defining the TestCase class which is used for iterating through the days


class TestCase(object):
    def __init__(self, C, D, machines=()):
        self.budget = C
        self.days = D
        self.machines = [Machine(self, i[0], i[1], i[2], i[3])
                         for i in machines]
        self.choices = []

    # Main function for running the iteration through the days
    def run_case(self):
        for i in range(1, self.days + 1):
            best = self.best_machine_on_day(i)
            if (best is not None and self.should_buy(best[0], i)):
                self.choices.append(best)
        if len(self.choices) > 0:
            self.choices[-1][0].buy_sell(self, self.days + 1, sell=True)
        return(self.budget)

    # Function to define the best machine on a specific day
    def best_machine_on_day(self, n):
        results = []
        for machine in self.machines:
            if n == machine.day:
                results.append(machine.day_based_potential(self, n))
        if len(results) == 0:
            return(None)
        elif len(results) == 1:
            return(results[0])
        else:
            return(max(results, key=lambda x: x[2] * (self.days - n) - x[1]))

    # Decide whether or not to buy a machine by taking a short look at the
    # days ahead
    def should_buy(self, machine, n):
        potential_budget = self.budget + self.choices[-1][0].p_sell + self.choices[-1][0].daily_profit * (
            n - self.choices[-1][0].day - 1) if len(self.choices) > 0 else self.budget
        # Integer division keeps the result usable as a range() bound
        day_to_cover_cost = (
            machine.cost // machine.daily_profit
            if machine.cost % machine.daily_profit != 0
            else machine.cost // machine.daily_profit - 1)
        for day in range(day_to_cover_cost):
            next_day = self.best_machine_on_day(n + day + 1)
            if next_day is not None:
                day_to_buy = next_day[0].day
                if (
                    machine.earnings_from_day(
                        self, day_to_buy) < next_day[0].earnings_from_day(
                        self, day_to_buy) or machine.cost >= machine.daily_profit * (
                        next_day[0].day - machine.day)) and next_day[0].p_buy <= potential_budget:
                    return(False)
        if (potential_budget >= machine.p_buy and machine.earnings_from_day(
                self, n) >= machine.p_buy):
            if len(self.choices) > 0:
                self.choices[-1][0].buy_sell(self, n, sell=True)
            machine.buy_sell(self, n)
            return(True)
        else:
            return(False)

# Defining the machine object


class Machine(object):
    def __init__(self, case, day, p_buy, p_sell, daily_profit):
        self.cost = p_buy - p_sell
        self.p_buy, self.p_sell = p_buy, p_sell
        self.day = day
        self.daily_profit = daily_profit

    # To compute the earnings from a starting day n to the end
    def earnings_from_day(self, case, n):
        if self.day <= n <= case.days:
            return((case.days - n) * self.daily_profit - self.cost)
        else:
            return(0)
    # Represent itself method

    def day_based_potential(self, case, n):
        return((self, self.cost, self.daily_profit))
    # Actions on Budget

    def buy_sell(self, case, n, sell=False):
        if sell:
            case.budget += self.p_sell + self.daily_profit * (n - self.day - 1)
        else:
            case.budget -= self.p_buy


def main():
    # Assumes the load() shown above, which returns (case, machines) pairs
    # where case is (case number, N, C, D)
    testcases = load(PATH)
    for case_data, data in testcases:
        dolls = TestCase(case_data[2], case_data[3], data).run_case()
        print(
            "Case {}: {}".format(case_data[0], dolls))


if __name__ == '__main__':
    main()

Answer 1

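Update: Solution

I found that this problem originates from the 2011 ACM-ICPC World Finals (ACM International Collegiate Programming Contest; https://icpc.baylor.edu/worldfinals/problems, Problem F). They also provide the correct test results.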

http://www.csc.kth.se/~austrin/icpc/finals2011solutions.pdf

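In my approach, I use two steps:

1. Some preprocessing is applied to all available machines within one test case. Given an upper-bound heuristic over all existing machines, the preprocessing overestimates the affordability of each machine. Machines that can never be afforded are removed from the set.
2. The search itself follows a back-to-front recursive scheme. It first determines the most desirable machine (the one that yields the highest profit from the day it becomes available to the end of the period) and then follows a DFS (depth-first search) to find a path of affordable machines back to the initial budget. Since the machines are re-evaluated at every step, a solution can be considered optimal as soon as it is found.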
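The preprocessing step could look roughly like the sketch below. This is not the code behind this answer (which has not been posted); the function name `prune_unaffordable`, the tuple layout `(day, p_buy, p_sell, daily_profit)` and the specific bound are assumptions made for illustration. The bound also assumes that a machine never resells for more than its purchase price, so that total wealth on a given day cannot exceed the initial budget plus the best daily profit for every earlier day.

```python
def prune_unaffordable(budget, machines):
    """Drop machines whose price exceeds an optimistic bound on available cash.

    machines: list of (day, p_buy, p_sell, daily_profit) tuples (assumed layout).
    """
    if not machines:
        return []
    # Optimistic assumption: the most profitable machine runs on every earlier day.
    best_rate = max(m[3] for m in machines)
    kept = []
    for day, p_buy, p_sell, daily_profit in machines:
        upper_bound = budget + best_rate * (day - 1)
        if p_buy <= upper_bound:
            kept.append((day, p_buy, p_sell, daily_profit))
    return kept
```

Because the bound is deliberately generous, it only removes machines that no purchase sequence could ever pay for; it does not decide which of the remaining machines are worth buying.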

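Once I get correct results on all test cases, I can post my solution here.

Original answer

As for your task: it seems intractable as posed, i.e. it cannot be computed exhaustively. You will probably need a heuristic that does a directed search with a look-ahead plan (over a look-ahead window of n days) in order to approach the solution efficiently.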
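To make the blow-up concrete: in the loop from the question, every decision can spawn both a "buy" and a "don't buy" branch for each machine, so the list of decisions can double at every step. One possible way to bound it, offered here as an assumption and not as the method described in this answer, is to note that all branches holding the same machine on the same day differ only in budget, so only the richest one needs to survive. The sketch below keeps one state per held machine; the name `best_final_cash` and the tuple layout `(day, p_buy, p_sell, daily_profit)` are hypothetical, and the budget arithmetic mirrors the question's code.

```python
def best_final_cash(budget, days, machines):
    """Best cash after the last day, keeping one state per held machine.

    machines: iterable of (day, p_buy, p_sell, daily_profit) tuples (assumed layout).
    """
    ordered = sorted(machines)
    # Maps the index of the held machine (None = no machine) to the best cash
    # known for that holding. All states are valued at the same day.
    states = {None: budget}
    today = 0
    for index, (day, p_buy, p_sell, daily_profit) in enumerate(ordered):
        elapsed = day - today
        next_states = {}
        best_after_buying = None
        for held, cash in states.items():
            if held is None:
                cash_if_kept = cash_if_sold = cash
            else:
                _, _, held_sell, held_profit = ordered[held]
                cash_if_kept = cash + held_profit * elapsed
                # A machine sold today does not produce today.
                cash_if_sold = cash + held_sell + held_profit * (elapsed - 1)
            next_states[held] = cash_if_kept
            if cash_if_sold >= p_buy:
                after = cash_if_sold - p_buy
                if best_after_buying is None or after > best_after_buying:
                    best_after_buying = after
        if best_after_buying is not None:
            next_states[index] = best_after_buying
        states, today = next_states, day
    results = []
    for held, cash in states.items():
        if held is None:
            results.append(cash)
        else:
            _, _, held_sell, held_profit = ordered[held]
            # The held machine runs until the end and is sold afterwards.
            results.append(cash + held_profit * (days - today) + held_sell)
    return max(results)
```

With this collapse there are at most N + 1 states after each machine, so the work grows quadratically with the number of machines instead of exponentially. That may still be slow in pure Python for the largest cases, but memory use stays small.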

As for reading the whole file, how about using a generator while keeping the file handle open? Something like this:

def as_int_list(line):
    return [int(i) for i in line.strip().split()]


def read_test_case(filehandle):
    # Read one header line "n c d", then the n machine lines that follow it
    n, c, d = as_int_list(filehandle.readline())
    m = []
    while len(m) < n:
        m.append(as_int_list(filehandle.readline()))
    yield (n, c, d, m)


if __name__ == '__main__':
    localfile = 'testcases.txt'

    no = 0
    with open(localfile, 'r') as fh:
        while no < 5:
            # next() works on both Python 2 and Python 3 generators
            case = next(read_test_case(fh))
            print(case)
            no += 1

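Note that I limited the number of test cases to read to 5, but you could read until EOFError or StopIteration (I have not tested this on the whole file yet, but you will surely figure it out).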
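A variant that walks through the whole file without a fixed limit might look like the following. It is a minimal sketch under assumptions: the name `iter_test_cases` is hypothetical, each case is assumed to start with a three-value header line, and the first line that is not such a header (including end of file or a blank line) is treated as the end of the input.

```python
import io


def iter_test_cases(filehandle):
    """Yield (n, c, d, machines) tuples until no further header line is found."""
    while True:
        header = filehandle.readline().split()
        if len(header) != 3:
            return  # end of file, blank line, or malformed header: stop
        n, c, d = (int(value) for value in header)
        # Read exactly n machine lines for this case
        machines = [
            [int(value) for value in filehandle.readline().split()]
            for _ in range(n)
        ]
        yield n, c, d, machines


# Usage with an in-memory file; a real file handle works the same way.
sample = io.StringIO("2 10 20\n1 5 2 3\n4 6 1 2\n")
for case in iter_test_cases(sample):
    print(case)
```

Only one test case is held in memory at a time, which matters more for a 25 MB input than the parsing itself.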
