Why does my deep learning application need so much RAM?

Time: 2021-02-07 00:56:13

Tags: python memory deep-learning

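I am new to deep learning. I took a deep learning course on Coursera and downloaded its application to run on my own dataset. It turns out to need far more memory than I expected: I tried it on Google Colab with 25 GB of RAM, and it still does not work.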

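A different dataset with 200 images of 64 * 64 pixels works perfectly. My dataset has 500 images of 800 * 800 pixels, and it crashes because it runs out of RAM. I think that is the problem. Now I have two questions: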

  1. How can I optimize this application so that it can train on my dataset?
  2. Is there a better NN model that would avoid the RAM problem?

    import numpy as np
    import matplotlib.pyplot as plt

    # Helper functions (initialize_parameters, linear_activation_forward,
    # compute_cost, linear_activation_backward, update_parameters) are defined
    # elsewhere in the course notebook.

    # GRADED FUNCTION: two_layer_model
    def two_layer_model(X, Y, layers_dims, learning_rate = 0.0075, num_iterations = 3000, print_cost=False):
        """
        Implements a two-layer neural network: LINEAR->RELU->LINEAR->SIGMOID.

        Arguments:
        X -- input data, of shape (n_x, number of examples)
        Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
        layers_dims -- dimensions of the layers (n_x, n_h, n_y)
        num_iterations -- number of iterations of the optimization loop
        learning_rate -- learning rate of the gradient descent update rule
        print_cost -- If set to True, this will print the cost every 100 iterations

        Returns:
        parameters -- a dictionary containing W1, W2, b1, and b2
        """
        grads = {}
        costs = []                              # to keep track of the cost
        m = X.shape[1]                           # number of examples
        (n_x, n_h, n_y) = layers_dims
        # Initialize parameters dictionary, by calling one of the functions you'd previously implemented
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = initialize_parameters(n_x, n_h, n_y)
        ### END CODE HERE ###
        # Get W1, b1, W2 and b2 from the dictionary parameters.
        W1 = parameters["W1"]
        b1 = parameters["b1"]
        W2 = parameters["W2"]
        b2 = parameters["b2"]
        # Loop (gradient descent)
        for i in range(0, num_iterations):
            # Forward propagation: LINEAR -> RELU -> LINEAR -> SIGMOID. Inputs: "X, W1, b1, W2, b2". Output: "A1, cache1, A2, cache2".
            ### START CODE HERE ### (≈ 2 lines of code)
            A1, cache1 = linear_activation_forward(X, W1, b1, activation='relu')
            A2, cache2 = linear_activation_forward(A1, W2, b2, activation='sigmoid')
            ### END CODE HERE ###
            # Compute cost
            ### START CODE HERE ### (≈ 1 line of code)
            cost = compute_cost(A2, Y)
            ### END CODE HERE ###
            # Initializing backward propagation
            dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))
            # Backward propagation. Inputs: "dA2, cache2, cache1". Outputs: "dA1, dW2, db2; also dA0 (not used), dW1, db1".
            ### START CODE HERE ### (≈ 2 lines of code)
            dA1, dW2, db2 = linear_activation_backward(dA2, cache2, activation='sigmoid')
            dA0, dW1, db1 = linear_activation_backward(dA1, cache1, activation='relu')
            ### END CODE HERE ###
            # Set grads['dW1'] to dW1, grads['db1'] to db1, grads['dW2'] to dW2, grads['db2'] to db2
            grads['dW1'] = dW1
            grads['db1'] = db1
            grads['dW2'] = dW2
            grads['db2'] = db2
            # Update parameters.
            ### START CODE HERE ### (approx. 1 line of code)
            parameters = update_parameters(parameters, grads, learning_rate)
            ### END CODE HERE ###
            # Retrieve W1, b1, W2, b2 from parameters
            W1 = parameters["W1"]
            b1 = parameters["b1"]
            W2 = parameters["W2"]
            b2 = parameters["b2"]
            # Print and record the cost every 100 iterations
            if print_cost and i % 100 == 0:
                print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))
                costs.append(cost)
        # plot the cost
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('iterations (per hundreds)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()
        return parameters
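For a sense of scale, the input matrix `X` passed to `two_layer_model` can be sized by hand. The sketch below assumes 3-channel (RGB) images flattened into columns and stored as float64, NumPy's default float type. Neither assumption is stated in the question, and `input_matrix_bytes` is a hypothetical helper used only for this estimate.

```python
def input_matrix_bytes(num_images, height, width, channels=3, bytes_per_value=8):
    """Bytes needed for X of shape (n_x, num_images).

    channels=3 and bytes_per_value=8 (float64) are assumptions,
    not values given in the question.
    """
    n_x = height * width * channels  # features per flattened image
    return n_x * num_images * bytes_per_value


small = input_matrix_bytes(200, 64, 64)    # the dataset that works
large = input_matrix_bytes(500, 800, 800)  # the dataset that crashes

print(f"200 images of 64x64:   {small / 1e6:.1f} MB")   # 19.7 MB
print(f"500 images of 800x800: {large / 1e9:.2f} GB")   # 7.68 GB
```

Under these assumptions, every additional array with the same shape as `X` costs the same amount again. That would include temporaries created while loading and normalizing the images, and the unused `dA0` gradient if `linear_activation_backward` materializes it. A few such arrays could be enough to exceed 25 GB.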

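This function is the main routine that trains on my dataset. Running the statement below should print output like the following: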

parameters = two_layer_model(train_x, train_y, layers_dims = (n_x, n_h, n_y), num_iterations = 2500, print_cost=True)


Cost after iteration 100: 0.6464320953428849
Cost after iteration 200: 0.6325140647912678
Cost after iteration 300: 0.6015024920354665
Cost after iteration 400: 0.5601966311605747
Cost after iteration 500: 0.5158304772764729
Cost after iteration 600: 0.4754901313943325
Cost after iteration 700: 0.43391631512257495
Cost after iteration 800: 0.4007977536203887
Cost after iteration 900: 0.35807050113237976
Cost after iteration 1000: 0.33942815383664127
Cost after iteration 1100: 0.30527536361962654
Cost after iteration 1200: 0.2749137728213016
Cost after iteration 1300: 0.24681768210614846
Cost after iteration 1400: 0.19850735037466097
Cost after iteration 1500: 0.17448318112556657
Cost after iteration 1600: 0.1708076297809689
Cost after iteration 1700: 0.11306524562164715
Cost after iteration 1800: 0.09629426845937145
Cost after iteration 1900: 0.08342617959726863
Cost after iteration 2000: 0.07439078704319078
Cost after iteration 2100: 0.06630748132267933
Cost after iteration 2200: 0.0591932950103817
Cost after iteration 2300: 0.05336140348560554
Cost after iteration 2400: 0.04855478562877016

However, my application crashed right after printing the first cost line, the one for iteration 100. This is the application log from Colab:

Timestamp Level Message
Feb 6, 2021, 6:31:31 PM WARNING WARNING:root:kernel 88a81184-be18-4889-9d35-6fccdf6f1aa6 restarted
Feb 6, 2021, 6:31:31 PM INFO KernelRestarter: restarting kernel (1/5), keep random ports
Feb 6, 2021, 6:31:25 PM WARNING tcmalloc: large alloc 7925760000 bytes == 0x5f3bd8000 @ 0x7fd3aaaa81e7 0x7fd3a1fa741e 0x7fd3a1ff7c2b 0x7fd3a1ff8240 0x7fd3a1ff039a 0x7fd3a22170a5 0x7fd3a2095928 0x7fd3a209985d 0x566f73 0x59fd0e 0x7fd3a1fe4ea7 0x50a12f 0x50beb4 0x507be4 0x509900 0x50a2fd 0x50beb4 0x5095c8 0x50a2fd 0x50beb4 0x507be4 0x509900 0x50a2fd 0x50cc96 0x507be4 0x509900 0x50a2fd 0x50cc96 0x507be4 0x5161c5 0x50a12f
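One possible reading of the tcmalloc line: the requested size divides evenly into whole 800 * 800 images if each image has 3 channels and each value is a float64. The check below is an inference from the numbers only. The log does not say which array was being allocated, and the channel count and dtype are assumptions.

```python
alloc_bytes = 7_925_760_000        # size reported in the tcmalloc warning

values_per_image = 800 * 800 * 3   # assumed: RGB image, flattened
bytes_per_value = 8                # assumed: float64

# How many whole images fit into the allocation, and what is left over?
images, remainder = divmod(alloc_bytes, values_per_image * bytes_per_value)
print(images, remainder)           # 516 0
```

A remainder of zero would be consistent with a single float64 array shaped like `X` with 516 columns, close to the roughly 500 images mentioned in the question. If that reading is right, the crash comes from allocating one more full-size copy of the input rather than from the weights of the model.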

The whole application code can be found here


0 answers:
