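(Looking for an efficient alternative to the Matlab code below.) I want to build a matrix "data" of size (N-60) x 60 from the Matlab code below. Because of the for loop, and because N is very large, this takes a lot of computation time. Can someone recommend a faster way to build the data matrix?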
import random
line_we_want = random.randrange(5)
with open('keywords.txt', 'r') as input_file:
    content = input_file.readlines()
for number_of_line, line in enumerate(content):
    if number_of_line == line_we_want:
        print(line)
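Thanks!
Answer 0 (score: 0)
As Sardar said, preallocating data will give you the biggest improvement. However, you can also remove the for loop with some clever indexing. The comments should explain most of what I did.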
n = 1e6;
% Modified a to be an incrementing list to better understand how data is
% constructed
a = (1:n)';
order = 60;
%% Original code with pre allocation added
data = zeros(n-order+1, order);
for i = order:length(a)
data(i-order+1,:) = a([i:-1:i-order+1])';
end
%% Vectorized code
% The above code was vectorized by building a large array to index into
% the a vector with.
% Get the indices of a going down the first column of data
% Went down the column instead of across the row to avoid a transpose
% after the reshape function
idx = uint32(order:n);
% As we go across the columns we use the same indexing but -1 as we move to
% the right, so build the offsets 0:order-1 and subtract them. The offsets
% are kept non-negative because uint32 saturates negative values to 0.
% Then expand to the correct number of elements using kron
offset = kron(uint32(0:order-1), ones(1, n-order+1, 'uint32'));
% Replicate the column indexing for how many columns we have and subtract
% the offset
fullIdx = repmat(idx, 1, order) - offset;
% Then use the large indexing array to get all the data as a large vector
% and then reshape it to the matrix
data2 = reshape(a(fullIdx), n-order+1, order);
% idx, offset, and fullIdx will take up a fair amount of memory so it is
% probably best to clear them
clear idx offset fullIdx;
assert(isequal(data, data2));
Note: using uint32 is not required, but it does reduce memory usage and gives a small performance improvement.
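
The same index matrix can probably be built more compactly with implicit expansion. The following is only a sketch: it assumes MATLAB R2016b or later, and the small values of n and order, as well as the names idxMat, data3 and ref, are chosen here purely for illustration. Entry (r, c) of the index matrix is (order + r - 1) - (c - 1), which is what subtracting a row vector from a column vector produces.

```matlab
% Small illustrative sizes (hypothetical, chosen so the result is easy to inspect)
n = 8;
order = 3;
a = (1:n)';

% Column vector minus row vector expands implicitly into a
% (n-order+1)-by-order matrix of indices into a.
idxMat = (order:n)' - (0:order-1);

% Indexing a vector with a matrix returns a matrix of the same shape,
% so no reshape is needed.
data3 = a(idxMat);

% Reference result from the preallocated loop, used only as a check
ref = zeros(n-order+1, order);
for i = order:n
    ref(i-order+1, :) = a(i:-1:i-order+1)';
end
assert(isequal(data3, ref));

% The first row is a(3), a(2), a(1), i.e. [3 2 1]
disp(data3(1, :));
```

On releases before R2016b, `bsxfun(@minus, (order:n)', 0:order-1)` should give the same index matrix. The index matrix here is double, so it uses more memory than the uint32 version; whether that matters depends on n.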