How to write a pipelined matrix multiplier: 4x5 times 5x3 gives 4x3

Time: 2019-04-29 07:53:16

Tags: vhdl

Consider the following matrix multiplier, where the output C (4x3) is the product of the two input matrices A (4x5) and B (5x3).

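The matrix multiplication is finely pipelined so that in every clock cycle one product aij * bjk is generated and added to the partial product cik held at position P(i,k). A complete cik is produced every 5 clock cycles. Assume that the aij, bjk and cik terms are all 32-bit-wide integers.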
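The accumulate step described above can be sketched as a single multiply-accumulate cell. This is only a minimal sketch under stated assumptions: the entity name `p_block`, the `first` strobe marking the first of the five terms, and the choice of `unsigned` operands are all hypothetical, and the 64-bit product is truncated to its low 32 bits because cik is specified as 32 bits wide.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical P(i,k) cell: one product per clock, accumulated into c_ik.
entity p_block is
    port (
        clk   : in  std_logic;
        first : in  std_logic;  -- assumed strobe: '1' together with the j = 0 term
        a_ij  : in  unsigned(31 downto 0);
        b_jk  : in  unsigned(31 downto 0);
        c_ik  : out unsigned(31 downto 0)
    );
end entity p_block;

architecture rtl of p_block is
    signal acc : unsigned(31 downto 0) := (others => '0');
begin
    process (clk)
        variable prod : unsigned(31 downto 0);
    begin
        if rising_edge(clk) then
            -- 32 x 32 gives 64 bits; keep only the low 32 bits (assumption).
            prod := resize(a_ij * b_jk, 32);
            if first = '1' then
                acc <= prod;          -- start a new dot product
            else
                acc <= acc + prod;    -- add to the partial product
            end if;
        end if;
    end process;

    c_ik <= acc;
end architecture rtl;
```

With `first` asserted once every five cycles, `c_ik` holds a complete sum on the cycle after the fifth term has been applied.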

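1) Write VHDL code for (a) the P(i,k) block, (b) the RAM from which A and B are read and into which C is written (this RAM has 2 read ports and 1 write port), and (c) the FIFOs (which delay the application of the aij and bjk terms and delay the writing of the cik terms to the RAM).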

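2) Write the top-level VHDL code for the interconnection of P blocks shown in the figure above, inserting the appropriate FIFOs so that the aij, bjk and cik terms are applied with the correct timing. The RAM holding matrices A, B and C should also be part of this top-level design.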

3) Write a testbench for this.
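The RAM and FIFO pieces from task 1 might look roughly like the sketch below. The entity names `ram_2r1w` and `delay_fifo`, the generics, the synchronous (one-cycle) read latency, and the idea of storing A, B and C in one 64-word memory are assumptions for illustration, not part of the assignment text. The FIFO is modelled as a fixed-depth delay line, since its stated job is only to delay terms by a known number of cycles.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical RAM with 2 read ports and 1 write port, synchronous reads.
entity ram_2r1w is
    generic (
        ADDR_BITS : natural := 6  -- 64 words; A, B and C need 20 + 15 + 12 = 47
    );
    port (
        clk     : in  std_logic;
        we      : in  std_logic;
        waddr   : in  unsigned(ADDR_BITS - 1 downto 0);
        wdata   : in  unsigned(31 downto 0);
        raddr_a : in  unsigned(ADDR_BITS - 1 downto 0);
        raddr_b : in  unsigned(ADDR_BITS - 1 downto 0);
        rdata_a : out unsigned(31 downto 0);
        rdata_b : out unsigned(31 downto 0)
    );
end entity ram_2r1w;

architecture rtl of ram_2r1w is
    type mem_t is array (0 to 2**ADDR_BITS - 1) of unsigned(31 downto 0);
    signal mem : mem_t := (others => (others => '0'));
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if we = '1' then
                mem(to_integer(waddr)) <= wdata;
            end if;
            -- Read data appears one clock after the address is applied.
            rdata_a <= mem(to_integer(raddr_a));
            rdata_b <= mem(to_integer(raddr_b));
        end if;
    end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical fixed-depth FIFO used as a delay line of DEPTH clock cycles.
entity delay_fifo is
    generic (
        DEPTH : positive := 1
    );
    port (
        clk  : in  std_logic;
        din  : in  unsigned(31 downto 0);
        dout : out unsigned(31 downto 0)
    );
end entity delay_fifo;

architecture rtl of delay_fifo is
    type pipe_t is array (0 to DEPTH - 1) of unsigned(31 downto 0);
    signal pipe : pipe_t := (others => (others => '0'));
begin
    process (clk)
    begin
        if rising_edge(clk) then
            pipe(0) <= din;
            for n in 1 to DEPTH - 1 loop
                pipe(n) <= pipe(n - 1);  -- shift every stage by one
            end loop;
        end if;
    end process;

    dout <= pipe(DEPTH - 1);
end architecture rtl;
```

The right `DEPTH` for each FIFO depends on the P-block arrangement in the figure, so it is left as a generic here.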

Here is what I have: mat_ply.vhd

library IEEE;
use IEEE.STD_LOGIC_1164.all;
use ieee.numeric_std.all;

package mat_ply is

    type t11 is array (0 to 4) of unsigned(15 downto 0);
    type t1 is array (0 to 3) of t11; --4*5 matrix
    type t22 is array (0 to 2) of unsigned(15 downto 0);
    type t2 is array (0 to 4) of t22; --5*3 matrix
    type t33 is array (0 to 2) of unsigned(31 downto 0);
    type t3 is array (0 to 3) of t33; --4*3 matrix as output

    function matmul ( a : t1; b:t2 ) return t3;

end mat_ply;

package body mat_ply is

    function matmul ( a : t1; b:t2 ) return t3 is
    -- The loop indices i, j and k are declared implicitly by the for loops below.
    variable prod : t3:=(others => (others => (others => '0')));

begin
    for i in 0 to 3 loop --(number of rows in the first matrix - 1)
        for j in 0 to 2 loop --(number of columns in the second matrix - 1)
            for k in 0 to 4 loop --(number of rows in the second matrix - 1)

                prod(i)(j) := prod(i)(j) + (a(i)(k) * b(k)(j));

            end loop;
        end loop;
    end loop;
return prod;

end matmul;

end mat_ply;

And the testbench:

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;
library work;
use work.mat_ply.all;

ENTITY mat_tb IS
END mat_tb;

ARCHITECTURE behavior OF mat_tb IS
--signals declared and initialized to zero.
    signal clk : std_logic := '0';
    signal a : t1:=(others => (others => (others => '0')));
    signal b : t2:=(others => (others => (others => '0')));
    signal x: unsigned(15 downto 0):=(others => '0'); --base value used to build the test matrices
    signal prod : t3:=(others => (others => (others => '0')));
-- Clock period definitions
constant clk_period : time := 1 ns;

BEGIN
-- Instantiate the Unit Under Test (UUT)
    uut: entity work.test_mat PORT MAP (clk,a,b,prod);

-- Clock process definitions
    clk_process :process
        begin
            clk <= '0';
        wait for clk_period/2;
            clk <= '1';
        wait for clk_period/2;
    end process;

-- Stimulus process
    stim_proc: process
        begin
--first set of inputs..
            a <= ((x,x+1,x+2,x+3,x+4),(x+2,x,x+1,x,x),(x+1,x+5,x,x,x),(x+1,x+1,x,x,x));
            b <= ((x,x+1,x+4),(x,x+1,x+3),(x,x+2,x+3),(x,x+1,x+3),(x,x+1,x+3));
        wait for 2 ns;
--second set of inputs can be given here and so on.
        wait; --suspend here so the stimulus is not re-applied every 2 ns
    end process;

END;

I don't get any errors, but I don't know whether my code is correct.
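One way to check at least the `matmul` function is a self-checking testbench that compares its result with hand-computed values. The sketch below assumes the `mat_ply` package above is compiled into `work`. The names `matmul_check_tb`, `int_row`, `int_mat` and `EXPECTED` are hypothetical, and the expected values were worked out by hand for the stimulus above with x = 0. It does not exercise `test_mat` or any pipelined design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.mat_ply.all;

entity matmul_check_tb is
end entity matmul_check_tb;

architecture sim of matmul_check_tb is
    type int_row is array (0 to 2) of integer;
    type int_mat is array (0 to 3) of int_row;
    -- Hand-computed A * B for the stimulus matrices with x = 0.
    constant EXPECTED : int_mat :=
        ((0, 12, 30), (0, 4, 11), (0, 6, 19), (0, 2, 7));
begin
    check : process
        constant x : unsigned(15 downto 0) := (others => '0');
        variable a : t1;
        variable b : t2;
        variable c : t3;
    begin
        a := ((x,x+1,x+2,x+3,x+4),(x+2,x,x+1,x,x),(x+1,x+5,x,x,x),(x+1,x+1,x,x,x));
        b := ((x,x+1,x+4),(x,x+1,x+3),(x,x+2,x+3),(x,x+1,x+3),(x,x+1,x+3));
        c := matmul(a, b);

        -- Compare every element and report the position of any mismatch.
        for i in 0 to 3 loop
            for k in 0 to 2 loop
                assert to_integer(c(i)(k)) = EXPECTED(i)(k)
                    report "mismatch at c(" & integer'image(i) & ")(" &
                           integer'image(k) & "): got " &
                           integer'image(to_integer(c(i)(k)))
                    severity error;
            end loop;
        end loop;

        report "matmul check finished" severity note;
        wait;  -- stop the process so the simulation can end
    end process;
end architecture sim;
```

If no assertion fires, the combinational reference function agrees with the hand calculation. The same comparison loop could later be pointed at the values the pipelined design writes to the C RAM.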

0 answers:

No answers