快速线性列表,不包括重复

时间:2011-06-14 21:49:13

标签: delphi list optimization language-agnostic conways-game-of-life

我有以下代码:

procedure TCellBlock.GeneratePtoQ;
var
  x,y: integer;
  i: integer;
  Change: cardinal;
begin
  i:= 0;
  //Walk the grid of changed (alive) cells 
  for x:= GridMaxX downto 1 do begin
    for y:= GridMaxY downto 1 do begin
      if Active[cIndexP][x, y] then begin
        Active[cIndexP][x,y]:= false;

        //Put active items on the stack.
        ToDo[i]:= x shl 16 or y;
        Inc(i);
      end; {if}
    end; {for y}
  end; {for x}
  while i > 0 do begin
    Dec(i);
    y:= ToDo[i] and $FFFF;
    x:= ToDo[i] shr 16;

    //Calculate the cell, Change = (oldval XOR newval)
    Change:= Grid[x,y].GeneratePtoQ;

    //Mark the cells in the grid that need to be recalculated next generation.
    Active[cIndexQ][x,y]:= Active[cIndexQ][x,y] or (Change <> 0);
    Active[cIndexQ][x+1,y+1]:= Active[cIndexQ][x+1,y+1] or ((Change and $cc000000) <> 0);
    Active[cIndexQ][x+1,y]:= Active[cIndexQ][x+1,y] or ((Change and $ff000000) <> 0);
    Active[cIndexQ][x,y+1]:= Active[cIndexQ][x,y+1] or ((Change and $cccccccc) <> 0);
  end; {while}
end;

以上是计算conway game of life的测试程序的代码片段 代码需要尽可能快。为此我正在尝试不同的方法。

它走过一个活跃的细胞网格,看看哪些细胞是活跃的并放置它们 在堆栈上。
接下来,它处理堆栈中的项目并查看哪些单元格已更改 如果单元格已更改,则会将更改更新到网格中以供下一代使用。

我将单元格存储在32位基数(4位Y,8位X)中,而P(偶数)代相对于Q(奇数)代数偏移1,1个像素,这样我只需要将3个邻居带入帐户而不是8。

问题
我想摆脱网格,我只想处理堆栈 如何实现消除重复的堆栈?

请注意,它需要尽快 ,并且我不会使用肮脏的技巧来获得它。

5 个答案:

答案 0 :(得分:1)

如果我理解你的要求,你希望堆栈没有重复值。我不是德尔福人,但如果是java,我会创建一个hashmap / map树并将每个值添加到地图中,然后在将它添加到堆栈之前检查它是否已经在哈希中。你也可以添加哈希迭代它的所有值,但是你将松开哈希的顺序。

答案 1 :(得分:0)

我个人采取完全不同的方法。首先,我不知道你是如何因为使用1,1偏移而不必考虑所有邻居然后我怀疑比特移位技巧使算法更快(通常足够相反,但它可以是内存带宽受限,在这种情况下我们会赢得一点)

所以我只想做一件应该带来最大性能提升的事情:使算法成为多线程。在我们的Quad / Hex / Octacores世界中,担心性能提升几个百分点,而浪费300%或更多,似乎很愚蠢。因此,如果我们忽略活动网格并检查所有字段,算法将是微不足道的,具有一些很好的缩放,特别是因为可以很容易地对算法进行矢量化,但是那不是特别有效,所以我尝试了一些不同的多线程方法只考虑活动单元格的算法。

首先不是摆脱网格,而是将它加倍:一个src和一个dest网格 - 每一轮交换。没有锁定来访问必要的网格,不必担心更新字段时没有陈旧条目(对多线程很重要,我们想要使用缓存)。

现在最简单的解决方案是对活动单元使用某种并发列表结构(不知道delphi库),让每个线程从中窃取并将新的活动单元添加到另一个。通过并发队列的良好无锁实现(基本上无论在{4}中是否替换{4}}或类似的东西都可以非常简单。为了获得更好的性能而不是将单个节点添加到列表中,我考虑将整个块添加到列表中,比如大小为10左右 - 更多的工作以更少的开销,但如果我们使块太大,我们就会失去并行性。

我可以想到其他解决方案,例如为每个线程提供一个活动单元列表(或者更确切地说是一个列表用于所有和不同的偏移)但是我们必须在每次运行之间收集所有新条目(没有太多同步开销)但有些复制到列表中 - 值得一试。我猜想。

答案 2 :(得分:0)

如果您的目标是速度(并且只有速度)。有一些技巧可以加速很多事情。我自己实施的Conway的生命游戏使用这些技巧使其更快。请注意,内存非常昂贵。

  1. 每个单元格都是一个对象
  2. 每个单元格对象包含其X / Y坐标
  3. 每个单元格对象包含Alive邻居数量的“实时”计数器。 (当一个单元打开/关闭时,它会通知它的邻居,以便他们更新计数器。
  4. 要使#3有效,在计算下一代时,不会立即打开/关闭单元格。相反,它们会堆叠到一个列表中,直到计算出所有单元格。
  5. 每个单元格都有一个计数器,指示哪个是他们更改的最后一代。这避免了两次计算同一个细胞。 (我替代堆栈消除了重复)
  6. #5的列表会在下一代上重复使用,因为只有上一代更改的单元格的邻居才能在当前一代中更改。
  7. 我使用一些技巧来加速这一代人。这里列出的一些技巧将比多线程实现更多。但是使用那些和多线程将获得最大的性能。

    至于多线程主题,请阅读Voo的条目。

答案 3 :(得分:0)

我一直在考虑它,我想我有一个解决方案。

一些背景

以下是数据在内存中的布局方式

00 A  08  B  10   18     The bits of Individual int32's are layout like this:
01 |  09  |  11   19     00 04 08 0C 10 14 18 1C    // N-Mask: $33333333
02 |  0A  |  12   1A     01 05 09 0D 11 15 19 1D    // S-Mask: $cccccccc
03 |  0B  |  13   1B     02 06 0A 0E 12 16 1A 1E    // W-Mask: $000000ff
04 |  0C  |  14   1C     03 07 0B 0F 13 17 1B 1F    // E-Mask: $ff000000
05 |  0D  |  15   1D                                //SE-Mask: $cc000000
06 |  0E  |  16   1E                                //NW-Mask: $00000033
07 V  0F  V  17   1F     I can mask of different portions if need be.

-- Figure A: Grid --     -- Figure B: cell --       -- Table C: masks --

我还没有确定构建块的大小,但这是一般的想法。

甚至几代被称为P,奇数代被称为Q

他们像这样错开了

 +----------------+<<<<<<<< P         00 04  08  0C  //I use a 64K lookup
 |+---------------|+                  01 05* 09* 0D  //table to lookup
 ||               ||                  02 06* 0A* 0E  //the inner* 2x2 bits from
 ||               ||                  03 07  0B  0F  //a 4x4 grid.
 +----------------+|                  //I need to do 8 lookups for a 32 bit cell
  +----------------+<<<<<<<< Q

 - Figure D: Cells are staggered -   -- Figure E: lookup --

这种方式在生成P -> Q时,我只需要查看P本身及其S,SE,E邻居,而不是所有8个邻居,同样适用于Q - &gt; P.我只需要看看Q本身及其N,NW和W邻居 另外请注意,这种错开可以节省我翻译查找结果的时间,因为我必须做更少的位移以将结果放到位。

当我循环通过网格(图A)时,我按照图A中所示的顺序遍历单元格(图B)。始终严格按顺序递增一个P循环,在Q循环中始终按降序排列 事实上,Q循环的工作方式与P循环完全相反,这样可以通过尽可能多地重用缓存来加快速度。

我想尽量减少使用指针,因为指针无法预测,也无法按顺序访问(它们遍布整个地方)所以我想使用数组,堆栈和队列尽可能多。

需要跟踪哪些数据
我需要跟踪只有变化的细胞。如果一个单元格(即int32)没有从一代更改为下一代,我将其从考虑中删除 这就是问题中的代码所做的事情。它使用网格来跟踪变化,但我想使用堆栈,而不是网格;并且我想要处理我不想知道稳定或死细胞的活跃细胞。

数据的一些背景
注意细胞本身是如何始终单调增加的。正如它的S邻居,以及E和SE邻居一样。我可以用这个信息作弊。

解决方案
我使用堆栈来跟踪单元本身及其S邻居和队列以跟踪其E和SE邻居,当我完成时我将两者合并。

假设在网格中,以下单元格在我计算出来之后就会变为活动状态:

00, 01, 08 and 15
I make the following two stacks:

stack A    stack B
00         08      a)  -A: Cell 00 itself in stack A and its E-neighbor in B
01         09      a)      Cell 00's S neighbor in stack A and its SE-n'bor in B
02         0A      b)  -B: Cell 01 is already in the stack, we only add S/SE
08         10      c)  -C: Cell 08 goes into the stack as normal
09         11      c)      We'll sort out the merge later.
15         1D      d)  -D: Cell 15 and neighbors go on as usual.
16         1E      d)

Now I push members from stack A and B onto a new stack C so that stack C has
no duplicates and it strictly increasing:

Here's the pseudo code to process the two queues:  

a:= 0; b:= 0; c:=0;
while not done do begin
  if stack[a] <= stack[b] then begin
    stack[c]:= stack[a]; inc(a); inc(c);
    if stack[a] = stack[b] then inc(b);
  end
  else begin
    stack[c]:= stack[b]; inc(b); inc(c);
  end;
end; {while}

甚至更好
我不必实际做两个堆栈和合并作为两个单独的步骤,如果我使A成为堆栈并B一个队列,我可以执行伪代码中描述的第二步和一次通过建立两个堆栈。

注意
当一个单元改变它的S,E或SE边界不需要改变,但我可以使用表C中的掩码测试它,并且只将真正需要在下一代中检查的单元添加到列表中。

好处

  1. 使用这种方案,我只需要在计算单元格时使用活动单元格来走过一个堆栈,因此我不会浪费时间查看死区或非活动区域。
  2. 我只进行顺序内存访问,最大化缓存使用率。
  3. 使用下一代的新更改构建堆栈只需要一个额外的临时队列,我按严格的顺序处理。
  4. 我没有排序和最少的比较。
  5. 我不必跟踪每个单元格的邻居(int32),我只需要跟踪邻居(S,E,SE,N,W,NW)网格,这使内存开销降至最低。
  6. 我不需要跟踪细胞状态,我只需要计算死细胞(一个细胞死了,因为它已经死了,或者因为改变死了。所有活动单元都在我的TODO堆栈中,这节省了簿记时间和内存空间。
  7. 该算法在o(n)时间内运行,其中(n)是活跃细胞的数量,它排除死细胞,稳定细胞和在第2期振荡的细胞。
  8. 我只处理32位红雀,这比使用int16要快得多。

答案 4 :(得分:0)

主要是@Ken,测试程序的完整源代码:

请注意,99.9%的时间用于显示,因为我没有做任何事情 优化它。

我已经创建了一个新的SDI-main应用程序,并在其中发布了代码,因为我很懒,我没有打扰重命名或重新绘制任何控件。

项目文件:sdiapp.dpr

program Sdiapp;

uses
  Forms,
  SDIMAIN in 'SDIMAIN.pas'; {Form1}

{$R *.RES}

begin
  Application.Initialize;
  Application.CreateForm(TForm1, Form1);
  Application.Run;
end.

主要形式:sdimain.pas

unit SDIMAIN;

interface

uses Windows, Classes, Graphics, Forms, Controls, Menus,
  Dialogs, StdCtrls, Buttons, ExtCtrls, ComCtrls, ImgList, StdActns,
  ActnList, ToolWin;


  {--------------------------------------------
  p and q are bit arrays of 16x16 bits, grouped
  as in 8 int32's as follows

  P00 P04 P08 P0c P10 P14 P18 P1c 
  P01 P05 P09 P0d P11 P15 P19 P1d
  P02 P06 P0a P0e P12 P16 P1a P1e
  P03 P07 P0b P0f P13 P17 P1b P1f
  |
  +----> The bits per int32 are grouped as follows

  The int32's are grouped as follows
  P0 P1
  P2 P3
  P4 P5
  P6 P7

  P and Q are staggered as follows:

  +---------------------------------+ <---- P
  | +-------------------------------|-+ <----Q
  | |                               | |
  | |                               | |
  ...                               ...
  | |                               | |
  +-|-------------------------------+ |
    +---------------------------------+

  Generations start counting from 0,
  all even generations are stored in P.
  all odd generations are stored in Q.
  When generating P->Q, the S, SE and E neighbors are checked.
  When generating Q->P, the N, NW and W neighbors are checked.

  The westernmost P edge in a grid is stored inside that grid.
  Ditto for all easternmost Q edges.

  --------------------------------------------}

const
  cClearQState = $fffffff0;
  cClearPState = $fffff0ff;
  cIndexQ = 1;
  cIndexP = 0;

  ChangeSelf = 0;
  ChangeNW = 1;
  ChangeW = 2;
  ChangeN = 3;

  ChangeSE = 1;
  ChangeE = 2;
  ChangeS = 3;

const
  //A Grid is 128 x 128 pixels.
  GridSizeX = 512 div 8;   //should be 128/8, 1024 for testing.
  GridSizeY = GridSizeX * 2; //32 totaal: 16x32x4bytes =  2048 x 2 (p+q) = 4k per block.
  GridMaxX = GridSizeX - 1;
  GridMaxY = GridSizeY - 1;
  NumberOfCells = GridSizeX * GridSizeY;

  CellSizeX = 8;
  CellSizeY = 4;
  CellMaxX = CellSizeX - 1;
  CellMaxY = CellSizeY - 1;


type
  TUnit = Cardinal;
  TBytes = array[0..3] of byte;

  TChange = array[0..3] of boolean;

type
  TCellBlock = class;
  TFlags = record
    case boolean of
    true: (whole: cardinal);
    false: (part: array[0..3] of byte);
  end;

  //TActiveList = array[0..GridMaxX, 0..GridMaxY] of boolean;
  //TActive = array[0..1] of TActiveList;
  TToDoList = array[-1..NumberOfCells] of cardinal; //Padding on both sides.

  TNewRow = TFlags;

  PCell = ^TCell;
  TCell = record
  public
    p: TUnit;
    q: TUnit;
    procedure SetPixel(x,y: integer; InP: Boolean = true);
    function GeneratePtoQ: cardinal; inline;
    function GenerateQtoP: cardinal; inline;
  end;

  //A grid contains pointers to an other grid, a unit or nil.
  //A grid can contain grids (and nils) or units (and nils), but not both.
  PGrid = ^TGrid;
  TGrid = array[0..GridMaxX,0..GridMaxY] of TCell;


  TCellBlock = class(TPersistent)
  private
    FHasCells: boolean;
    FLevel: integer;
    FGrid: TGrid;
    ToDoP: TToDoList;
    ToDoQ: TToDoList;
    PCount: integer;
    QCount: integer;
    FParent: TCellBlock;
    FMyX,FMyY: integer;
    N,W,NW: TCellBlock;
    S,E,SE: TCellBlock;
    procedure GeneratePtoQ; virtual;
    procedure GenerateQtoP; virtual;
    procedure UpdateFlagsPtoQ; virtual;
    procedure UpdateFlagsQtoP; virtual;
    procedure Generate; virtual;
    procedure Display(ACanvas: TCanvas); virtual;
    procedure SetPixel(x,y: integer);
    property Grid: TGrid read FGrid write FGrid;
  public
    constructor Create(AParent: TCellBlock);
    destructor Destroy; override;
    property Parent: TCellBlock read FParent;
    property HasCells: boolean read FHasCells;
    property Level: integer read FLevel;
    property MyX: integer read FMyX;
    property MyY: integer read FMyY;
  end;

  TCellParent = class(TCellBlock)
  private
    procedure GeneratePtoQ; override;
    procedure GenerateQtoP; override;
    //procedure Display(Startx,StartY: integer; ACanvas: TCanvas); override;

  public
    constructor CreateFromChild(AChild: TCellBlock; ChildX, ChildY: integer);
    constructor CreateFromParent(AParent: TCellParent);
    destructor Destroy; override;
  end;

type
  TForm1 = class(TForm)
    ToolBar1: TToolBar;
    ToolButton9: TToolButton;
    ToolButton1: TToolButton;
    ToolButton2: TToolButton;
    ToolButton3: TToolButton;
    ToolButton4: TToolButton;
    ActionList1: TActionList;
    FileNew1: TAction;
    FileOpen1: TAction;
    FileSave1: TAction;
    FileSaveAs1: TAction;
    FileExit1: TAction;
    EditCut1: TEditCut;
    EditCopy1: TEditCopy;
    EditPaste1: TEditPaste;
    HelpAbout1: TAction;
    StatusBar: TStatusBar;
    ImageList1: TImageList;
    Image1: TImage;
    Timer1: TTimer;
    Label1: TLabel;
    procedure FileNew1Execute(Sender: TObject);
    procedure FileSave1Execute(Sender: TObject);
    procedure FileExit1Execute(Sender: TObject);
    procedure Timer1Timer(Sender: TObject);
    procedure FileOpen1Execute(Sender: TObject);
    procedure ToolButton4Click(Sender: TObject);
  private
    MyBlock: TCellBlock;
    MyBitmap: TBitmap;
    BitmapData: array[0..1024,0..(1024 div 32)] of integer;
    procedure InitLookupTable;
    procedure RestartScreen;
  public
    { Public declarations }
  end;



var
  Form1: TForm1;

const
  cLiveCell =      $88888888;
  cLiveVerticalP = $40404040;
  cLiveVerticalQ = $04040404;
  cLiveTop =       $00000088;
  cLiveBottom =    $88000000;
  cLivePCorner =   $00000040;
  cLiveQCorner =   $04000000;

  cUnstableCell =      $22222222;
  cUnstableVerticalP = $10101010;
  cUnstableVerticalQ = $01010101;
  cUnstableTop =       $00000022;
  cUnstableBottom =    $22000000;
  cUnstablePCorner =   $00000010;
  cUnstableQCorner =   $01000000;

  cAllDead = $00000000;
  cAllLive = $ffffffff;

  cLiveRow = $8;
  cLive2x2 = $4;
  cUnstableRow = $2;
  cUnstable8x4 = $22;
  cUnstable2x2 = $1;
  cUnstable2x4 = $11;

  cStateMask: array [0..7] of cardinal =
    ($fffffff0, $ffffff0f, $fffff0ff, $ffff0fff, $fff0ffff, $ff0fffff, $f0ffffff, $0fffffff);

var
  LookupTable: array[0..$FFFF] of byte;
  Generation: int64;

implementation

uses about, sysutils, clipbrd, Math;

{$R *.dfm}

type
  bool = longbool;

procedure getCPUticks(var i : int64);
begin
 asm
   mov ECX,i;
   RDTSC;       //cpu clock in EAX,EDX
   mov [ECX],EAX;
   mov [ECX+4],EDX;
 end;
end;



function IntToBin(AInt: integer): string;
var
  i: integer;
begin
  i:= SizeOf(AInt)*8;
  Result:= StringOfChar('0',i);
  while (i > 0) do begin
    if Odd(AInt) then Result[i]:= '1';
    AInt:= AInt shr 1;
    Dec(i);
  end; {while}
end;


constructor TCellBlock.Create(AParent: TCellBlock);
begin
  inherited Create;
  FParent:= AParent;
  ToDoQ[-1]:= $ffffffff;
  ToDoP[-1]:= $ffffffff;
end;

destructor TCellBlock.Destroy;
begin
  inherited Destroy;
end;


procedure TCell.SetPixel(x: Integer; y: Integer; InP: Boolean = true);
var
  Mask: cardinal;
  Offset: Integer;
begin
  //0,0 is the topleft pixel, no correction for p,q fase.
  x:= x mod 8;
  y:= y mod 4;
  Offset:= x * 4 + y;
  Mask:= 1 shl Offset;
  if (InP) then p:= p or Mask else q:= q or Mask;
end;

procedure TCellBlock.SetPixel(x: Integer; y: Integer);
var
  GridX, GridY: integer;
  x1,y1: integer;
  i: integer;
begin
  x:= x + (GridSizeX div 2) * CellSizeX;
  y:= y + (GridSizeY div 2) * CellSizeY;
  if Odd(Generation) then begin
    Dec(x); Dec(y);
    QCount:= 0;
  end
  else PCount:= 0;
  GridX:= x div CellSizeX;
  GridY:= y div CellSizeY;
  if (GridX in [0..GridMaxX]) and (GridY in [0..GridMaxY]) then begin
    Grid[GridX,GridY].SetPixel(x,y);
    i:= 0;
    for x1:= 1 to GridMaxX-1 do begin
      for y1:= 1 to GridMaxY-1 do begin
        case Odd(Generation) of
          false: begin
            ToDoP[i]:= (x1 shl 16 or y1);
            Inc(PCount);
          end;
          true: begin
            ToDoQ[i]:= (x1 shl 16 or y1);
            Inc(QCount);
          end;
        end; {case}
        Inc(i);
      end; {for y}
    end; {for x}
  end; {if}
end;

//GeneratePtoQ
//This procedure generates the Q data and QState-flags
//using the P-data and PState-flags.

procedure TCellBlock.Generate;
begin
  if Odd(Generation) then GenerateQtoP
  else GeneratePtoQ;
  Inc(Generation);
end;

const
  MaskS =  $cccccccc;
  MaskE =  $ff000000;
  MaskSE = $cc000000;

procedure TCellBlock.GeneratePtoQ;
var
  x,y: integer;
  i: integer;
  Change: cardinal;
  ToDoA: TToDoList;
  ToDoB: TToDoList;
  A, B: integer;
  done: boolean;
  Address: cardinal;
begin
  i:= 0;
  A:= 0; B:= 0;
  ToDoA[-1]:= $ffffffff;
  ToDoB[-1]:= $ffffffff;
  while (i < PCount) do begin
    y:= ToDoP[i] and $FFFF;
    x:= ToDoP[i] shr 16;
    Inc(i);
    if (x = GridMaxX) or (y = GridMaxY) then continue; //Skip the loop.
    Change:= Grid[x,y].GeneratePtoQ;
    if (Change <> 0) then begin
      Address:= (x shl 16 or y);
      if ToDoA[A-1] <> Address then begin
        ToDoA[A]:= Address; Inc(A);
      end;
      if (Change and MaskS) <> 0 then begin
        ToDoA[A]:= Address + 1;
        Inc(A);
      end; {if S changed}
      if ((Change and MaskE) <> 0) then begin
        Address:= Address + (1 shl 16);
        if ToDoB[B-1] <> Address then begin
          ToDoB[B]:= Address;
          Inc(B);
        end;
        if ((Change and MaskSE) <> 0) then begin
          ToDoB[B]:= Address + 1;
          Inc(B);
        end; {if SE changed}
      end; {if E changed}
    end; {if whole cell changed}
  end; {while}
  ToDoA[A]:= $ffffffff;
  ToDoB[B]:= $ffffffff;
  ToDoB[B+1]:= $ffffffff;


  a:= 0; b:= 0; QCount:= 0;
  Done:= (ToDoA[a] = $ffffffff) and (ToDoB[b] = $ffffffff);
  while not done do begin
    if ToDoA[a] <= ToDoB[b] then begin
      ToDoQ[QCount]:= ToDoA[a]; inc(a); inc(QCount);
      if ToDoA[a] = ToDoB[b] then inc(b);
    end
    else begin
      ToDoQ[QCount]:= ToDoB[b]; inc(b); inc(QCount);
    end;
    Done:= (ToDoA[a] = $ffffffff) and (ToDoB[b] = $ffffffff);
  end; {while}
end;

const
  MaskN =  $33333333;
  MaskW =  $000000ff;
  MaskNW = $00000033;

procedure TCellBlock.GenerateQtoP;
var
  x,y: integer;
  i: integer;
  Change: cardinal;
  ToDoA: TToDoList;
  ToDoB: TToDoList;
  A, B: integer;
  done: boolean;
  Address: cardinal;
begin
  i:= 0;
  A:= 0; B:= 0;
  ToDoA[-1]:= $ffffffff;
  ToDoB[-1]:= $ffffffff;
  while (i < QCount) do begin
    y:= ToDoQ[i] and $FFFF;
    x:= ToDoQ[i] shr 16;
    Inc(i);
    if (x = 0) or (y = 0) then Continue; //Skip the rest of the loop.
    Change:= Grid[x,y].GenerateQtoP;
    if (Change <> 0) then begin
      Address:= (x shl 16 or y);
      if ToDoA[A-1] <> Address then begin
        ToDoA[A]:= Address; Inc(A);
      end;
      if (Change and MaskN) <> 0 then begin
        ToDoA[A]:= Address - 1;
        Inc(A);
      end; {if N changed}
      if ((Change and MaskW) <> 0) then begin
        Address:= Address - (1 shl 16);
        if ToDoB[B-1] <> Address then begin
          ToDoB[B]:= Address;
          Inc(B);
        end;
        if ((Change and MaskNW) <> 0) then begin
          ToDoB[B]:= Address - 1;
          Inc(B);
        end; {if NW changed}
      end; {if W changed}
    end; {if whole cell changed}
  end; {while}
  ToDoA[A]:= $ffffffff;
  ToDoB[B]:= $ffffffff;
  ToDoB[B+1]:= $ffffffff;

  a:= 0; b:= 0; PCount:= 0;
  Done:= (ToDoA[a] = $ffffffff) and (ToDoB[b] = $ffffffff);
  while not done do begin
    if ToDoA[a] <= ToDoB[b] then begin
      ToDoP[PCount]:= ToDoA[a]; inc(a); inc(PCount);
      if ToDoA[a] = ToDoB[b] then inc(b);
    end
    else begin
      ToDoP[PCount]:= ToDoB[b]; inc(b); inc(PCount);
    end;
    Done:= (ToDoA[a] = $ffffffff) and (ToDoB[b] = $ffffffff);
  end; {while}
end;


(*
procedure TCellBlock.GenerateQtoP;
var
  x,y: integer;
  i: integer;
  Change: cardinal;
begin
  i:= 0;
  for x:= 0 to GridMaxX - 1 do begin
    for y:= 0 to GridMaxY -1 do begin
      if Active[cIndexQ][x, y] then begin
        Active[cIndexQ][x, y]:= false;
        ToDo[i]:= x shl 16 or y;
        Inc(i);
      end; {if}
    end; {for y}
  end; {for x}
  while i > 0 do begin
    Dec(i);
    y:= ToDo[i] and $FFFF;
    x:= ToDo[i] shr 16;
    Change:= Grid[x,y].GenerateQtoP;
    Active[cIndexP][x,y]:= Active[cIndexP][x,y] or (Change <> 0);
    Active[cIndexP][x-1,y-1]:= Active[cIndexP][x-1,y-1] or ((Change and $00000033) <> 0);
    Active[cIndexP][x-1,y]:= Active[cIndexP][x-1,y] or     ((Change and $000000ff) <> 0);
    Active[cIndexP][x,y-1]:= Active[cIndexP][x,y-1] or     ((Change and $33333333) <> 0);
  end; {while}
end; (**)

procedure TCellBlock.UpdateFlagsPtoQ;
begin
  //nog in te vullen.
end;

procedure TCellBlock.UpdateFlagsQtoP;
begin
  //nog in te vullen
end;



function TCell.GeneratePtoQ: cardinal;
var
  NewQ: cardinal;
  Change: cardinal;
const
  Mask1 = $f;
  Mask2 = $ff;
  Mask4 = $ffff;

  Row1Mask = $33333333; //0011-0011-0011-0011-0011-0011-0011-0011
  Row2Mask = $cccccccc; //1100-1100-1100-1100-1100-1100-1100-1100

  function MakeNewBrick(p0,p1,p2,p3: cardinal): cardinal; inline;
  var
    Row1, Row2: cardinal;
  begin
    //Generate new Brick using a 2x2 grid of bricks ordered like:
    //p0 p1
    //p2 p3


    //First row inside P0
    if (p0 <> 0) then Row1:=
      LookupTable[p0 and $ffff] or
      LookupTable[(p0 shr 8) and $ffff] shl 8 or
      LookupTable[(p0 shr 16)] shl 16 or
      LookupTable[(p0 shr 24) or (p1 and $ff) shl 8] shl 24
    else Row1:= LookupTable[(p1 and $ff) shl 8] shl 24;
 (**)
    p0:= ((p0 and $cccccccc)) or ((p2 and $33333333));
    p1:= ((p1 and $cc)) or ((p3 and $33));
    if (p0 <> 0) then Row2:=
      LookupTable[p0 and $ffff] or
      LookupTable[(p0 shr 8) and $ffff] shl 8 or
      LookupTable[(p0 shr 16)] shl 16 or
      LookupTable[(p0 shr 24) or ((p1 and $ff) shl 8)] shl 24
    else Row2:= LookupTable[(p1 and $ff) shl 8] shl 24;

    Result:= (Row1 and Row1Mask) or (Row2 and Row2Mask);
  end;

begin
  NewQ:= MakeNewBrick(Self.p, PGrid(@Self)^[1,0].p, PGrid(@Self)^[0,1].p, PGrid(@Self)^[1,1].p);
  Result:= NewQ xor q;
  q:= NewQ;
end;



function TCell.GenerateQtoP: cardinal;
var
  Offset: integer;
  NewP: cardinal;
  Change: cardinal;

const
  Row1Mask = $33333333; //0011-0011-0011-0011-0011-0011-0011-0011
  Row2Mask = $cccccccc; //1100-1100-1100-1100-1100-1100-1100-1100

  function MakeNewBrick(q0,q1,q2,q3: cardinal): cardinal; inline;
  var
    Row1, Row2: cardinal;
  begin
    //Generate new Brick using a 2x2 grid of bricks ordered like:
    //q3 q2
    //q1 q0

    if (q0 <> 0) then Row1:=
      LookupTable[(q0 shr 16)] shl 26 or
      LookupTable[(q0 shr 8 ) and $ffff] shl 18 or
      LookupTable[(q0       ) and $ffff] shl 10 or
      LookupTable[((q0 and $ff) shl 8) or (q1 shr 24)] shl 2
    else Row1:= LookupTable[(q1 shr 24)] shl 2;

(*
    q0:= ((q0 and $33333333) shl 2) or ((q2 and $cccccccc) shr 2);
    q1:= ((q1 and $33000000) shl 2) or ((q3 and $cc000000) shr 2);
    if (q0 <> 0) then Row2:=
      LookupTable[(q0 shr 16) and $ffff] shl 24 or
      LookupTable[(q0 shr 8) and $ffff] shl 16 or
      LookupTable[(q0      ) and $ffff] shl 8 or
      LookupTable[((q0 and $ff) shl 8) or (q1 shr 24)]
    else Row2:= LookupTable[(q1 shr 24)];
(**)

    q0:= ((q0 and $33333333)) or ((q2 and $cccccccc));
    q1:= ((q1 and $33000000)) or ((q3 and $cc000000));
    if (q0 <> 0) then Row2:=
      LookupTable[(q0 shr 16)] shl 22 or
      LookupTable[(q0 shr 8) and $ffff] shl 14 or
      LookupTable[(q0      ) and $ffff] shl 6 or
      LookupTable[((q0 and $ff) shl 8) or (q1 shr 24)] shr 2
    else Row2:= LookupTable[(q1 shr 24)] shr 2;

    Result:= (Row1 and Row2Mask) or (Row2 and Row1Mask);
  end;



begin
  Offset:= -1;
  NewP:= MakeNewBrick(Self.q, PGrid(@Self)^[Offset,0].q, PGrid(@Self)^[0,Offset].q, PGrid(@Self)^[Offset, Offset].q);
  Result:= NewP xor P;
  P:= NewP;
end;

procedure TCellBlock.Display(ACanvas: TCanvas);
var
  GridX,GridY: integer;
  //Offset: integer;

  procedure DisplayCell(ACell: TCell);
  var
    x,y,x1,y1: integer;
    Row, Mask: integer;
    DoPixel: boolean;
    Offset: integer;
    DrawOffset: integer;
    InP: boolean;
  begin
    DrawOffset:= (Generation and 1);
    InP:= not(Odd(Generation));

    for y:= 0 to CellMaxY do begin
      for x:= 0 to CellMaxX do begin
        //if (x = 0) or (y = 0) then ACanvas.Pixels[GridX*16+x+Offset,GridY*16+y+Offset]:= clBtnFace;
        //0,0 is the topleft pixel, no correction for p,q fase.
        x1:= x mod 8;
        y1:= y mod 4;
        Offset:= x1 * 4 + y1;
        Mask:= 1 shl Offset;
        if (InP) then DoPixel:= (ACell.p and Mask) <> 0
        else DoPixel:= (ACell.q and Mask) <> 0;

        if DoPixel then ACanvas.Pixels[GridX*CellSizeX+x+DrawOffset, GridY*CellSizeY+y+DrawOffset]:= clBlack;
      end; {for x}
    end; {for y}
  end;  (**)


begin
  ACanvas.Rectangle(-1,-1,1000,1000);
  FillChar(Form1.BitmapData, SizeOf(Form1.BitmapData), #0);
  for GridY:= 0 to GridMaxY do begin
    for GridX:= 0 to GridMaxX do begin
      if Int64(Grid[GridX, GridY]) <> 0 then begin
        DisplayCell(Grid[GridX,GridY]);
      end;
    end;
  end;
end;

//--------------------------------------
//A Parent is every layer above the ground level
//the tree grows from the bottom up.
//A new parent is placed on top of the last one and
//always has one and only one child to start with, from there
//the tree grows down again.

constructor TCellParent.CreateFromChild(AChild: TCellBlock; ChildX: Integer; ChildY: Integer);
begin
  inherited Create(nil);
end;

constructor TCellParent.CreateFromParent(AParent: TCellParent);
begin
  inherited Create(AParent);
end;

destructor TCellParent.Destroy;
begin
  inherited Destroy;
end;

procedure TCellParent.GeneratePtoQ;
begin
end;

procedure TCellParent.GenerateQtoP;
begin
end;

//The bitmap for the lookup table is as follows:
// 0  2  4  6
//   +----+
// 1 |3  5| 7
// 8 |A  C| E
//   +----+
// 9  B  D  F
// The inner 2x2 cells are looked up.
// so 0241358AC make up bit 3 etc.

procedure TForm1.InitLookupTable;
const
  //Masks for normal order.
  MaskNW = $0757;   //0000-0111-0101-0111
  MaskSW = $0EAE;   //0000-1110-1010-1110
  MaskNE = $7570;   //0111-0101-0111-0000
  MaskSE = $EAE0;   //1110-1010-1110-0000

  //Bitlocations for normal order
  BitNW = $0020; //0000-0000-0010-0000
  BitSW = $0040; //0000-0000-0100-0000
  BitNE = $0200; //0000-0020-0000-0000
  BitSE = $0400; //0000-0100-0000-0000


  //Lookup table also has a shifted order. here the bottom half of the N word
  //and the top half of the south word combine.
  //Like so:
  // 2 6 A E
  // 3 7 B F
  // 0 4 8 C
  // 1 5 9 D

  //Mask for split order.
  Mask2NW = $0D5D;     // 0000-1101-0101-1101
  Mask2SW = $0BAB;     // 0000-1011-1010-1011
  Mask2NE = $D5D0;     // 1101-0101-1101-0000
  Mask2SE = $BAB0;     // 1011-1010-1011-0000

  //Bitlocations for split order
  Bit2NW = $0080; // 0000-0000-1000-0000
  Bit2SW = $0010; // 0000-0000-0001-0000
  Bit2NE = $0800; // 0000-1000-0000-0000
  Bit2SE = $0100; // 0000-0001-0000-0000

  ResultNW = $01;
  ResultSW = $02;
  ResultNE = $10;
  ResultSE = $20;

  Result2NW = $04;
  Result2SW = $08;
  Result2NE = $40;
  Result2SE = $80;

var
  i: integer;
  iNW, iNE, iSW, iSE: cardinal;
  Count: integer;
  ResultByte: byte;

  function GetCount(a: integer): integer;
  var
    c: integer;
  begin
    Result:= 0;
    for c:= 0 to 15 do begin
      if Odd(a shr c) then Inc(Result);
    end; {for c}
  end; {GetCount}

begin
  //Fill the normal lookup.
  for i:= 0 to $ffff do begin
    ResultByte:= 0;
    iNW:= i and MaskNW;
    Count:= GetCount(iNW);
    case Count of   //count excluding bit itself
      3: ResultByte:= ResultNW;
      2: if ((i and BitNW) <> 0) then ResultByte:= ResultNW;
    end;
    iSW:= i and MaskSW;
    Count:= GetCount(iSW);
    case Count of
      3: ResultByte:= ResultByte or ResultSW;
      2: if ((i and BitSW) <> 0) then ResultByte:= ResultByte or ResultSW;
    end;
    iNE:= i and MaskNE;
    Count:= GetCount(iNE);
    case Count of
      3: ResultByte:= ResultByte or ResultNE;
      2: if ((i and BitNE) <> 0) then ResultByte:= ResultByte or ResultNE;
    end;
    iSE:= i and MaskSE;
    Count:= GetCount(iSE);
    case Count of
      3: ResultByte:= ResultByte or ResultSE;
      2: if ((i and BitSE) <> 0) then ResultByte:= ResultByte or ResultSE;
    end;
    LookupTable[i]:= ResultByte;
  end; {for i}


  //Fill the shifted lookup.
  for i:= 0 to $ffff do begin
    ResultByte:= 0;
    iNW:= i and Mask2NW;
    Count:= GetCount(iNW);
    case Count of   //count excluding bit itself
      3: ResultByte:= Result2NW;
      2: if ((i and Bit2NW) <> 0) then ResultByte:= Result2NW;
    end;
    iSW:= i and Mask2SW;
    Count:= GetCount(iSW);
    case Count of
      3: ResultByte:= ResultByte or Result2SW;
      2: if ((i and Bit2SW) <> 0) then ResultByte:= ResultByte or Result2SW;
    end;
    iNE:= i and Mask2NE;
    Count:= GetCount(iNE);
    case Count of
      3: ResultByte:= ResultByte or Result2NE;
      2: if ((i and Bit2NE) <> 0) then ResultByte:= ResultByte or Result2NE;
    end;
    iSE:= i and Mask2SE;
    Count:= GetCount(iSE);
    case Count of
      3: ResultByte:= ResultByte or Result2SE;
      2: if ((i and Bit2SE) <> 0) then ResultByte:= ResultByte or Result2SE;
    end;
    LookupTable[i]:= LookupTable[i] or ResultByte;
  end; {for i}  (**)
end;

procedure TForm1.RestartScreen;
begin
  MyBlock.Free;
  MyBlock:= TCellBlock.Create(nil);
  //MyBlock.SetPixel(5,7);
  //MyBlock.SetPixel(6,7);
  //MyBlock.SetPixel(7,7);
  //MyBlock.SetPixel(7,6);
  //MyBlock.SetPixel(6,5);

  MyBlock.SetPixel(10,0);
  MyBlock.SetPixel(11,0);
  MyBlock.SetPixel(9,1);
  MyBlock.SetPixel(10,1);
  MyBlock.SetPixel(10,2);
end;

procedure TForm1.Timer1Timer(Sender: TObject);
begin
  if Assigned(MyBlock) then begin
    MyBlock.Generate;
    MyBlock.Display(Image1.Canvas);
  end;
end;

procedure TForm1.ToolButton4Click(Sender: TObject);
begin
  if Assigned(MyBlock) then begin
    MyBlock.Generate;
    MyBlock.Display(Image1.Canvas);
  end;
end;

procedure TForm1.FileNew1Execute(Sender: TObject);
begin
  InitLookupTable;
  FillChar(BitmapData, SizeOf(BitmapData), #0);
  MyBitmap:= TBitmap.Create;
  MyBitmap.SetSize(1024,1024);
  MyBitmap.PixelFormat:= pf1bit;
  MyBitmap.Monochrome:= true;
  //MyBitmap.Handle:= CreateBitmap(1000,1000,1,2,nil);

  Generation:= 0;

  RestartScreen;
  MyBlock.Display(Image1.Canvas);
  //if (Sender = FileNew1) then Timer1.Enabled:= not(Timer1.Enabled);

end;

procedure TForm1.FileOpen1Execute(Sender: TObject);
var
  i,a: integer;
  start, eind: int64;
  Diff: double;
  LowDiff: double;
begin
  LowDiff:= MaxInt;
  for a:= 0 to 10 do begin
    FileNew1Execute(Sender);
    GetCPUTicks(start);
    for i:= 0 to 1000 do begin
      MyBlock.Generate;
    end;
    GetCPUTicks(eind);
    //Label1.Caption:= IntToStr(Eind - Start);
    Diff:= Eind - start;
    LowDiff:= Min(Diff, LowDiff);

    Label1.Caption:= Format('%10.0n',[lowdiff]) + ' CPU cycles per 1,000 generations';
    Clipboard.AsText:= Label1.Caption;
  end; {for a}
  MyBlock.Display(Image1.Canvas);
end;


procedure TForm1.FileSave1Execute(Sender: TObject);
begin
  Timer1.Enabled:= not(Timer1.Enabled);
end;

procedure TForm1.FileExit1Execute(Sender: TObject);
begin
  Close;
end;


initialization
  Generation:= 0;

end.

由于大小限制,Stackoverflow不允许我发布表单文件,但我希望你可以不用管理。