Question

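I want to distribute traffic across nodes according to each node's configured share. There can be at most 100 nodes, and the percentage of traffic assigned to each node is configurable.

So, if there are 4 nodes: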

node 1 - 20
node 2 - 50
node 3 - 10
node 4 - 20
------------
sum    - 100
------------

The values of all nodes should sum to 100. Example:

node 1 - 50
node 2 - 1
node 3 - 1
node 4 - 1
.
.
.
node 51 - 1

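In the above configuration there are 51 nodes in total. Node 1 is set to 50 and the remaining 50 nodes are set to 1.

In one scenario, requests could be distributed as follows: node 1, node 2, node 3, node 4, node 5, ..., node 51, node 1, node 1, node 1, node 1, node 1, node 1, node 1, ...

This distribution is inefficient because we keep sending too much traffic to node 1 in a row, which may cause node 1 to reject requests.

In another scenario, requests could be distributed as follows: node 1, node 2, node 1, node 3, node 1, node 4, node 1, node 5, node 1, node 6, node 1, node 7, node 1, node 8, ...

In this second scenario the requests are served more efficiently.

I found the following code but cannot understand the idea behind it.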

func()
{
  for(int itr=1;itr<=total_requests+1;itr++)
  {
      myval = 0;           
      // Search the node that needs to be incremented
       // to best approach the rates of all branches                      
      for(int j=0;j<Total_nodes;j++)
      {

         if((nodes[j].count*100/itr > nodes[j].value) ||
           ((nodes[j].value - nodes[j].count*100/itr) < myval) ||
           ((nodes[j].value==0 && nodes[j].count ==0 )))
              continue;

            cand = j;
            myval = abs((long)(nodes[j].count*100/itr - nodes[j].value));
       }
       nodes[cand].count++;

  }

  return nodes[cand].nodeID;
}

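In the above code, total_requests is the total number of requests received so far. The total_requests variable is incremented on every request; treat it as a global value for the purpose of understanding the code.

Total_nodes is the total number of configured nodes, and each node is represented by the following structure.

node is a structure: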

struct node{
  int count;
  int value;
  int nodeID;
};

For example:

If 4 nodes are configured :-
node 1 - 20
node 2 - 50
node 3 - 10
node 4 - 20
------------
sum    - 100
------------

An array nodes[4] will be created with the following values:

node1{
  count = 0;
  value = 20;
  nodeID = 1;
};

node2{
  count = 0;
  value = 50;
  nodeID = 2;
};

node3{
  count = 0;
  value = 10;
  nodeID = 3;
};

node4{
  count = 0;
  value = 20;
  nodeID = 4;
};

Could you please explain the algorithm, or the idea behind how it distributes requests efficiently?

Answer 1

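nodes[j].count*100/itr is the floor of the percentage of requests that node j has answered so far. nodes[j].value is the percentage of requests that node j should answer. The code you posted looks for the node that is lagging furthest behind its target percentage (more or less, subject to the wobbliness of integer division) and assigns the next request to it.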
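
As a rough illustration of that idea (not the posted function itself), here is a minimal C sketch that picks the node with the largest shortfall. The name pick_lagging_node and the scaled deficit comparison are my own assumptions: it compares value*total against count*100 rather than dividing, which sidesteps the integer-division truncation.

```c
#include <assert.h>
#include <stddef.h>

struct node {
  int count;   /* requests sent to this node so far */
  int value;   /* target share of the traffic, in percent */
  int nodeID;
};

/* Pick the node that is furthest below its target share, count the
 * request against it and return its index.  `total` is the number of
 * requests dispatched so far, including the one being assigned now.
 *
 * deficit = value*total - count*100 is (target% - actual%) scaled by
 * `total`, so nodes can be compared without any division.
 */
size_t pick_lagging_node(struct node nodes[], size_t n_nodes, long total)
{
  size_t best = 0;
  long   best_deficit = 0;
  int    found = 0;

  assert(n_nodes > 0 && total > 0);

  for (size_t j = 0; j < n_nodes; ++j) {
    long deficit;

    if (nodes[j].value == 0)
      continue;                     /* this node takes no traffic */

    deficit = (long)nodes[j].value * total - (long)nodes[j].count * 100;

    if (!found || deficit > best_deficit) {
      best = j;                     /* ties go to the lowest index */
      best_deficit = deficit;
      found = 1;
    }
  }

  assert(found);                    /* at least one non-zero share */
  nodes[best].count++;
  return best;
}
```

Calling it with total = 1, 2, 3, ... for the 20/50/10/20 example gives each node exactly its configured number of requests out of every 100, and a heavily weighted node is interleaved with the others rather than being hit in one long run.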

Answer 2

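Hmmm. It seems that when you reach 100 nodes, each must take 1% of the traffic?

Honestly, I have no idea what the function you provided is doing. I assume it is trying to find the node which is furthest from its long-term average load. But if total_requests is the total to date, then I don't get what the outer for(int itr=1;itr<=total_requests+1;itr++) loop is doing, unless that is actually part of some test to show how it distributes the load?

Anyway, essentially what you are doing is similar to constructing a non-uniform random sequence. With up to 100 nodes, if I can assume (for the moment) that 0..999 gives sufficient resolution, then you could use an "id_vector[]" holding 1000 node-ids, filled with n1 copies of node-1's ID, n2 copies of node-2's ID, and so on, where node-1 is to receive n1/1000 of the traffic, and so on. The decision process is then very, very simple: pick id_vector[random() % 1000]. Over time, the nodes will receive about the right amount of traffic.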
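
A minimal sketch of that id_vector idea, under my own assumptions: the names ID_SLOTS, fill_id_vector and pick_random_node are illustrative, node IDs are taken to be 1..n in weight order, and the standard rand() stands in for random(). The modulo introduces a small bias, which is probably negligible at this resolution.

```c
#include <assert.h>
#include <stdlib.h>

enum { ID_SLOTS = 1000 };   /* resolution assumed in the text above */

/* Fill id_vector with weights[i] copies of node id i+1.
 * The weights are expected to add up to ID_SLOTS.
 * Returns the number of slots actually filled.
 */
int fill_id_vector(const int weights[], int n_nodes, int id_vector[ID_SLOTS])
{
  int slot = 0;

  for (int i = 0; i < n_nodes; ++i)
    for (int k = 0; k < weights[i]; ++k) {
      assert(slot < ID_SLOTS);      /* weights must not exceed the slots */
      id_vector[slot++] = i + 1;    /* node ids are 1 origin */
    }

  return slot;
}

/* Choose a node id at random, in proportion to its weight. */
int pick_random_node(const int id_vector[ID_SLOTS])
{
  return id_vector[rand() % ID_SLOTS];
}
```

The caller would check that fill_id_vector() returns ID_SLOTS before using the vector, since any unfilled slot would hold garbage.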

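If you are unhappy with distributing traffic randomly, then you can seed the id_vector with node-ids so that you can select "round-robin" and still get a suitable frequency for each node. One way to do that would be to randomly shuffle the id_vector constructed as above (and, perhaps, reshuffle it occasionally, so that if one shuffle is "bad", you aren't stuck with it). Or you could do a one-time leaky-bucket run and fill the id_vector from that. Each time around the id_vector, this guarantees that each node will receive its allotted number of requests.

The finer-grained you make the id_vector, the better control you get over the short-term frequency of requests to each node.

Note that all of the above assumes that the relative load for the nodes is constant. If not, then you will need to adjust the id_vector (every now and then?).

Edit to add more detail as requested...

...suppose we have just 5 nodes, but we express the "weight" of each one as n/1000, to allow for up to 100 nodes. Suppose they have IDs 1..5, and weights: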

  ID=1, weight = 100
  ID=2, weight = 375
  ID=3, weight = 225
  ID=4, weight = 195
  ID=5, weight = 105

which, obviously, add up to 1000.

So we construct an id_vector[1000] so that:

  id_vector[  0.. 99] = 1   -- first 100 entries = ID 1
  id_vector[100..474] = 2   -- next  375 entries = ID 2
  id_vector[475..699] = 3   -- next  225 entries = ID 3
  id_vector[700..894] = 4   -- next  195 entries = ID 4
  id_vector[895..999] = 5   -- last  105 entries = ID 5

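Now, if we shuffle id_vector[] we get a random sequence of node selections which, over every 1000 requests, gives exactly the right average frequency of requests to each node.

For amusement value, I had a go at a "leaky bucket", to see how well it could maintain a steady frequency of requests to each node, by filling the id_vector using one leaky bucket per node. The code to do this, and to check how well it does and how well the simple random version does, is included below.

Each leaky bucket has a count, cc, of the number of requests which should be sent (to other nodes) before the next request is sent to this one. Each time a request is dispatched, all buckets have their cc count decremented, and the node whose bucket has the smallest cc (or the smallest id, if the cc are equal) is sent the request, at which point that node's bucket cc is recharged. (So each request causes all buckets to drip once, and the bucket for the chosen node is recharged.)

The cc is the integer part of the bucket's "contents". The initial value for cc is q = 1000 / w, where w is the node's weight, and each time the bucket is recharged, q is added to cc. To do things exactly, however, we need to deal with the remainder r = 1000 % w... or, to put it another way, the "contents" have a fractional part, which is where cr comes in. The true value of the contents is cc + cr/w (where cr/w is a true fraction, not an integer division). Its initial value is cc = q and cr = r. Each time the bucket is recharged, q is added to cc and r is added to cr. When cr/w >= 1/2 we round up, so cc += 1 and cr -= w (adding one to the integer part is balanced by subtracting 1, i.e. w/w, from the fractional part). To test for cr/w >= 1/2 the code actually tests (cr * 2) >= w. Hopefully the bucket_recharge() function will make sense (now).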
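
To make the rounding concrete, here is a small stand-alone sketch of the recharge rule just described, with drips left out so that only the recharges show. The names struct bucket, recharge and total_after are illustrative assumptions of mine and are separate from the full listing.

```c
#include <assert.h>

struct bucket { int cc, cr, q, r, w; };

/* Apply the recharge rule once and return how much was added to cc
 * (either q or q + 1).
 */
int recharge(struct bucket* b)
{
  int before = b->cc;

  b->cc += b->q;
  b->cr += b->r;

  if (b->cr * 2 >= b->w) {      /* cr/w >= 1/2: round up */
    b->cc += 1;
    b->cr -= b->w;
  }

  return b->cc - before;
}

/* Sum of the amounts added to cc over `n` recharges of a fresh bucket
 * of weight `w`, at weight resolution `res` (no drips in between).
 */
int total_after(int w, int res, int n)
{
  struct bucket b = { 0, 0, res / w, res % w, w };
  int total = 0;

  assert(w > 0 && w <= res);

  for (int i = 0; i < n; ++i)
    total += recharge(&b);

  return total;
}
```

For w = 375 at resolution 1000, q = 2 and r = 250, so successive recharges add 3, 2, 3 and then repeat. That is 8 per three recharges, matching 3 * 1000/375, and after 375 recharges the total is exactly 1000 with cr back at 0.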

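The leaky bucket is run 1000 times to fill the id_vector[]. A little testing shows that this maintains a pretty steady frequency for all nodes, and an exact number of packets per node each time around the id_vector[] cycle.

A little testing also shows that the random() shuffle approach has a much more variable frequency within each id_vector[] cycle, but still delivers an exact number of packets per node for each cycle.

The steadiness of the leaky bucket assumes a steady stream of incoming requests, which may be an entirely unrealistic assumption. If the requests arrive in large bursts (large compared to the id_vector[] cycle, 1000 in this example), then the variability of the (simple) random() shuffle approach may be dwarfed by the variability in request arrival!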

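#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>     /* random() is POSIX, not ISO C */

enum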
{
  n_nodes  =    5,        /* number of nodes      */
  w_res    = 1000,        /* weight resolution    */
} ;

struct node_bucket
{
  int   id ;            /* 1 origin                 */

  int   cc ;            /* current count            */
  int   cr ;            /* current remainder        */

  int   q ;             /* recharge -- quotient     */
  int   r ;             /* recharge -- remainder    */

  int   w ;             /* weight                   */
} ;

static void bucket_recharge(struct node_bucket* b) ;
static void node_checkout(int weights[], int id_vector[], bool rnd) ;
static void node_shuffle(int id_vector[]) ;

/*------------------------------------------------------------------------------
 * To begin at the beginning...
 */
int
main(int argc, char* argv[])
{
  int node_weights[n_nodes] = { 100, 375, 225, 195, 105 } ;
  int id_vector[w_res] ;
  int cx ;

  struct node_bucket buckets[n_nodes] ;

  /* Initialise the buckets -- charged
   */
  cx = 0 ;
  for (int id = 0 ; id < n_nodes ; ++id)
    {
      struct node_bucket* b ;

      b = &buckets[id] ;

      b->id = id + 1 ;              /* 1 origin     */
      b->w  = node_weights[id] ;

      cx += b->w ;

      b->q  = w_res / b->w ;
      b->r  = w_res % b->w ;

      b->cc = 0 ;
      b->cr = 0 ;

      bucket_recharge(b) ;
    } ;

  assert(cx == w_res) ;

  /* Run the buckets for one cycle to fill the id_vector
   */
  for (int i = 0 ; i < w_res ; ++i)
    {
      int id ;

      id = 0 ;
      buckets[id].cc -= 1 ;         /* drip     */

      for (int jd = 1 ; jd < n_nodes ; ++jd)
        {
          buckets[jd].cc -= 1 ;     /* drip     */

          if (buckets[jd].cc < buckets[id].cc)
            id = jd ;
        } ;

      id_vector[i] = id + 1 ;       /* '1' origin   */

      bucket_recharge(&buckets[id]) ;
    } ;

  /* Diagnostics and checking
   *
   * First, check that the id_vector contains exactly the right number of
   * each node, and that the bucket state at the end is the same (apart from
   * cr) as it is at the beginning.
   */
  int nf[n_nodes] = { 0 } ;

  for (int i = 0 ; i < w_res ; ++i)
    nf[id_vector[i] - 1] += 1 ;

  for (int id = 0 ; id < n_nodes ; ++id)
    {
      struct node_bucket* b ;

      b = &buckets[id] ;

      printf("ID=%2d weight=%3d freq=%3d  (cc=%3d  cr=%+4d  q=%3d  r=%3d)\n",
                                b->id, b->w, nf[id], b->cc, b->cr, b->q, b->r) ;
    } ;

  node_checkout(node_weights, id_vector, false /* not random */) ;

  /* Try the random version -- with shuffled id_vector.
   */
  int iv ;

  iv = 0 ;
  for (int id = 0 ; id < n_nodes ; ++id)
    {
      for (int i = 0 ; i < node_weights[id] ; ++i)
        id_vector[iv++] = id + 1 ;
    } ;
  assert(iv == w_res) ;

  for (int s = 0 ; s < 17 ; ++s)
    node_shuffle(id_vector) ;

  node_checkout(node_weights, id_vector, true /* random */) ;

  return 0 ;
} ;

static void
bucket_recharge(struct node_bucket* b)
{
  b->cc += b->q ;
  b->cr += b->r ;

  if ((b->cr * 2) >= b->w)
    {
      b->cc += 1 ;
      b->cr -= b->w ;
    } ;
} ;

static void
node_checkout(int weights[], int id_vector[], bool rnd)
{
  struct node_test
  {
    int   last_t ;
    int   count ;
    int   cycle_count ;
    int   intervals[w_res] ;
  } ;

  struct node_test tests[n_nodes] = { { 0 } } ;

  printf("\n---Test Run: %s ---\n", rnd ? "Random Shuffle" : "Leaky Bucket") ;

  /* Test run
   */
  int s ;
  s = 0 ;
  for (int t = 1 ; t <= (w_res * 5) ; ++t)
    {
      int id ;

      id = id_vector[s++] - 1 ;

      if (tests[id].last_t != 0)
        tests[id].intervals[t - tests[id].last_t] += 1 ;

      tests[id].count += 1 ;
      tests[id].last_t = t ;

      if (s == w_res)
        {
          printf("At time %4d\n", t) ;

          for (id = 0 ; id < n_nodes ; ++id)
            {
              struct node_test*   nt ;
              long   total_intervals ;

              nt = &tests[id] ;

              total_intervals = 0 ;
              for (int i = 0 ; i < w_res ; ++i)
                total_intervals += (long)i * nt->intervals[i] ;

              printf("  ID=%2d weight=%3d count=%4d(+%3d)  av=%6.2f vs %6.2f\n",
                        id+1, weights[id], nt->count, nt->count - nt->cycle_count,
                                          (double)total_intervals / nt->count,
                                          (double)w_res / weights[id]) ;
              nt->cycle_count = nt->count ;

              for (int i = 0 ; i < w_res ; ++i)
                {
                  if (nt->intervals[i] != 0)
                    {
                      int h ;

                      printf("  %6d x %4d ", i, nt->intervals[i]) ;

                      h = ((nt->intervals[i] * 75) + ((nt->count + 1) / 2))/
                                                                     nt->count ;
                      while (h-- != 0)
                        printf("=") ;
                      printf("\n") ;
                    } ;
                } ;
            } ;

          if (rnd)
            node_shuffle(id_vector) ;

          s = 0 ;
        } ;
    } ;
} ;

static void
node_shuffle(int id_vector[])
{
  for (int iv = 0 ; iv < (w_res - 1) ; ++iv)
    {
      int is, s ;

      is = (int)(random() % (w_res - iv)) + iv ;

      s             = id_vector[iv] ;
      id_vector[iv] = id_vector[is] ;
      id_vector[is] = s ;
    } ;
} ;

Algorithm to send requests to nodes efficiently