Question

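I would like to know whether there is a quick way to have Fortran go through the rows of a matrix and determine whether n of the terms in a row are equal. I could not find a similar question, nor any help online.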

Answer 1

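Assuming we are dealing with a matrix of integers, this can be done at a cost of O(N³), where N is the dimension of the matrix. Essentially, for every row you need to compare each element against every other element in that row, which takes O(N²) comparisons per row and therefore O(N³) operations for all N rows. You will probably have to write it yourself, but that is no big deal: go over the rows and check each one separately for an element that appears n times.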

! dim is the matrix dimension (N in the text); Fortran is case-insensitive,
! so it cannot be called N while n holds the required number of occurrences
integer :: M(dim, dim)          ! matrix to check
integer :: n, i, j, k, counter  ! n: required number of occurrences
logical :: appears              ! flag if a value appears n times

appears = .false.
rows: do i = 1, dim    ! loop over the rows
  do j = 1, dim        ! loop over the entries
    counter = 0
    do k = 1, dim
      ! count how often M(i, j) occurs in row i, itself included
      ! exact implementation depends on type of M
      if(M(i, k) == M(i, j)) counter = counter + 1
    end do
    ! check if this element appears exactly n times
    ! (use counter >= n to accept "n times or more")
    if(counter == n) then
      appears = .true.
      exit rows
    end if
  end do
end do rows

You may have to adapt this to your needs, but this is how you could do it.
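
The same check can be wrapped in a function so that it works for a matrix of any shape. This is only a sketch under assumptions: the module name row_checks and the function name any_row_has_value_n_times are hypothetical, and the inner counting loop is replaced by the intrinsic count, which changes the notation but not the O(N³) cost.

```fortran
module row_checks
  implicit none
contains
  ! .true. if, in at least one row of mat, some value occurs exactly n times
  pure logical function any_row_has_value_n_times(mat, n) result(found)
    integer, intent(in) :: mat(:, :), n
    integer :: i, j
    found = .false.
    do i = 1, size(mat, 1)          ! loop over the rows
      do j = 1, size(mat, 2)        ! loop over the entries of row i
        ! count the occurrences of mat(i, j) in row i, itself included
        if (count(mat(i, :) == mat(i, j)) == n) then
          found = .true.
          return
        end if
      end do
    end do
  end function any_row_has_value_n_times
end module row_checks
```

Replacing == n with >= n in the test turns "exactly n times" into "at least n times".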

Answer 2

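I would like to know whether there is a quick way to have Fortran go through the rows of a matrix and determine whether n of the terms in a row are equal.

As far as I understand your question, this is what you want:

A function with signature (integer(:), integer) -> logical
The function receives a one-dimensional array line and checks whether any value appears at least n times in the array
The function is not expected to report which or how many such values there are, their positions, or the exact number of repetitions

There are many ways to achieve this. "Which one is the most efficient?" That depends on the specific conditions of your data, system, compiler and so on. To illustrate this, I present 3 different solutions. All of them give the correct answer, of course. I suggest you test each of them (or any other you come up with) against a sample of your real data.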

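Naive solution #1 - Good ol' do loops

This is the default algorithm. It traverses line and stores each value in the aggregator list packed, which keeps every distinct value found so far together with the number of times it has appeared. The moment any value reaches n repetitions, the function returns .true.. If no value has reached n repetitions and there is no longer any chance of satisfying the predicate, it returns .false..

I say default because it is the minimal linear (single-pass) algorithm based on good ol' do loops, as far as I understand. If you have zero information about the nature of your data, your system, or even the programming language, it is probably the best choice for the general case. The aggregator lets the function terminate as soon as the condition is met, but at the price of an additional list traversal (over the aggregator's length) for every element. If the data contains many distinct values and n is large, the aggregator grows long and the lookup can become an expensive operation. Besides, there is little room for parallelism, vectorization and other optimizations.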

! generic approach, with loops and aggregator
pure logical function has_at_least_n_repeated(line, n)
  integer, intent(in) :: line(:), n
  integer :: i, j, max_repetitions, qty_distincts
  ! packed(1,:) -> the distinct integers found so far
  ! packed(2,:) -> number of repetitions of each distinct integer so far
  integer :: packed(2, size(line) - n + 2)

  ! n > size(line) also covers an empty line and keeps packed(:, 1) in bounds
  if(n < 1 .or. n > size(line)) then
    has_at_least_n_repeated = .false.
  else if(n == 1) then
    has_at_least_n_repeated = .true.
  else
    packed(:, 1) = [line(1), 1]
    qty_distincts = 1
    max_repetitions = 1
    i = 1
    ! iterate until there aren't enough elements left to reach n repetitions
    outer: do, while(i - max_repetitions <= size(line) - n)
      i = i + 1
      ! test for a match on packed
      do j = 1, qty_distincts
        if(packed(1, j) == line(i)) then
          packed(2, j) = packed(2, j) + 1
          if(packed(2, j) == n) then
            has_at_least_n_repeated = .true.
            return
          end if
          max_repetitions = max(max_repetitions, packed(2, j))
          cycle outer
        end if
      end do
      ! add to packed
      qty_distincts = qty_distincts + 1
      packed(:, qty_distincts) = [line(i), 1]
    end do outer
    has_at_least_n_repeated = .false.
  end if
end

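Naive solution #2 - Trying some vectorization

This approach tries to take advantage of the array-oriented nature of Fortran and the fast implementations of its intrinsic functions. Instead of an inner do loop, it calls the intrinsic count with an array argument, which allows the compiler to do some vectorization. Also, if you have any tool for parallelization, or you know how to use coarrays (and your compiler supports them), this approach lends itself to using them.

The drawback is that the function scans all the elements, even those whose value has already appeared before. The approach is therefore better suited to data with many distinct possible values and few repetitions. That said, it would be easy to add a cache list of past values and test it with the intrinsic any, passing the cache as a whole.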

! alternative approach, intrinsic functions without cache
pure logical function has_at_least_n_repeated(line, n)
  integer, intent(in) :: line(:), n
  Integer :: i

  if(n < 1 .or. size(line) == 0) then
    has_at_least_n_repeated = .false.
  else if(n == 1) then
    has_at_least_n_repeated = .true.
  else
    ! iterate until there aren't enough elements left to reach n repetitions
    do i = 1, size(line) - n + 1
      if(count(line(i + 1:) == line(i)) + 1 >= n) then
        has_at_least_n_repeated = .true.
        return
      end if
    end do
    has_at_least_n_repeated = .false.
  end if
end
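
The cached variant mentioned above might look like the sketch below. It rests on one simplifying assumption of mine: instead of a separate cache list, the already visited prefix line(:i - 1) serves as the cache and is passed to any as a whole. The module name repeated_cached and the function name has_at_least_n_repeated_cached are hypothetical.

```fortran
module repeated_cached
  implicit none
contains
  ! variant of solution #2 that skips values already examined
  pure logical function has_at_least_n_repeated_cached(line, n) result(found)
    integer, intent(in) :: line(:), n
    integer :: i

    found = .false.
    if (n < 1 .or. n > size(line)) return
    ! stop once too few elements are left to reach n repetitions
    do i = 1, size(line) - n + 1
      ! the prefix line(:i - 1) acts as the cache of past values
      if (any(line(:i - 1) == line(i))) cycle
      ! first occurrence of this value: count it from here to the end
      if (count(line(i:) == line(i)) >= n) then
        found = .true.
        return
      end if
    end do
  end function has_at_least_n_repeated_cached
end module repeated_cached
```

Skipping is safe because a value that occurs at least n times must have its first occurrence at an index no greater than size(line) - n + 1, and first occurrences are never skipped. Whether the extra any call pays off depends on how many repeated values the data contains, so it should be measured like the other variants.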

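Naive solution #3 - Functional style

This is my favorite (by personal criteria). I like functional languages, and I enjoy borrowing some of their ideas in imperative languages. This approach delegates the computation to an internal auxiliary recursive function. There are no do loops here. On each call, only a section of line is passed as the argument: a shorter array containing only the values that have not been checked so far. No cache is needed either.

To be honest, Fortran's support for recursion is far from great: tail-call optimization is not guaranteed, compilers usually impose low call-stack limits, and recursion prevents many automatic optimizations. Even so, the algorithm is clever, I love how it looks, and I would not discard it before running some tests and comparisons.

Note: Fortran does not allow an internal procedure (one placed in the contains section of a main program) to contain nested procedures of its own. For the code to work as shown, you need to put the function in a module or submodule, or make it an external function. Another option is to extract the nested function and make it an ordinary function in the same scope.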

! functional approach, auxiliar recursive function and no loops
pure logical function has_at_least_n_repeated(line, n)
  integer, intent(in) :: line(:), n

  if(n < 1 .or. size(line) == 0) then
    has_at_least_n_repeated = .false.
  else if(n == 1) then
    has_at_least_n_repeated = .true.
  else
    has_at_least_n_repeated = aux(line)
  end if

contains
  ! on each iteration removes all entries of an element from array
  pure recursive function aux(section) result(out)
    integer, intent(in) :: section(:)
    logical :: out, mask(size(section))
    integer :: left
    mask = section /= section(1)
    left = count(mask)
    if(size(section) - left >= n) then
      out = .true.
    else if(n > left) then
      out = .false.
    else
      out = aux(pack(section, mask))
    end if
  end
end
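
The question is about the rows of a matrix, while the functions above take a rank-1 array. The sketch below shows one way they could be packaged in a module, as the note requires, and applied row by row. The module name repeats and the function rows_with_repeats are hypothetical, and the body of has_at_least_n_repeated is a condensed form of solution #2 so that the example is self-contained; any of the three versions could take its place.

```fortran
module repeats
  implicit none
  private
  public :: has_at_least_n_repeated, rows_with_repeats
contains
  ! solution #2, condensed: does any value occur at least n times in line?
  pure logical function has_at_least_n_repeated(line, n) result(found)
    integer, intent(in) :: line(:), n
    integer :: i
    found = .false.
    if (n < 1) return
    do i = 1, size(line) - n + 1
      if (count(line(i:) == line(i)) >= n) then
        found = .true.
        return
      end if
    end do
  end function has_at_least_n_repeated

  ! res(i) is .true. if some value occurs at least n times in row i of mat
  pure function rows_with_repeats(mat, n) result(res)
    integer, intent(in) :: mat(:, :), n
    logical :: res(size(mat, 1))
    integer :: i
    do i = 1, size(mat, 1)
      ! mat(i, :) is a strided array section passed as an assumed-shape array
      res(i) = has_at_least_n_repeated(mat(i, :), n)
    end do
  end function rows_with_repeats
end module repeats
```

A caller would write use repeats and then, for example, any(rows_with_repeats(mat, n)) to learn whether at least one row qualifies. Since Fortran stores arrays column by column, rows are not contiguous in memory; for large matrices it may be worth storing the data transposed and scanning columns instead.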

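Conclusion

Run your tests, then choose the path to follow! I have shared my personal feelings about each approach and its implications, but it would be great if some of the Fortran gurus on this site joined the discussion and provided accurate information and criticism.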

Answer 3

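This is a pragmatic alternative to the solutions already provided by @RodrigoRodrigues. In the absence of any good evidence (the question is seriously underspecified) that we need to worry about asymptotic complexity and all that good stuff, here is a simple, straightforward function that took me about 5 minutes to design, code and test.

The function accepts a rank-1 array of integers and returns a rank-1 array of integers in which each element is the number of times the corresponding element of the input array occurs in it. If that description confuses you, bear with me and read the fairly simple code: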

FUNCTION get_counts(arr) RESULT(rslt)
  INTEGER, DIMENSION(:), INTENT(in) :: arr
  INTEGER, DIMENSION(SIZE(arr)) :: rslt
  INTEGER :: ix
  DO ix = 1, SIZE(arr)
     rslt(ix) = COUNT(arr(ix)==arr)
  END DO
END FUNCTION get_counts

For the input array [1,1,2,3,4,1,5] it returns [3,3,1,1,1,3,1]. If OP wants to build on this to see whether any value occurs n times, OP can write

any(get_counts(rank_1_integer_array)==n)

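If OP also cares about which elements occur n times, it is quite simple to use the result of get_counts to refer back to the original array and extract them.

This solution is pragmatic in the sense that it is economical with my time rather than the computer's. It is somewhat wasteful of space, which might be a problem for very large input arrays. Any of Rodrigo's solutions may well outperform mine in time and space.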
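
One possible way to do that extraction is the intrinsic pack with the counts as a mask. This is a sketch, not part of the original answer: the module name counts and the function values_occurring_n_times are hypothetical, and get_counts is repeated (with pure added) only to keep the example self-contained.

```fortran
module counts
  implicit none
contains
  ! get_counts as given in this answer
  pure function get_counts(arr) result(rslt)
    integer, intent(in) :: arr(:)
    integer :: rslt(size(arr))
    integer :: ix
    do ix = 1, size(arr)
      rslt(ix) = count(arr(ix) == arr)
    end do
  end function get_counts

  ! all elements of arr whose value occurs exactly n times (duplicates kept)
  pure function values_occurring_n_times(arr, n) result(vals)
    integer, intent(in) :: arr(:), n
    integer, allocatable :: vals(:)
    ! the mask selects the positions whose count equals n
    vals = pack(arr, get_counts(arr) == n)
  end function values_occurring_n_times
end module counts
```

For the input [1,1,2,3,4,1,5] and n = 3 this yields [1,1,1], and for n = 1 it yields [2,3,4,5]. Packing the index vector [(i, i = 1, size(arr))] with the same mask would give the positions instead of the values.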

Answer 4

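I take the question to mean that we need to determine whether any value is repeated at least n times within a row. To find out, I chose to sort a copy of each row with qsort from the C standard library; after that it is easy to find the length of each run of equal values.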

module sortrow
   use ISO_C_BINDING
   implicit none
   interface
      subroutine qsort(base, num, size, compar) bind(C,name='qsort')
         import
         implicit none
         integer base(*)
         integer(C_SIZE_T), value :: num, size
         procedure(icompar) compar
      end subroutine qsort
   end interface
   contains
      function icompar(p1, p2) bind(C)
         integer(C_INT) icompar
         integer p1, p2
         select case(p1-p2)
            case(:-1)
               icompar = -1
            case(0)
               icompar = 0
            case(1:)
               icompar = 1
         end select
      end function icompar
end module sortrow

program main
   use sortrow
   implicit none
   integer, parameter :: M = 3, N = 10
   integer i, j
   integer array(M,N)
   real harvest
   integer, allocatable :: row(:)
   integer current, maxMatch

   call random_seed
   do i = 1, M
      do j = 1, N
         call random_number(harvest)
         array(i,j) = harvest*3
      end do
   end do
   do i = 1, M
      row = array(i,:)
      call qsort(row, int(N,C_SIZE_T), C_SIZEOF(array(1,1)), icompar)
      maxMatch = 1   ! a single element already forms a run of length 1
      current = 1
      do j = 2, N
         if(row(j) == row(j-1)) then
            current = current+1
         else
            current = 1
         end if
         maxMatch = max(maxMatch,current)
      end do
      write(*,'(*(g0:1x))') array(i,:),'maxMatch =',maxMatch
   end do
end program main

Sample run:

0 0 0 2 0 2 1 1 1 0 maxMatch = 5
2 1 2 1 0 1 2 1 2 0 maxMatch = 4
0 0 2 2 2 2 2 0 1 1 maxMatch = 5
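
If calling into C is not wanted, the same sort-then-scan idea can be sketched in plain Fortran. Note that this swaps qsort for a simple insertion sort, which is my substitution and not part of the answer above; it is quadratic and only sensible for short rows. The module name run_length and the function max_repetitions are hypothetical.

```fortran
module run_length
  implicit none
contains
  ! largest number of equal values in row, found by sorting a copy
  pure integer function max_repetitions(row) result(best)
    integer, intent(in) :: row(:)
    integer :: sorted(size(row))
    integer :: i, j, key, current

    ! insertion sort of a copy (O(size**2), stands in for qsort)
    sorted = row
    do i = 2, size(sorted)
      key = sorted(i)
      j = i - 1
      do while (j >= 1)
        if (sorted(j) <= key) exit
        sorted(j + 1) = sorted(j)
        j = j - 1
      end do
      sorted(j + 1) = key
    end do

    ! equal values are now adjacent: track the longest run
    best = min(1, size(sorted))   ! 0 for an empty row, otherwise at least 1
    current = 1
    do i = 2, size(sorted)
      if (sorted(i) == sorted(i - 1)) then
        current = current + 1
      else
        current = 1
      end if
      best = max(best, current)
    end do
  end function max_repetitions
end module run_length
```

The test "does some value occur at least n times in row i" then becomes max_repetitions(array(i, :)) >= n. Applied to the three rows of the sample run, the function gives 5, 4 and 5, matching the maxMatch values printed there.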
