Question

I have a file.txt that looks like the following:

1. 0. 3.21
1. 1. 2.11
1. 2. 1.554
1. 0. 3.21
1. 3. 1.111
1. 2. 1.554

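As you can see, it contains two pairs of identical lines (the first and the fourth, and the third and the sixth). I am trying to remove the duplicates to obtain something like: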

1. 0. 3.21
1. 1. 2.11
1. 2. 1.554
1. 3. 1.111

The Fortran program I wrote to do this is:

    program mean
      implicit none
      integer :: i,j,n,s,units
      REAL*8,allocatable::  x(:),y(:),amp(:)

      ! open the file I want to change

      OPEN(UNIT=10,FILE='oldfile.dat')
      n=0
      DO
        READ(10,*,END=100)
        n=n+1
      END DO

    100 continue
      rewind(10)
      allocate(x(n),y(n),amp(n))
      s=0

      ! save the numbers from the file in three different vectors

      do s=1, n
        read(10,*) x(s), y(s),amp(s)
      end do
      !---------------------!

      ! Open the file that should contains the new data without repetition
      units=107
      open(unit=units,file='newfile.dat')

      ! THIS SHOULD WRITE ONLY NOT EQUAL ELEMENTS of THE oldfile.dat:
      ! scan the elements in the third column and write only the elements for which
      ! the if statement is true, namely: write only the elements (x,y,amp) that have
      ! different values in the third column.

      do i=1,n
        do j = i+1,n
          if (amp(i) .ne. amp(j)) then !
            write(units,*),x(j),y(j),amp(j)
          end if
        end do
      end do
    end program

But the output file looks like this:

   1.000000       1.000000       2.110000    
   1.000000       2.000000       1.554000    
   1.000000       3.000000       1.111000    
   1.000000       2.000000       1.554000    
   1.000000       2.000000       1.554000    
   1.000000      0.0000000E+00   3.210000    
   1.000000       3.000000       1.111000    
   1.000000       2.000000       1.554000    
   1.000000      0.0000000E+00   3.210000    
   1.000000       3.000000       1.111000    
   1.000000       3.000000       1.111000    
   1.000000       2.000000       1.554000    
   1.000000       2.000000       1.554000

I don't understand what is wrong with the if condition. Could you help me?

Thanks a lot!

Answer 1

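I wouldn't fix your approach; I would abandon it altogether. What you have is an O(n^2) algorithm. That is fine for a small number of lines, but at 10^5 lines you would execute the if statement about 0.5 * 10^10 times. Fortran is fast, but that is needlessly wasteful.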

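I would first sort the file (O(n log n)), then scan it once (O(n)) and drop the duplicates, which end up on adjacent lines after sorting. I probably wouldn't use Fortran for the sorting; I'd use one of the Linux utilities, such as sort. Then I'd probably use uniq and end up doing no Fortran programming at all.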

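If you want to write out the deduplicated file in its original order, I would prepend a line number to each line, then sort, uniquify, and re-sort by line number.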

I think recent versions of Windows (the ones that ship with PowerShell) have equivalent commands.

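If I absolutely had to do all of this in Fortran, I would write a sort routine (or rather, pull one out of my bag of tricks) and go on from there. I would be inclined to read the lines as strings and sort them textually rather than mess about with reals and their tricky notion of equality. For 10^5 lines I would read the whole file into an array, sort it into another array, and proceed from there.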
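A minimal sketch of that all-Fortran route, under some assumptions that are not in the original post: lines are at most 256 characters long (the `maxlen` constant is hypothetical), the file names are the `oldfile.dat` and `newfile.dat` from the question, and the sort routine is a plain top-down merge sort written for this illustration. It uses `newunit=`, so it needs a Fortran 2008 compiler.

```fortran
module text_sort
  implicit none
contains
  ! Top-down merge sort of character lines, O(n log n) comparisons.
  recursive subroutine merge_sort(a, tmp, lo, hi)
    character(len=*), intent(inout) :: a(:), tmp(:)
    integer, intent(in) :: lo, hi
    integer :: mid, i, j, k
    if (lo >= hi) return
    mid = (lo + hi) / 2
    call merge_sort(a, tmp, lo, mid)
    call merge_sort(a, tmp, mid + 1, hi)
    ! Merge the two sorted halves a(lo:mid) and a(mid+1:hi) into tmp.
    i = lo
    j = mid + 1
    do k = lo, hi
      if (i > mid) then
        tmp(k) = a(j); j = j + 1
      else if (j > hi) then
        tmp(k) = a(i); i = i + 1
      else if (a(j) < a(i)) then
        tmp(k) = a(j); j = j + 1
      else
        tmp(k) = a(i); i = i + 1
      end if
    end do
    a(lo:hi) = tmp(lo:hi)
  end subroutine merge_sort
end module text_sort

program dedup_sorted
  use text_sort
  implicit none
  integer, parameter :: maxlen = 256   ! assumed maximum line length
  character(len=maxlen), allocatable :: lines(:), tmp(:)
  character(len=maxlen) :: buf
  integer :: n, i, ios, uin, uout

  ! First pass: count the lines.
  open(newunit=uin, file='oldfile.dat', status='old', action='read')
  n = 0
  do
    read(uin, '(A)', iostat=ios) buf
    if (ios /= 0) exit
    n = n + 1
  end do
  rewind(uin)

  ! Second pass: keep every line as text.
  allocate(lines(n), tmp(n))
  do i = 1, n
    read(uin, '(A)') lines(i)
  end do
  close(uin)

  call merge_sort(lines, tmp, 1, n)

  ! After sorting, duplicates are adjacent: skip a line equal to its predecessor.
  open(newunit=uout, file='newfile.dat', status='replace', action='write')
  do i = 1, n
    if (i > 1) then
      if (lines(i) == lines(i-1)) cycle
    end if
    write(uout, '(A)') trim(lines(i))
  end do
  close(uout)
end program dedup_sorted
```

Note that the result comes out in sorted order rather than the original order, and that the comparison is purely textual: `3.21` and `3.210` would count as different lines, as would lines that differ only in leading whitespace.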

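Finally, I think the logic of your if statement is unsound. It decides whether to write a line to the new file based only on the equality (or inequality) of the third field, amp. It should surely consider all three fields on lines i and j, something more like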

if ( any( [ x(i)/=x(j), y(i)/=y(j), amp(i)/=amp(j) ] ) ) then

Answer 2

Just to fix the brute-force loop, it should be something like this:

do i=1,n
  j=1
  do while( j.lt.i.and.amp(i) .ne. amp(j))
    j=j+1
  enddo
  if(j.eq.i)write(units,*)x(i),y(i),amp(i)
end do

or

do i=1,n
  do j=1,i-1
   if ( amp(i) .eq. amp(j) ) exit
  enddo
  if(j.eq.i)write(units,*)x(i),y(i),amp(i)
end do
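For reference, here is a self-contained sketch that combines the second loop above with the three-field comparison suggested in Answer 1. It is an illustration rather than code from either answer, and it assumes the input file has no blank lines, since those would be counted in the first pass but skipped when the values are read. The file names are the ones from the question, and `newunit=` requires a Fortran 2008 compiler.

```fortran
program dedup_bruteforce
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), allocatable :: x(:), y(:), amp(:)
  integer :: n, i, j, ios, uin, uout

  ! First pass: count the records.
  open(newunit=uin, file='oldfile.dat', status='old', action='read')
  n = 0
  do
    read(uin, *, iostat=ios)
    if (ios /= 0) exit
    n = n + 1
  end do
  rewind(uin)

  ! Second pass: read the three columns.
  allocate(x(n), y(n), amp(n))
  do i = 1, n
    read(uin, *) x(i), y(i), amp(i)
  end do
  close(uin)

  ! Write row i only if no earlier row j < i matches it in all three fields.
  open(newunit=uout, file='newfile.dat', status='replace', action='write')
  do i = 1, n
    do j = 1, i - 1
      if (x(i) == x(j) .and. y(i) == y(j) .and. amp(i) == amp(j)) exit
    end do
    ! If the inner loop ran to completion (no match), j ends up equal to i.
    if (j == i) write(uout, *) x(i), y(i), amp(i)
  end do
  close(uout)
end program dedup_bruteforce
```

This keeps the first occurrence of each row and preserves the original order, at the O(n^2) cost discussed in Answer 1. Exact comparison of reals is acceptable here only because identical text in the file is read into identical values; it would not be a safe test for computed quantities.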

How to eliminate equal lines in a file
