I have an ASCII file that looks like this:
____________________________________________
Header1 ...
Header2 ...
Header3 ...
block(1)data1 block(1)data2 block(1)data3
block(1)data4 block(1)data5 block(1)data6
block(2)data1 block(2)data2 block(2)data3
block(2)data4 block(2)data5 block(2)data6
...
block(n)data1 block(n)data2 block(n)data3
block(n)data4 block(n)data5 block(n)data6
____________________________________________
I want to convert it into an ASCII file that looks like this:
____________________________________________
HeaderA ...
HeaderB ...
block(n)data1 block(n)data2 block(n)data3
block(n)data4 block(n)data5 block(n)data6
block(n-1)data1 block(n-1)data2 block(n-1)data3
block(n-1)data4 block(n-1)data5 block(n-1)data6
....
block(1)data1 block(1)data2 block(1)data3
block(1)data4 block(1)data5 block(1)data6
____________________________________________
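The data are mostly real numbers, and the data set is too large to hold in allocatable arrays, so I need some way to read and write on the fly.
I could not find a way to read or write backwards in a file.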
Answer 0 (score: 0)
I would not do this directly in Fortran, but with a series of Linux commands (or Cygwin / GNU utils on Windows). Fortran would work too (see the second possibility).
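Outline (based on OS commands):
1. Count the lines of the file (wc).
2. Copy the header lines (head) to the file "result file".
3. Take the remaining data lines (tail).
4. Use an awk script to join the lines of each block into a single line.
5. Run the result through tac.
6. Use an awk script to split the lines again, appending them to "result file".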
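Another idea (in a programming language):
1. Read through the file once, recording the position at which each block starts (the result of ftell).
2. Go through the recorded positions in reverse order, using fseek to move to each position and copying the block from there.
Answer 1 (score: 0)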
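The brute-force approach: hold everything in memory (the allocatable-array way).
If the data fits into memory, you can do it that way. I tested this with a file like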
header(1)
header(2)
header(3)
block(1).data1 block(1).data2 block(1).data3
block(1).data4 block(1).data5 block(1).data6
block(2).data1 block(2).data2 block(2).data3
block(2).data4 block(2).data5 block(2).data6
...
block(9999998).data1 block(9999998).data2 block(9999998).data3
block(9999998).data4 block(9999998).data5 block(9999998).data6
block(9999999).data1 block(9999999).data2 block(9999999).data3
block(9999999).data4 block(9999999).data5 block(9999999).data6
The file was 1.2 GB, and this awk script reversed it:
#!/usr/bin/awk -f
# If the line contains the word "header", print it immediately and move on to the next line.
/header/ {print; next}
# Store every other line in memory.
{
    line[n++] = $0
}
# When finished, print the stored lines pairwise from the end:
# indices n-2, n-1, then n-4, n-3, and so on.
END {
    for (i = n-2; i >= 0; i -= 2) {
        print line[i]
        print line[i+1]
    }
}
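in under 2 minutes.
If that really is not possible, you need to do what @high-performance-mark said: read the file in manageable chunks, reverse each chunk in memory, and concatenate the chunks at the end. Here is my version: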
program reverse_order
    use iso_fortran_env, only: IOSTAT_END
    implicit none
    integer, parameter :: max_blocks_in_memory = 10000
    integer, parameter :: max_line_length = 100
    character(len=max_line_length) :: line
    character(len=max_line_length) :: data(2, max_blocks_in_memory)
    character(len=*), parameter :: INFILE = 'data.txt'
    character(len=*), parameter :: OUTFILE = 'reversed_data.txt'
    character(len=*), parameter :: TMP_FILE_FORMAT = '("/tmp/", I10.10,".txt")'
    character(len=len("/tmp/XXXXXXXXXX.txt")) :: tmp_file_name
    integer :: in_unit, out_unit, tmp_unit
    integer :: num_headers, i, j, tmp_file_number
    integer :: ios
    ! Open the input and output files.
    open(newunit=in_unit, file=INFILE, action="READ", status='OLD')
    open(newunit=out_unit, file=OUTFILE, action='WRITE', status='REPLACE')
    ! Transfer the headers to the output file immediately.
    num_headers = 0
    do
        read(in_unit, '(A)') line
        if (index(line, 'header') == 0) exit
        num_headers = num_headers + 1
        write(out_unit, '(A)') trim(line)
    end do
    ! We've already read the first data line, so let's rewind and start anew.
    rewind(in_unit)
    ! Move past the headers.
    do i = 1, num_headers
        read(in_unit, *)
    end do
    tmp_file_number = 0
    ! Read the data from the input file, max_blocks_in_memory blocks at a time.
    read_loop : do
        do i = 1, max_blocks_in_memory
            read(in_unit, '(A)', iostat=ios) data(1, i)
            if (ios == IOSTAT_END) then ! Reached the end of the input file.
                if (i > 1) then ! Still have final values in memory, write them
                                ! to output immediately.
                    do j = i-1, 1, -1
                        write(out_unit, '(A)') trim(data(1, j))
                        write(out_unit, '(A)') trim(data(2, j))
                    end do
                end if
                exit read_loop
            end if
            read(in_unit, '(A)') data(2, i)
        end do
        ! Read a full chunk of blocks, write it in reverse order into a temporary file.
        tmp_file_number = tmp_file_number + 1
        write(tmp_file_name, TMP_FILE_FORMAT) tmp_file_number
        open(newunit=tmp_unit, file=tmp_file_name, action="WRITE", status="NEW")
        do j = max_blocks_in_memory, 1, -1
            write(tmp_unit, '(A)') data(1, j)
            write(tmp_unit, '(A)') data(2, j)
        end do
        close(tmp_unit)
    end do read_loop
    ! Finished with input file, don't need it any more.
    close(unit=in_unit)
    ! Concatenate all the temporary files in reverse order to the output file.
    do j = tmp_file_number, 1, -1
        write(tmp_file_name, TMP_FILE_FORMAT) j
        open(newunit=tmp_unit, file=tmp_file_name, action="READ", status="OLD")
        do
            read(tmp_unit, '(A)', iostat=ios) line
            if (ios == IOSTAT_END) exit
            write(out_unit, '(A)') trim(line)
        end do
        close(tmp_unit, status="DELETE") ! Done with this file, delete it after closing.
    end do
    close(unit=out_unit)
end program reverse_order
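The same chunk-and-concatenate scheme can be sketched compactly in Python instead of Fortran. This is only an illustration under stated assumptions: the function name reverse_blocks and its parameters are made up here, the header count and the number of lines per block are passed in rather than detected, and every line (including the last) is assumed to end with a newline. Unlike the Fortran program above, it writes every chunk, including the final partial one, to a temporary file.

```python
import os
import tempfile


def reverse_blocks(in_path, out_path, num_headers=3, lines_per_block=2,
                   max_blocks_in_memory=10000):
    """Copy the headers, then write the blocks in reverse order.

    Only max_blocks_in_memory blocks are held in memory at any time.
    Hypothetical helper, not part of the original answer.
    """
    tmp_paths = []
    with open(in_path) as src, open(out_path, "w") as dst:
        # Headers go straight to the output file.
        for _ in range(num_headers):
            dst.write(src.readline())

        while True:
            # Read up to max_blocks_in_memory blocks; a block is a list of lines.
            chunk = []
            while len(chunk) < max_blocks_in_memory:
                block = [src.readline() for _ in range(lines_per_block)]
                if not block[0]:  # empty string means end of file
                    break
                chunk.append(block)
            if not chunk:
                break

            # Write this chunk to its own temporary file, blocks reversed.
            fd, path = tempfile.mkstemp(suffix=".txt")
            with os.fdopen(fd, "w") as tmp:
                for block in reversed(chunk):
                    tmp.writelines(block)
            tmp_paths.append(path)

        # Concatenate the temporary files, last chunk first, then delete them.
        for path in reversed(tmp_paths):
            with open(path) as tmp:
                for line in tmp:
                    dst.write(line)
            os.remove(path)
```

A call such as reverse_blocks("data.txt", "reversed_data.txt") mirrors the file names used in the Fortran program. The chunk size trades memory against the number of temporary files.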
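Answer 2 (score: 0)
Well, I have an answer, but it did not work, possibly because of compiler bugs or because of my only rudimentary understanding of file positioning in Fortran. My attempt was to open the input file with access = 'stream' and form = 'formatted'. That way I could push the line positions onto a stack and pop them off so that they come out in reverse order. Then, walking through the lines in reverse order, I could write them to the output file.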
program readblk
    implicit none
    integer iunit, junit
    integer i, size
    character(20) line
    type LLnode
        integer POS
        type(LLnode), pointer :: next => NULL()
    end type LLnode
    type(LLNODE), pointer :: list => NULL(), current => NULL()
    integer POS, temp(2)
    open(newunit=iunit,file='readblk.txt',status='old',access='stream',form='formatted')
    open(newunit=junit,file='writeblk.txt',status='replace')
    do i = 1, 3
        do
            read(iunit,'(a)',advance='no',EOR=10,size=size) line
            write(junit,'(a)',advance='no') line
        end do
10      continue
        write(junit,'(a)') line(1:size)
    end do
    do
        inquire(iunit,POS=POS)
        allocate(current)
        current%POS = POS
        current%next => list
        list => current
        read(iunit,'()',end=20)
    end do
20  continue
    current => list
    list => current%next
    deallocate(current)
    do while(associated(list))
        temp(2) = list%POS
        current => list%next
        deallocate(list)
        temp(1) = current%POS
        list => current%next
        deallocate(current)
        do i = 1, 2
            write(*,*) temp(i)
            read(iunit,'(a)',advance='no',EOR=30,size=size,POS=temp(i)) line
            write(junit,'(a)',advance='no') line
            do
                read(iunit,'(a)',advance='no',EOR=30,size=size) line
                write(junit,'(a)',advance='no') line
            end do
30          continue
            write(junit,'(a)') line(1:size)
        end do
    end do
end program readblk
Here is my input file:
Header line 1
Header line 2
Header line 3
1a34567890123456789012345678901234567890
1b34567890123456789012345678901234567890
2a34567890123456789012345678901234567890
2b34567890123456789012345678901234567890
3a34567890123456789012345678901234567890
3b34567890123456789012345678901234567890
Now with ifort, my file positions were printed as
214
256
130
172
44
88
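Note that the position recorded for the first data line (44) is at the end of record 3 rather than at the start of record 4. The output file was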
Header line 1
Header line 2
Header line 3
3a34567890123456789012345678901234567890
3b34567890123456789012345678901234567890
2a34567890123456789012345678901234567890
2b34567890123456789012345678901234567890
1a34567890123456789012345678901234567890
With gfortran, the file positions were printed as
214
256
130
172
46
88
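These gfortran positions are what plain byte arithmetic predicts if the input file uses two-byte CRLF line endings. That line-ending convention is an assumption here, since the post does not state it. A small Python sketch (the function name line_start_positions is made up for illustration) computes the 1-based start position of each line of the sample input:

```python
def line_start_positions(lines, newline_bytes=2):
    """Return the 1-based byte position at which each line starts.

    newline_bytes=2 assumes CRLF line endings; use 1 for LF.
    """
    positions = []
    pos = 1  # Fortran stream positions count from 1
    for text in lines:
        positions.append(pos)
        pos += len(text) + newline_bytes
    return positions


headers = ["Header line 1", "Header line 2", "Header line 3"]
# Rebuild the six 40-character data lines of the sample input.
data = [f"{n}{c}" + "34567890" + "1234567890" * 3
        for n in "123" for c in "ab"]

print(line_start_positions(headers + data)[3:])
# -> [46, 88, 130, 172, 214, 256]
```

With one-byte LF endings the first data line would start at position 43 instead, so neither convention yields the 44 that ifort reported.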
This time the first data line is at the start of record 4, as I expected. However, the output file has unfortunate contents:
Header line 1
Header line 2
Header line 3
3a34567890123456789012345678901234567890
3b34567890123456789012345678901234567890
2a34567890123456789012345678901234567890
2b34567890123456789012345678901234567890
3a34567890123456789012345678901234567890
3b345678901234567890123456789012341a34567890123456789012345678901234567890
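I had hoped for a more positive outcome. I cannot tell whether my results are due to bad programming or to compiler bugs, but I am posting this in case someone else can get my pure Fortran solution to work.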
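For comparison, the record-the-positions idea (the ftell/fseek suggestion from Answer 0, and the position stack attempted here) can be sketched in Python rather than Fortran. This is a sketch under assumptions, not a fix for the program above: the function name reverse_blocks_by_seeking and its parameters are hypothetical, and the header count and lines per block are passed in rather than detected. The files are opened in binary mode so that tell() and seek() work with plain byte offsets and line endings are copied unchanged.

```python
def reverse_blocks_by_seeking(in_path, out_path, num_headers=3, lines_per_block=2):
    """Write the headers, then the blocks in reverse order, by seeking.

    Only the block start offsets are kept in memory, not the data.
    Hypothetical helper for illustration.
    """
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        # Headers are copied through unchanged.
        for _ in range(num_headers):
            dst.write(src.readline())

        # First pass: record the byte offset at which each block starts.
        block_starts = []
        while True:
            pos = src.tell()
            lines = [src.readline() for _ in range(lines_per_block)]
            if not lines[0]:  # empty bytes object means end of file
                break
            block_starts.append(pos)

        # Second pass: visit the offsets last to first and copy each block.
        for pos in reversed(block_starts):
            src.seek(pos)
            for _ in range(lines_per_block):
                dst.write(src.readline())
```

The list of offsets still grows with the number of blocks, although it is far smaller than the data itself. If every block had the same byte length, the offsets could be computed instead of stored.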