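The program below generates random data according to some specs (the example here has 2 columns).
It works for a few hundred thousand rows on my PC (this presumably depends on RAM). I need to scale to tens of millions of rows.
How can I optimize the program to write directly to disk? Also, how can I "cache" the execution of the parse rule, since it always repeats the same pattern 50 million times?
Note: to use the program below, just call generate-blocks and then save-blocks; the output is db.txt.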
Rebol[]
specs: [
    [3 digits 4 digits 4 letters]
    [2 letters 2 digits]
]
;====================================================================================================================
digits: charset "0123456789"
letters: charset "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
separator: charset ";"
block-letters: [A B C D E F G H I J K L M N O P Q R S T U V W X Y Z]
blocks: copy []
generate-row: func [] [
    foreach spec specs [
        rule: [
            any [
                [
                    set times integer! [
                        [
                            'digits (
                                repeat n times [
                                    ; (random 10) - 1 yields 0..9; random 9 never yields 0
                                    block: rejoin [block (random 10) - 1]
                                ]
                            )
                            |
                            'letters (
                                repeat n times [
                                    ; random 26 covers A..Z; random 24 stopped at X
                                    block: rejoin [block to-string pick block-letters random 26]
                                ]
                            )
                        ]
                        |
                        [
                            'letters (
                                repeat n times [
                                    block: rejoin [block to-string pick block-letters random 26]
                                ]
                            )
                            |
                            'digits (
                                repeat n times [block: rejoin [block (random 10) - 1]]
                            )
                        ]
                    ]
                    |
                    {"} any separator {"}
                ]
            ]
            to end
        ]
        block: copy ""
        parse spec rule
        append blocks block
    ]
]
generate-blocks: func[m][
    repeat num m [
        generate-row
    ]
]
quote: func[string][
    rejoin [{"} string {"}]
]
save-blocks: func[file][
    if exists? to-rebol-file file [
        answer: ask rejoin ["delete " file "? (Y/N): "]
        if (answer = "Y") [
            delete %db.txt
        ]
    ]
    foreach [field1 field2] blocks [
        write/lines/append %db.txt rejoin [quote field1 ";" quote field2]
    ]
]
Answer 0 (score: 2)
Use open with the /direct and /lines refinements to write straight to the file without buffering the content:
file: open/direct/lines/write %myfile.txt
loop 1000 [
    t: random "abcdefghi"
    append file t
]
close file
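This writes 1000 random lines without buffering. You can also prepare a block of lines (say 10000 lines) and then write the whole block to the file at once, which is faster than writing line by line.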
file: open/direct/lines/write %myfile.txt
loop 100 [
    b: copy []
    ; copy so that each entry is its own string (random shuffles its argument in place)
    loop 1000 [append b random copy "abcdef"]
    append file b
]
close file
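This is much faster: 100000 lines take less than a second. Hope this helps.
Note that you can change the numbers 100 and 1000 to suit the memory you have available, and use b: make block! 1000 instead of b: copy [], which will be faster.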
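To tie this back to the question's program, here is a rough sketch that combines the chunked direct writing above with a precomputed "plan" for each spec, so the spec is not re-parsed for every row. It is not the asker's code. The names compile-spec, make-field and save-rows are hypothetical helpers introduced only for illustration, and the sketch assumes every spec is a flat list of count/kind pairs such as [3 digits 4 letters].

```rebol
Rebol []

; Assumption: each spec is a flat list of count/kind pairs, as in the question.
specs: [
    [3 digits 4 digits 4 letters]
    [2 letters 2 digits]
]

digit-chars: "0123456789"
letter-chars: "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

; Hypothetical helper: translate a spec into count/character-set pairs, once.
compile-spec: func [spec [block!] /local plan] [
    plan: make block! length? spec
    foreach [count kind] spec [
        append plan count
        append plan either kind = 'digits [digit-chars] [letter-chars]
    ]
    plan
]

; Hypothetical helper: build one field from a precomputed plan.
make-field: func [plan [block!] /local out] [
    out: make string! 16
    foreach [count chars] plan [
        loop count [append out pick chars random length? chars]
    ]
    out
]

; Hypothetical helper: write `total` rows to `file`, `chunk` rows at a time.
save-rows: func [
    file [file!] total [integer!] chunk [integer!]
    /local plans port buffer row
] [
    ; Compile every spec a single time, before the main loop.
    plans: make block! length? specs
    foreach spec specs [append/only plans compile-spec spec]

    port: open/direct/lines/write file
    buffer: make block! chunk          ; preallocated, reused for every chunk
    loop total [
        row: make string! 32
        foreach plan plans [
            if not empty? row [append row ";"]
            append row rejoin [{"} make-field plan {"}]
        ]
        append buffer row
        if chunk <= length? buffer [
            append port buffer         ; flush a full chunk to disk
            clear buffer
        ]
    ]
    if not empty? buffer [append port buffer]   ; flush the remainder
    close port
]

save-rows %db.txt 100000 1000
```

Only one chunk of rows is held in memory at any time, so the row count is no longer tied to RAM. The chunk size (1000 here) is a guess and would need tuning on the target machine.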