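I'm having trouble parsing CSV from a text file and was wondering if you could help. Here is what I have so far.
The CSV file (DATA.txt) looks like this. It always has 15 fields, all separated by commas. Not all fields are mandatory, so some are populated and some are blank.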
Seattle,Lastname,Firstname,DOB,SEX,etc,etc
Seattle,Lastname,Firstname,DOB,,etc,etc
Portland,Lastname,Firstname,DOB,SEX,,,etc
Portland,Lastname,Firstname,DOB,SEX,etc,etc
Here is my REXX code:
SOURCEFILE = "C:\DATA\DATA.TXT"
IF A=2 THEN DO COUNTER=1 TO LINES(SOURCEFILE)
PARSE VALUE LINEIN(SOURCEFILE) WITH CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
CALL SETCURSOR 4,23
CALL CREATEDATA
END
CREATEDATA:
CALL TYPE CITY
CALL PRESS TAB
CALL TYPE LAST_NAME
CALL PRESS TAB
CALL TYPE DATE(U)
CALL PRESS TAB
CALL TYPE FIRST_NAME
CALL PRESS TAB
CALL PRESS ENTER
RETURN
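I'm not sure whether I should use ARG or VAR when parsing, or whether I wrote the first two lines correctly. I know my CREATEDATA function works, because it types "CITY" rather than the parsed value. Any help would be greatly appreciated. Thanks!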
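Answer 0 (score: 1)
Some comments:
1) On Windows systems, Lines(SourceFile) may involve reading the entire file to count the CR-LF sequences. Your Parse value LineIn(SourceFile) loop then reads it a second time. The typical Rexx approach is: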
Address SYSTEM 'TYPE' SourceFile with output stem Lines.
Do Counter = 1 to Lines.0
Parse var Lines.Counter ...
End
Drop Lines.
At least, as long as the file isn't so large that holding it in an array costs too much memory.
2) You fall through into CreateData at the end of the loop, which is why you see "CITY". You need an Exit or Return instruction after the End.
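3) Given #2, it's clear that the Parse is never executed, because City is still uninitialized (the value of an uninitialized variable in Rexx is its own name in uppercase). The loop is conditional on A=2, which evidently isn't the case.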
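Points 2 and 3 can be seen in isolation with a small sketch. The routine name ShowCity and the sample value are made up for illustration; SYMBOL is the standard built-in that reports whether a name currently has a value.

```rexx
/* Sketch: uninitialized variables, and stopping before a label */
say symbol('CITY')     /* LIT: CITY has not been assigned yet   */
say city               /* displays CITY, the name in uppercase  */

city = 'Seattle'       /* illustrative value                    */
say symbol('CITY')     /* VAR: CITY now holds a value           */

call ShowCity
exit                   /* without this, control would run on    */
                       /* into ShowCity a second time           */

ShowCity:              /* hypothetical routine name             */
  say 'City is' city
  return
```

If the EXIT is removed, "City is Seattle" is displayed twice: once from the CALL and once from falling through into the label, which is the same effect the question describes.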
Answer 1 (score: 1)
One problem is the "if A=2 then" in
IF A=2 THEN DO COUNTER=1 TO LINES(SOURCEFILE)
If A != 2, the loop is bypassed. I suspect your program should be:
SOURCEFILE = "C:\DATA\DATA.TXT"
DO COUNTER=1 TO LINES(SOURCEFILE)
PARSE VALUE LINEIN(SOURCEFILE) WITH CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
CALL SETCURSOR 4,23
CALL CREATEDATA
END
RETURN /* prevent the fall through to createdata */
CREATEDATA:
---------------------------
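The parse statement has the following basic format
parse [source] [parse-control]
where [source] includes
arg - the arguments of the procedure call
pull - data pulled from the stack
var - data taken from a variable
value ... with - data supplied inline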
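To make the sources a little more concrete, here is a minimal sketch that applies comma templates to one record through both PARSE VAR and PARSE VALUE ... WITH. The record is a made-up line in the style of DATA.txt with the SEX field left blank, and the variable name rest is only illustrative.

```rexx
/* Sketch: PARSE VAR versus PARSE VALUE ... WITH on one record */
line = 'Seattle,Lastname,Firstname,DOB,,etc'   /* hypothetical record */

/* PARSE VAR takes its data from a variable */
parse var line city ',' last_name ',' first_name ',' dob ',' sex ',' rest
say 'city=[' || city || '] sex=[' || sex || '] rest=[' || rest || ']'
/* displays: city=[Seattle] sex=[] rest=[etc] */

/* PARSE VALUE evaluates an expression; WITH introduces the template. */
/* A period is a placeholder that discards the field it matches.      */
parse value line with city ',' . ',' . ',' dob ',' rest
say 'dob=[' || dob || '] rest=[' || rest || ']'
/* displays: dob=[DOB] rest=[,etc] */
```

A blank field simply yields an empty string, so the optional fields in the file need no special handling. The last variable in a template receives everything that is left, including any remaining commas.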
So your parse could be done like
linein = LINEIN(SOURCEFILE)
PARSE var linein CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
or
DO COUNTER=1 TO LINES(SOURCEFILE)
CALL SETCURSOR 4,23
CALL CREATEDATA LINEIN(SOURCEFILE)
END
RETURN /* prevent the fall through to createdata */
CREATEDATA:
parse arg CITY "," LAST_NAME "," FIRST_NAME "," MOM_NAME "," MIDDLE_NAME "," DAD_NAME "," DOB "," etc "," etc "," etc "," etc "," SEX "," etc "," etc
Finally, as Ross says, you should try to remove Lines(SourceFile), since it involves reading the whole file.
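One way to drop the up-front count is to test for remaining data on each pass instead. This is only a sketch: the path is the one from the question, the shortened template and the name record are illustrative, and whether LINES with no second argument counts the remaining lines or merely reports that some remain varies between interpreters, so it is worth checking the documentation for yours.

```rexx
/* Sketch: read the file record by record, no line count up front */
sourcefile = 'C:\DATA\DATA.TXT'        /* path taken from the question   */

do while lines(sourcefile) > 0         /* true while unread lines remain */
  record = linein(sourcefile)          /* read the next record           */
  parse var record city ',' last_name ',' first_name ',' rest
  say city '/' last_name '/' first_name
end
exit                                   /* stop before any routines below */
```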