如何获取其他行中位于注解关键字以下/上方的数据?我可以注释关键字,但无法获取信息
示例文字:
Underwriter's Name Appraiser's Name Appraisal Company Name
Alice Wheaton Cooper Bruce Banner Stark Industries
代码
TYPESYSTEM utils.PlainTextTypeSystem;
ENGINE utils.PlainTextAnnotator;
EXEC(PlainTextAnnotator, {Line});
ADDRETAINTYPE(WS);
Line{->TRIM(WS)};
REMOVERETAINTYPE(WS);
Document{->FILTERTYPE(SPECIAL)};
DECLARE UnderWriterKeyword, NameKeyword, UnderWriterNameKeyword;
DECLARE UnderWriterName(String label, String value);
CW{REGEXP("\\bUnderwriter") -> UnderWriterKeyword};
CW{REGEXP("Name")->NameKeyword};
(UnderWriterKeyword SW NameKeyword){->UnderWriterNameKeyword};
ADDRETAINTYPE(SPACE);
Line{CONTAINS(UnderWriterNameKeyword)} Line -> {
(CW SPACE)+ {-> MARK(UnderWriterName)};
};
REMOVERETAINTYPE(SPACE)
预期输出:
Underwriter's Name: Alice Wheaton Cooper
Appraiser's Name: Bruce Banner
Appraisal Company Name: Stark Industries
请建议是否可以在RUTA中使用?如果为true,如何获取数据?
答案 0 :(得分:0)
在Ruta中查看inlined rules。
您可以为一行定义条件,并在下一行内注释所需的信息:
Line{CONTAINS(UserName)} Line -> { //your logic here };
答案 1 :(得分:0)
TYPESYSTEM utils.PlainTextTypeSystem;
ENGINE utils.PlainTextAnnotator;
DECLARE Header;
DECLARE ColumnDelimiter;
DECLARE Cell(INT column);
DECLARE Keyword (STRING label);
DECLARE Keyword UnderWriterNameKeyword, AppraiserNameLicenseKeyword,
AppraisalCompanyNameKeyword;
"Underwriter's Name" -> UnderWriterNameKeyword ( "label" = "UnderWriter
Name");
"Appraiser's Name/License" -> AppraiserNameLicenseKeyword ( "label" =
"Appraiser Name");
"Appraisal Company Name" -> AppraisalCompanyNameKeyword ( "label" =
"Appraisal Company Name");
DECLARE Entry(Keyword keyword);
EXEC(PlainTextAnnotator, {Line,Paragraph});
ADDRETAINTYPE(WS);
Line{->TRIM(WS)};
Paragraph{->TRIM(WS)};
SPACE[3,100]{-PARTOF(ColumnDelimiter) -> ColumnDelimiter};
Line -> {ANY+{-PARTOF(Cell),-PARTOF(ColumnDelimiter) -> Cell};};
REMOVERETAINTYPE(WS);
INT index = 0;
BLOCK(structure) Line{}{
ASSIGN(index, 0);
Line{STARTSWITH(Paragraph) -> Header};
c:Cell{-> c.column = index, index = index + 1};
}
Header<-{hc:Cell{hc.column == c.column}<-{k:Keyword;};}
# c:@Cell{-PARTOF(Header) -> e:Entry, e.keyword = k};
DECLARE Entity (STRING label, STRING value);
DECLARE Entity UnderWriterName, AppraiserNameLicense, AppraisalCompanyName;
FOREACH(entry) Entry{}{
entry{ -> CREATE(UnderWriterName, "label" = k.label, "value" =
entry.ct)}<-{k:entry.keyword{PARTOF(UnderWriterNameKeyword)};};
entry{ -> CREATE(AppraiserNameLicense, "label" = k.label, "value" =
entry.ct)}<-{k:entry.keyword{PARTOF(AppraiserNameLicenseKeyword)};};
entry{ -> CREATE(AppraisalCompanyName, "label" = k.label, "value" =
entry.ct)}<-{k:entry.keyword{PARTOF(AppraisalCompanyNameKeyword)};};
}
最重要的规则如下:
Header<-{hc:Cell{hc.column == c.column}<-{k:Keyword;};}
# c:@Cell{-PARTOF(Header) -> e:Entry, e.keyword = k};
它包含三个规则元素Header
,#
和Cell
,并以此方式工作:
Cell
规则元素匹配,因为它已被@
标记为锚点。Cell
批注中或不属于其中的所有Header
批注中匹配。它从满足此条件的第一个Cell
注释开始,并将其称为“ c”。#
,它将匹配直到下一个规则元素能够匹配。Header
注释的范围内匹配,则下一个规则元素在Header
注释上匹配。内联规则在此范围内被记住为“ hc”的Cell
注释上与特征column
具有相同值的匹配。如果匹配包含记为“ k”的Keyword
,则匹配成功。Entry
注释的跨度上创建一个称为“ e”的Cell
注释。Entry
功能keyword
。作为摘要,该规则为不属于标题的每个Entry
注释创建一个Cell
注释,并分配相应列的header关键字以定义条目的类型