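I have this spreadsheet. I want to generate some XML manifests from it.
Here is a portion of the spreadsheet:
Here is the XML to be generated, named "mst-5.3_tmp.xml" (the file name is based on the section):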
<?xml version="1.0" encoding="iso-8859-1"?>
<activity type='cxp:jsp'>
<handler>mindtap_mastery</handler>
<!-- Section 5.3 Mastery -->
<group threshold="1" name="Energy and Temperature Change to Specific Heat">
<items>
<item src="owms01h/gen.question.32027" title="Mastery Item 1"/>
</items>
</group>
<group threshold="3" name="Specific Heat to Energy or Temperature">
<items>
<item src="owms01h/gen.question.32040" title="Mastery Item 1"/>
<item src="owms01h/gen.question.32041" title="Mastery Item 2"/>
<item src="owms01h/gen.question.32046" title="Mastery Item 3"/>
<item src="owms01h/gen.question.32048" title="Mastery Item 4"/>
</items>
</group>
<group threshold="2" name="Thermal Equilibrium">
<items>
<item src="owms01h/gen.question.32378" title="Mastery Item 1"/>
<item src="owms01h/gen.question.32380" title="Mastery Item 2"/>
</items>
</group>
<group threshold="2" name="Phase Change Energetics">
<items>
<item src="owms01h/gen.question.3737" title="Mastery Item 1"/>
<item src="owms01h/gen.question.3741" title="Mastery Item 2"/>
<item src="owms01h/gen.question.3752" title="Mastery Item 3"/>
<item src="owms01h/gen.question.3753" title="Mastery Item 4"/>
</items>
</group>
<group threshold="2" name="Heating Curves - Calculations">
<items>
<item src="owms01h/gen.question.5640" title="Mastery Item 1"/>
<item src="owms01h/gen.question.5641" title="Mastery Item 2"/>
<item src="owms01h/gen.question.5642" title="Mastery Item 1"/>
<item src="owms01h/gen.question.5643" title="Mastery Item 2"/>
</items>
</group>
</activity>
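My goal is to export the spreadsheet as a tab-delimited text file and use AWK to create the XML. If there is a value in the "Section" column, a new file should be created. The adjacent "Instructional Unit" column contains the name for the first "group" element. The "items" for that group begin with the entry in the adjacent "Geyser Item Name" column. If the next row has no "Section" or "Instructional Unit" value, it should be added as an item to the current group. If there is an "Instructional Unit" value but no "Section", a new group should be created. And so on.
I'm not sure how to start and end new files, or how to get AWK to skip columns/rows according to the conditions described above.
So far, all I have is a script that creates a single file whose nesting is close to, but not quite, what I described above.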
#!/bin/bash
awk -F "\t" '{
if ($2) {
print "</items>";
print "</group>";
print "</activity>";
print "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>"
print "<activity type='cxp:jsp'>";
print "<handler>mindtap_mastery</handler>";
print "<!--" $2 "-->";
}
if ($3) {
print "<group threshold=\"1\" name=\"" $3 "\">";
print "<items>";
print "<item src=\"owms01h/" $4 "\" title=\"Mastery Item 1\"/>";
} else {
print "<item src=\"owms01h/" $4 "\" title=\"Mastery Item 1\"/>";
}
}' 'Media Grid_Units 1-5.txt' >> master.xml
Answer 0 (score: 1)
You can save this as somefile.awk and run it with awk -F"\t" -f somefile.awk spreadsheet.tab:
NR==1 || !$4 {next} # Skip the header and blank lines
$2 { # New section
if (printingitems) { # close tags
print "</items>" >> filename;
print "</group>" >> filename;
print "</activity>" >> filename;
close(filename); # release the finished file before starting the next one
}
# Build new filename
split($2, part, " ");
filename = "mst-"part[2]"_tmp.xml";
print "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>" >> filename;
print "<activity type='cxp:jsp'>" >> filename;
print "<handler>mindtap_mastery</handler>" >> filename;
print "<!-- " $2 " -->" >> filename;
printingitems = 0;
}
$3 { # New group
if (printingitems) {
print "</items>" >> filename;
print "</group>" >> filename;
}
groupname = substr($3, 5, length($3));
print "<group threshold=\"1\" name=\"" groupname "\">" >> filename;
print "<items>" >> filename;
printingitems = 1;
}
{ # new item
print "<item src=\"owms01h/" $4 "\" title=\"Mastery Item "printingitems++"\"/>" >> filename;
}
END { # this assumes all non-blank lines will have an item
if (filename) { # nothing to close if no section was ever seen
print "</items>" >> filename;
print "</group>" >> filename;
print "</activity>" >> filename;
}
}
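
To try the approach without the real export, here is a minimal, self-contained sketch of the same logic run against a made-up sample. The column layout (column 1 unused, column 2 "Section", column 3 "Instructional Unit", column 4 "Geyser Item Name"), the four-character "IU1 " style prefix on the unit names, the "Section 5.4 Mastery" row, and the file name sample.tab are all assumptions for illustration; adjust them to match the actual spreadsheet. The helper function finish() is not part of the script above, it just gathers the closing-tag logic in one place.

```shell
#!/bin/sh
# Build a small, hypothetical tab-delimited sample (header row + 4 data rows).
{
  printf '%s\t%s\t%s\t%s\n' 'Unit' 'Section' 'Instructional Unit' 'Geyser Item Name'
  printf '%s\t%s\t%s\t%s\n' '' 'Section 5.3 Mastery' 'IU1 Energy and Temperature Change to Specific Heat' 'gen.question.32027'
  printf '%s\t%s\t%s\t%s\n' '' '' 'IU2 Thermal Equilibrium' 'gen.question.32378'
  printf '%s\t%s\t%s\t%s\n' '' '' '' 'gen.question.32380'
  printf '%s\t%s\t%s\t%s\n' '' 'Section 5.4 Mastery' 'IU1 Phase Change Energetics' 'gen.question.3737'
} > sample.tab

# q holds a single quote so the awk program can stay inside shell single quotes.
awk -F '\t' -v q="'" '
# Close the open group (if any); when closefile is true, also close the file.
function finish(closefile) {
    if (n) { print "</items>" > file; print "</group>" > file }
    if (closefile && file) { print "</activity>" > file; close(file) }
    n = 0
}
NR == 1 || $4 == "" { next }   # skip the header row and rows without an item
$2 != "" {                     # a Section value starts a new file
    finish(1)
    split($2, part, " ")       # "Section 5.3 Mastery" -> part[2] is "5.3"
    file = "mst-" part[2] "_tmp.xml"
    print "<?xml version=\"1.0\" encoding=\"iso-8859-1\"?>" > file
    print "<activity type=" q "cxp:jsp" q ">" > file
    print "<handler>mindtap_mastery</handler>" > file
    print "<!-- " $2 " -->" > file
}
$3 != "" {                     # an Instructional Unit value starts a new group
    finish(0)
    # substr($3, 5) drops the assumed 4-character prefix; threshold is hard-coded
    print "<group threshold=\"1\" name=\"" substr($3, 5) "\">" > file
    print "<items>" > file
    n = 1
}
{ print "<item src=\"owms01h/" $4 "\" title=\"Mastery Item " n++ "\"/>" > file }
END { finish(1) }
' sample.tab
```

This variant redirects with > instead of >>. In awk, > truncates a file only the first time the program opens it and later prints to the same open file are appended, so running the script a second time replaces the old output rather than adding a duplicate document to the end of each file. As in the script above, the threshold is always written as 1, since nothing described in the question says which column it would come from.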