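I am new to working with XML files, and I would like to know how to split an XML file into two files using VB.
My main problem with the XML file is that it is too large to upload. I am hoping that splitting it into two parts will solve this. For example, a 34 KB XML file split in two would give two XML files of 17 KB each.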
Dim doc As XDocument
doc = XDocument.Load("XMLSplit/Directory.xml")
' 1 grab the file size
' 2 divide file size by 2
' 3 find half way of the xml file
' 4 split into two
' 5 save split files as Directory1.xml and Directory2.xml
Directory.xml
<Directory>
<Person>
<Name> John / </Name>
<age> 24 </age>
<DOB>
<year> 1990 </year>
<month> 03 </month>
<date> 23 </date>
</DOB>
</Person>
<Person>
<Name> Jane / </Name>
<age> 21 </age>
<DOB>
<year> 1993 </year>
<month> 04 </month>
<date> 25 </date>
</DOB>
</Person>
</Directory>
Answer 0 (score: 0)
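You do not need to treat the file as XML. Handling it as plain text should be fine, as long as the parts are joined back together before anything parses them as XML. You can use the String.Substring method to take part of a string.
One possible way to split a string into equal parts (two in this case) is shown below; the length to take is half the string length, which gives two equal parts: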
Private Function chunkify(ByVal source As String, ByVal length As Integer) As List(Of String)
    ' A non-positive length would never advance through a non-empty string.
    If length <= 0 AndAlso source.Length > 0 Then Throw New ArgumentOutOfRangeException("length")
    Dim chunks = New List(Of String)
    Dim pos = 0
    While pos < source.Length
        ' Take a full chunk, or whatever is left at the end of the string.
        Dim toTake = Math.Min(length, source.Length - pos)
        chunks.Add(source.Substring(pos, toTake))
        pos += toTake
    End While
    Return chunks
End Function
Call chunkify on the string, passing the length you want each part to have (the parts are returned in a list of strings):
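' File lives in System.IO. VB string literals do not need escaped backslashes.
Dim content = File.ReadAllText("d:\xml.xml")
' Integer division, rounded up, so an odd length still yields two parts.
Dim chunks = chunkify(content, (content.Length + 1) \ 2)
For Each chunk In chunks
    Console.WriteLine(chunk)
Next chunk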
With your content, the output is:
<?xml version="1.0"?>
<Directory>
<Person>
<Name> John / </Name>
<age> 24 </age>
<DOB>
<year> 1990 </year>
<month> 03 </month>
<date> 23 </date>
</DOB>
' here is the new line from the Console.WriteLine
</Person>
<Person>
<Name> Jane / </Name>
<age> 21 </age>
<DOB>
<year> 1993 </year>
<month> 04 </month>
<date> 25 </date>
</DOB>
</Person>
</Directory>
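To end up with the two files the question asks for (Directory1.xml and Directory2.xml), each part can be written out with File.WriteAllText. The following is only a minimal sketch: the input path is taken from the question, the module name SplitToFiles is made up for illustration, and it assumes that text-encoding details such as a byte order mark do not matter for your upload. The final check confirms that joining the parts in order gives back the original text.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module SplitToFiles
    ' Same idea as chunkify above: cut the text into pieces of at most "length" characters.
    Function Chunkify(source As String, length As Integer) As List(Of String)
        Dim chunks As New List(Of String)
        Dim pos = 0
        While pos < source.Length
            Dim toTake = Math.Min(length, source.Length - pos)
            chunks.Add(source.Substring(pos, toTake))
            pos += toTake
        End While
        Return chunks
    End Function

    Sub Main()
        ' Path taken from the question; adjust it to your own file.
        Dim inputPath = "XMLSplit/Directory.xml"
        Dim content = File.ReadAllText(inputPath)

        ' Round up so an odd length still yields exactly two parts.
        Dim parts = Chunkify(content, Math.Max(1, (content.Length + 1) \ 2))

        ' Write Directory1.xml, Directory2.xml, ... next to the input file.
        Dim folder = Path.GetDirectoryName(Path.GetFullPath(inputPath))
        For i = 0 To parts.Count - 1
            Dim fileName = "Directory" & (i + 1).ToString() & ".xml"
            File.WriteAllText(Path.Combine(folder, fileName), parts(i))
        Next

        ' Sanity check: the parts joined in order must equal the original text.
        Console.WriteLine(String.Concat(parts) = content)
    End Sub
End Module
```

Whoever receives the files has to concatenate them in the same order before parsing the result as XML.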
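I would suggest converting the XML to bytes and then splitting the bytes into equal parts (in this case, half of the length each), because that may be better suited to transfer. One possible solution for the split function looks like this: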
Function chunkify(ByVal source As Byte(), ByVal length As Integer) As List(Of Byte())
    ' Result list containing all parts.
    Dim chunks = New List(Of Byte())
    ' The first chunk of content.
    Dim chunk = source.Take(length).ToArray()
    Do ' Loop as long as there is something left in the array.
        chunks.Add(chunk)
        ' Remove the content that has already been read.
        source = source.Skip(length).ToArray()
        ' Is there more to take?
        chunk = source.Take(length).ToArray()
    Loop While chunk.Length > 0
    Return chunks
End Function
Usage:
' Read all bytes.
Dim content = File.ReadAllBytes("d:\xml.xml")
' Split into equal parts; integer division, rounded up, so an odd length still yields two parts.
Dim chunks = chunkify(content, (content.Length + 1) \ 2)
' Print / handle each part.
For Each chunk In chunks
    Console.WriteLine(System.Text.Encoding.UTF8.GetString(chunk))
    Console.WriteLine("==================================")
Next chunk
With your sample XML, the output after splitting is as expected:
<?xml version="1.0"?>
<Directory>
<Person>
<Name> John / </Name>
<age> 24 </age>
<DOB>
<year> 1990 </year>
<month> 03 </month>
<date> 23 </date>
</DOB>
==================================
</Person>
<Person>
<Name> Jane / </Name>
<age> 21 </age>
<DOB>
<year> 1993 </year>
<month> 04 </month>
<date> 25 </date>
</DOB>
</Person>
</Directory>
==================================
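Neither part above is a well-formed XML document on its own: the first one never closes <Person> or <Directory>, and the second one starts with a closing tag. That is fine if the parts are only transferred and then joined again. If each uploaded file has to parse by itself, one alternative is to split on the child elements of the root instead of on characters or bytes. This is only a sketch under a few assumptions: the file paths come from the question, the names SplitByElement and SplitInTwo are made up for illustration, any attributes on the root element are not copied, and the split is by element count, so the two files are only roughly equal in size.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module SplitByElement
    ' Builds two documents that each keep the original root name and
    ' about half of the root's child elements (here: the <Person> entries).
    Function SplitInTwo(doc As XDocument) As XDocument()
        Dim children = doc.Root.Elements().ToList()
        ' Round up, so the first document receives the extra element if the count is odd.
        Dim half = (children.Count + 1) \ 2

        ' Elements that already belong to a document are copied when added to a new parent.
        Dim firstDoc = New XDocument(New XElement(doc.Root.Name, children.Take(half)))
        Dim secondDoc = New XDocument(New XElement(doc.Root.Name, children.Skip(half)))
        Return {firstDoc, secondDoc}
    End Function

    Sub Main()
        Dim doc = XDocument.Load("XMLSplit/Directory.xml")
        Dim parts = SplitInTwo(doc)
        parts(0).Save("XMLSplit/Directory1.xml")
        parts(1).Save("XMLSplit/Directory2.xml")
    End Sub
End Module
```

With the sample Directory.xml, this would put the first <Person> into Directory1.xml and the second into Directory2.xml, each wrapped in its own <Directory> root.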