Stream writing between two HTML tags

Time: 2014-05-19 23:15:55

Tags: html vb.net streamreader streamwriter

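In this example I first walk the file directory to find all the files that have not been renamed yet. I identify them by the current year in the file name (the names are generated from a timestamp). Once that is done, those files are extracted to a temp directory. From there I want to read the index page inside each one and write it to a text file, so that I can rename the previously extracted zip based on the information in the index. I'm having trouble because I can't come up with logic accurate enough to isolate the part of the HTML I want to extract. What I do know is: 1. The information sits between the second pair of tags, and the first three words after the tag are "Person record for". Any help isolating and writing this tag would be greatly appreciated.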

-Method I have already tried: stripping out all the HTML completely, but I found that inconsistent and tedious.
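Rather than stripping every tag, one option is to target only the heading that begins with "Person record for". The following is a minimal sketch, not a definitive solution. It assumes the heading is an `<h2>` element, as in the sample page in this post, and that the whole file fits in memory. `FindPersonRecord` and the module name are hypothetical names introduced for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module PersonRecordSketch

    ' Hypothetical helper: returns the inner text of the first <h2> element
    ' whose text starts with "Person record for", or Nothing if none is found.
    ' Assumption: the heading is an <h2>, as in the sample page.
    Function FindPersonRecord(html As String) As String
        Dim m As Match = Regex.Match(html,
            "<h2[^>]*>\s*(Person record for.*?)</h2>",
            RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If Not m.Success Then Return Nothing
        ' Collapse runs of whitespace so the result is one clean line.
        Return Regex.Replace(m.Groups(1).Value, "\s+", " ").Trim()
    End Function

    Sub Main()
        Dim sample As String = "<body>" & Environment.NewLine &
            "<h2> Person record for  Jon D. Doe  ( PID:  2813308004 )  </h2>" & Environment.NewLine &
            "<b>  Gender: </b>Male"
        Console.WriteLine(FindPersonRecord(sample))
    End Sub

End Module
```

In the real program the `html` argument could come from `File.ReadAllText` on the extracted INDEX.HTM. A regular expression is only reasonable here because the target is a single, predictable heading; it is not a general HTML parser.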

Option Explicit On
Option Strict Off

Imports System
Imports System.Text
Imports System.IO
Imports System.Xml
Imports System.Diagnostics
Imports System.IO.Compression

Module Module1

Sub Main()

    Dim year As String
    year = Date.Today.Year
    Dim sc As New Shell32.Shell()
    'Dim EFMD As String() = Directory.GetFiles("C:\Users\Pepper\Desktop\In")
    Dim di As DirectoryInfo = New DirectoryInfo("C:\Users\Pepper\Desktop\In")
    For Each fi In di.GetFiles("*" + year + "*")
        'Dim startPath As String = "c:\example\start"
        Dim zipPath As String = "C:\Users\Pepper\Desktop\In\" + fi.ToString
        Dim extractPath As String = "C:\Users\Pepper\Desktop\Out\" + fi.ToString + "\"
        ZipFile.ExtractToDirectory(zipPath, extractPath)

        Console.WriteLine(fi)
    Next
    Console.WriteLine()

    Dim di_t As DirectoryInfo = New DirectoryInfo("C:\Users\Pepper\Desktop\Out")
    For Each fi In di_t.GetFiles("*" + year + "*")
        Dim g As String = "C:\Users\Pepper\Desktop\OUT\" + fi.ToString + "\INDEX.HTM"
        Dim h As String = "C:\Users\Pepper\Desktop\OUT\" + fi.ToString + "\INDEX" + fi.ToString + ".TXT"
        Dim sw As StreamWriter
        Dim sr As StreamReader = New StreamReader(g)
        sw = New StreamWriter(h)
        Dim line As String
        Do
            line = sr.ReadLine
            sw.WriteLine(line)
        Loop Until line.Trim = 'need logic here
        sw.WriteLine(line)
        sr.Close()
        sw.Close() 'close the writer too, so buffered output is flushed to the text file
    Next
End Sub

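End Module
'Unzip the file that identifies the person; once the name and person ID are verified, rename the source file.
'Go back and delete the extracted files; put the files in a temp folder.
'Read/write the file until we get the person's name, so we can use it to rename.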
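One way the unfinished `Loop Until` condition might be handled is to scan line by line and write only the line that contains the marker text. This is a sketch under assumptions: the heading sits on a single line, as in the sample page, and `CopyPersonLine` is a hypothetical helper. It takes a `TextReader` and a `TextWriter` so it can be tried with in-memory strings before wiring in `StreamReader` and `StreamWriter`.

```vbnet
Imports System
Imports System.IO

Module ReadUntilHeadingSketch

    ' Hypothetical helper: scans the input line by line and writes only the
    ' first line containing "Person record for". Returns True if it was found.
    Function CopyPersonLine(reader As TextReader, writer As TextWriter) As Boolean
        Dim line As String = reader.ReadLine()
        ' ReadLine returns Nothing at end of input, so check before using the line.
        While line IsNot Nothing
            If line.IndexOf("Person record for", StringComparison.OrdinalIgnoreCase) >= 0 Then
                writer.WriteLine(line.Trim())
                Return True
            End If
            line = reader.ReadLine()
        End While
        Return False
    End Function

    Sub Main()
        Dim html As String = "<body>" & Environment.NewLine &
            "<h2> Person record for  Jon D. Doe  ( PID:  2813308004 )  </h2>" & Environment.NewLine &
            "<b>  Gender: </b>Male"
        ' Using blocks dispose the reader and writer even if an exception occurs.
        Using reader As New StringReader(html), writer As New StringWriter()
            Console.WriteLine(CopyPersonLine(reader, writer))
            Console.Write(writer.ToString())
        End Using
    End Sub

End Module
```

With files, the same call would take a `StreamReader` over INDEX.HTM and a `StreamWriter` over the text file, both wrapped in `Using` so the output is flushed and closed.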

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:voc="urn:hl7-org:v3/voc" xmlns:n2="urn:hl7-org:v3/meta/voc" xmlns:n1="urn:hl7-org:v3">
<head>
<title>Person Identification Sheet</title>
<style type="text/css">
        body
        {
            font-family: Arial, Helvetica, sans-serif;
            font-size: 12px;
            color: black;
        }
        td
        {
            font-size: 12px;
        }
        h2
        {
            font-size: 14px;
        }

        .dHeader
        {
            background-color: #FFFFFF;
        }
        .dFooter
        {
            background-color: #e4e7f1;
        }
        .dSectionTitle
        {
            font-weight: bold;
            font-size: 14px;
        }
        .dTable
        {
            border: 0px #ffffff solid;
            border-collapse: collapse;
        }
        .dTableHeading
        {
            font-weight: bold;
            font-size: 12px;
            text-align: left;
        }
        .dTableHeadingCell
        {
            padding-right: 20px;                
        }
        .dTableRow0
        {
            background-color: #f6f6f6;
        }
        .dTableRow1
        {
        }
        .dTableCell
        {
            padding-right: 20px;                
        }
        .pHeader
        {
            background-color: #2f3c6e;
        }
        .pHeaderLabel
        {
            font-weight: normal;
            color: white;
        }
        .pHeaderValue
        {
            font-weight: bold;
            color: white;
        }
        .pHeaderName
        {
            font-size: 22px;
            font-weight: bold;
            color: white;
        }


    </style>
</head>
<body>
<h2> Person record for  Jon D. Doe  ( PID:  2813308004 )  </h2>
<b>  Gender: </b>Male<b>  DOB: </b>June 10, 2011<br />
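Given a heading like the one in the sample above, the name and PID could be pulled out and combined into a base name for the rename step. This is a rough sketch that assumes the heading always has the form "Person record for NAME ( PID: DIGITS )". `BuildBaseName` and the "name_pid" format are hypothetical choices, not something the post specifies.

```vbnet
Imports System
Imports System.IO
Imports System.Text.RegularExpressions

Module RenameFromHeadingSketch

    ' Hypothetical helper: builds "<name>_<pid>" from the heading text, or
    ' returns Nothing when the heading does not have the expected shape.
    Function BuildBaseName(heading As String) As String
        Dim m As Match = Regex.Match(heading,
            "Person record for\s+(.+?)\s*\(\s*PID:\s*(\d+)\s*\)",
            RegexOptions.IgnoreCase)
        If Not m.Success Then Return Nothing
        Dim baseName As String = m.Groups(1).Value.Trim() & "_" & m.Groups(2).Value
        ' Replace any character that is not allowed in a file name.
        For Each c As Char In Path.GetInvalidFileNameChars()
            baseName = baseName.Replace(c, "_"c)
        Next
        Return baseName
    End Function

    Sub Main()
        Dim heading As String = "<h2> Person record for  Jon D. Doe  ( PID:  2813308004 )  </h2>"
        Console.WriteLine(BuildBaseName(heading))
    End Sub

End Module
```

For the sample heading this should yield `Jon D. Doe_2813308004`, which could then be passed to `File.Move` together with the original zip's extension.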

0 answers:

No answers