Get data from HTML by key

Date: 2017-12-30 17:11:40

Tags: html swift key-value

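I want to get values from HTML by key. Currently I use SwiftSoup to fetch elements by index. If the order of the elements on the website changes, I no longer get the right value, so I want to retrieve each value by its key instead. Here is a sample of the HTML: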

<table class="Tbl"><tbody><tr><td colspan="4" style="background-color:#D49F3E"><div class="Val" style="text-align:center;color:#540000">Saturday, December 30, 2017</div></td></tr><tr><td style="width:18%"><div class="Key">Sunrise:</div></td><td style="width:32%"><div class="Val">06:30</div></td><td style="width:18%"><div class="Key">Sunset:</div> </td><td style="width:32%"><div class="Val">17:52</div></td></tr>

My current code is:

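do {
    // Parse the page and collect every element whose class attribute is exactly "Val".
    let doc = try SwiftSoup.parse(html as String)
    let elements = try doc.select("[class=Val]").array()
    // Values are read by position, which breaks as soon as the page layout changes.
    let today = try elements[0].text()
    let sunrise = try elements[1].text()
    print(today)
    print(sunrise)
} catch {
    // Report the failure instead of silently swallowing it.
    print(error)
}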

Please suggest how to get the value by its key.

EDIT-1: I tried the code below and got an error. Any suggestions?

let data = try? Data(contentsOf: url! as URL)
        let html = NSString(data: data!, encoding: String.Encoding.utf8.rawValue)!

        do{
            let html1: String = html as String
            let els: Elements = try SwiftSoup.parse(html1).getElementsByClass("Tbl")
            for keyVal: Element in els.array(){
                let keyValText: String = try keyVal.text()
                print(keyValText)
                let components = keyValText
                    .components(separatedBy: .newlines)
                    .filter{
                        !$0.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
                }
                if let index = components.index(where: {$0.contains(find: "Sunrise")}){
                    let sunriseTest = components[components.index(after: index)] //Error : Thread 1: Fatal error: Index out of range
                    print("sunriseTest: ", sunriseTest)
                }else{

                }
            }

        }catch Exception.Error(let type, let message){
            print(message)
        }catch{
            print("error")
        }

1 Answer:

Answer 0: (score: 0)

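You can convert the HTML to a string and split it into components on newlines. To fix the index problem, just filter out the lines that are empty after trimming whitespace: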


Playground testing:

import UIKit  // HTML import for NSAttributedString (use AppKit on macOS)

struct Sun {
    let date: String
    let sunrise: String
    let sunset: String
    init?(string: String) {
        // Split the rendered text into lines and drop the blank ones.
        let components = string
            .components(separatedBy: .newlines)
            .filter {
                !$0.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
        }
        // Expected layout: date, "Sunrise:", value, "Sunset:", value.
        guard components.count > 4 else { return nil }
        date = components[0]
        sunrise = components[2]
        sunset = components[4]
    }
    init?(data: Data) throws {
        self.init(string: try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil).string)
    }
}
let html1 = """
<table class="Tbl"><tbody><tr><td colspan="4" style="background-color:#D49F3E"><div class="Val" style="text-align:center;color:#540000">Saturday, December 30, 2017</div></td></tr><tr><td style="width:18%"><div class="Key">Sunrise:</div></td><td style="width:32%"><div class="Val">06:30</div></td><td style="width:18%"><div class="Key">Sunset:</div> </td><td style="width:32%"><div class="Val">17:52</div></td></tr>
"""
let html2 = """
<table class="Tbl"><tbody><tr><td colspan="4" style="background-color:#D49F3E"><div class="Val" style="text-align:center;color:#540000">Saturday, December 30, 2017</div></td></tr><tr><td style="width:18%"><div class="Key">Sunrise:</div></td><td style="width:32%"><div class="Val">06:30</div></td><td style="width:18%"><div class="Key">&nbsp;</div></td><td style="width:32%"><div class="Val">&nbsp;</div></td><td style="width:18%"><div class="Key">Sunset:</div> </td><td style="width:32%"><div class="Val">17:52</div></td></tr>
"""
  

Today: Saturday, December 30, 2017 Sunrise: 06:30 Sunset: 17:52
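
Since the question asks for lookup by key rather than by position, the line-splitting idea can be taken one step further. The following is only a minimal sketch, not part of the original answer: the function name keyValuePairs(from:) is hypothetical, and it assumes that every key line ends with a colon (as "Sunrise:" and "Sunset:" do in the sample HTML) and that the value is the next non-empty line of the rendered text.

```swift
import Foundation

/// Hypothetical helper: builds a key -> value dictionary from rendered text.
/// Assumption: a line ending in ":" is a key and the next non-empty line is its value.
func keyValuePairs(from text: String) -> [String: String] {
    // Trim every line and drop the ones that end up empty.
    let lines = text
        .components(separatedBy: .newlines)
        .map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }

    var result: [String: String] = [:]
    var index = 0
    while index < lines.count {
        let line = lines[index]
        let next = index + 1
        // Only pair a key with the following line if that line exists
        // and is not itself another key.
        if line.hasSuffix(":"), next < lines.count, !lines[next].hasSuffix(":") {
            result[String(line.dropLast())] = lines[next]
            index += 2
        } else {
            index += 1
        }
    }
    return result
}

// Sample text laid out one cell per line, mirroring the table in the question.
let sample = """
Saturday, December 30, 2017
Sunrise:
06:30

Sunset:
17:52
"""

let pairs = keyValuePairs(from: sample)
print(pairs["Sunrise"] ?? "missing")
print(pairs["Sunset"] ?? "missing")
```

Because the values are looked up by name, extra blank cells or a different column order should not shift the result, and a missing key yields nil instead of an out-of-range crash.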