I'm trying to build an HTML web scraper, and I've hit an obstacle I can't get past.
#![feature(libc)]
#![feature(rustc_private)]
extern crate libc;
extern crate url;
extern crate hyper;
extern crate html5ever;
extern crate serialize;
extern crate html5ever_dom_sink;
#[macro_use]
extern crate tendril;
use tendril::{StrTendril, SliceExt};
use std::ffi::{CStr,CString};
use tendril::{ByteTendril, ReadExt};
use html5ever::{parse, one_input};
use html5ever_dom_sink::common::{Document, Doctype, Text, Comment, Element};
use html5ever_dom_sink::rcdom::{RcDom, Handle};
use hyper::Client;
use hyper::header::Connection;
use std::io::Read;
fn get_page(url: &str) -> String {
    let mut client = Client::new();
    let mut res = client.get(url)
        // set a header
        .header(Connection::close())
        // let 'er go!
        .send().unwrap();
    let mut body = String::new();
    res.read_to_string(&mut body).unwrap();
    body
}
#[no_mangle]
pub extern fn parse_page(url: *const libc::c_char) {
    let url_cstr = unsafe { CStr::from_ptr(url) }; // &std::ffi::c_str::CStr
    let url_and_str = url_cstr.to_str().unwrap(); // &str
    let body = get_page(url_and_str);
    let body_tendril = body.to_tendril();
    let body_tendril = body_tendril.try_reinterpret().unwrap();
    let dom: RcDom = parse(one_input(body_tendril), Default::default());
    // let c_body = CString::new(body).unwrap(); // std::ffi::c_str::CString
    // c_body.into_ptr()
}
When I build this lib with cargo, I get the error:
error: type mismatch resolving `<core::option::IntoIter<tendril::tendril::Tendril<_>> as core::iter::Iterator>::Item == tendril::tendril::Tendril<tendril::fmt::UTF8>`:
expected struct `tendril::tendril::Tendril`,
found a different struct `tendril::tendril::Tendril`
How do I convert the body string into the kind of tendril that parse expects?
Answer 0 (score: 6)
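This indicates that you have multiple versions of the tendril crate compiled and are accidentally trying to mix them. Make sure that everything that depends on tendril depends on the same version of tendril.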
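To see why the compiler can report "expected struct X, found a different struct X", here is a minimal sketch. It is not the real tendril API: the modules `tendril_v1` and `tendril_v2`, the `data` field, and the function `parse_len` are all hypothetical stand-ins for two versions of one crate and for a parser built against one of them. The point it illustrates is that two types with the same name and the same fields are still unrelated types if they come from different definitions.

```rust
// Hypothetical stand-in for "version 1" of a crate.
mod tendril_v1 {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Tendril {
        pub data: String,
    }
}

// Hypothetical stand-in for "version 2" of the same crate.
// The struct is identical in name and layout, but it is a distinct type.
mod tendril_v2 {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Tendril {
        pub data: String,
    }
}

// Stands in for a parser that was compiled against "version 1".
fn parse_len(input: tendril_v1::Tendril) -> usize {
    input.data.len()
}

fn main() {
    let from_v1 = tendril_v1::Tendril { data: String::from("<html></html>") };
    let from_v2 = tendril_v2::Tendril { data: String::from("<html></html>") };

    // Accepted: the argument type matches the parameter type.
    println!("{}", parse_len(from_v1));

    // Rejected at compile time if uncommented: tendril_v2::Tendril is not
    // tendril_v1::Tendril, even though the two definitions look the same.
    // parse_len(from_v2);
    let _ = from_v2;
}
```

With real crates the two types also share the same path, which is why the message in the question prints `tendril::tendril::Tendril` on both sides. On current Cargo, `cargo tree -d` lists packages that appear in the dependency graph at more than one version, which can help find the crate that pulls in the second copy.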