How do I get the href attribute of the a tags in this string?

Asked: 2016-01-17 06:38:59

Tags: c# html string

This string contains a number of li tags. I want to get the href attribute of each a tag, for example:

http://bipardeh94.blogfa.com

http://avaejam.blogfa.com

and so on. I want to do this in C#. How can I do it? I am using the code below, but it is incomplete.

int indexStartUl = _codeHtml.IndexOf("<ul");
int indexEndUl = _codeHtml.IndexOf("</ul>");
_codeHtml = _codeHtml.Substring(indexStartUl, indexEndUl);

Please help.

 <ul class="ull">
        <li><a href="http://bipardeh94.blogfa.com" target="_blank">باغ بلور</a><span class="ur">bipardeh94.blogfa.com</span><span class="ds">فرهنگی-خبری-علمی</span></li>
        <li><a href="http://avaejam.blogfa.com" target="_blank">هزار نکته </a><span class="ur">avaejam.blogfa.com</span><span class="ds"> يك نكته از هزار نكته  باشد تا بعد </span></li>
        <li><a href="http://prkangavar.blogfa.com" target="_blank">روابط عمومی دانشگاه آزاداسلامی کنگاور</a><span class="ur">prkangavar.blogfa.com</span><span class="ds">اخبار دانشگاه</span></li>
        <li><a href="http://bordekhoun.blogfa.com" target="_blank">وبلاگ اطلاع رسانی بردخون</a><span class="ur">bordekhoun.blogfa.com</span><span class="ds">اخباروگزارشات وتحلیل ها درباره بردخون</span></li>
        <li><a href="http://mahinvare.blogfa.com" target="_blank">تدوری های نوین</a><span class="ur">mahinvare.blogfa.com</span><span class="ds">نظریه های علوم انسانی باید متحول شود</span></li>
        <li><a href="http://zanjanuniversity.blogfa.com" target="_blank">دانشگاه زنجان</a><span class="ur">zanjanuniversity.blogfa.com</span><span class="ds">اخبار دانشگاهیان زنجان و دانشگاه آزاد زنجان و سیستم ثبت نام شهردای زنجان </span>
        </li>
    </ul>

4 Answers:

Answer 0 (score: 4)

You can use Selenium WebDriver:

    // requires: using OpenQA.Selenium; using System.Collections.Generic;
    // The href lives on the a tag, so select the a inside each li.
    IList<IWebElement> links = driver.FindElements(By.CssSelector(".ull > li > a"));
    foreach (IWebElement link in links)
    {
        string href = link.GetAttribute("href");
    }

This finds, as WebElements, every a tag inside the li children of the element with class ull, then iterates over the list and reads each href attribute.

Answer 1 (score: 3)

You can use the Html Agility Pack.

  

Html Agility Pack example:

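 // requires: using HtmlAgilityPack;
 HtmlDocument doc = new HtmlDocument();
 doc.Load("file.htm");
 // Select every a tag that has an href attribute.
 foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
 {
    HtmlAttribute att = link.Attributes["href"];
    // FixLink is a helper you write yourself; to only read the link, use att.Value.
    att.Value = FixLink(att);
 }
 doc.Save("file.htm");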
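The sample above rewrites links in a file, whereas the question only needs to read them from a string. A minimal sketch of that variant follows, assuming the HtmlAgilityPack NuGet package is referenced. The class name `HtmlHrefReader`, the method name `GetHrefs`, and the XPath restricted to `ul class="ull"` are illustrative choices, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack" (assumed to be installed)

public static class HtmlHrefReader
{
    // Returns the href of every <a> directly inside an <li> of <ul class="ull">.
    public static List<string> GetHrefs(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file

        var hrefs = new List<string>();
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links =
            doc.DocumentNode.SelectNodes("//ul[@class='ull']/li/a[@href]");
        if (links == null)
        {
            return hrefs;
        }

        foreach (HtmlNode link in links)
        {
            hrefs.Add(link.GetAttributeValue("href", string.Empty));
        }
        return hrefs;
    }

    public static void Main()
    {
        string html =
            "<ul class=\"ull\">" +
            "<li><a href=\"http://bipardeh94.blogfa.com\" target=\"_blank\">one</a></li>" +
            "<li><a href=\"http://avaejam.blogfa.com\" target=\"_blank\">two</a></li>" +
            "</ul>";

        foreach (string href in GetHrefs(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

Checking for null before the loop matters because an input without matching links would otherwise throw a NullReferenceException in the foreach.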
  

Links:

How to use HTML Agility pack

http://www.mikesdotnetting.com/article/273/using-the-htmlagilitypack-to-parse-html-in-asp-net
http://www.codeproject.com/Articles/691119/Html-Agility-Pack-Massive-information-extraction-f

I hope this information helps.

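Answer 2 (score: 1)

For a better understanding:

Substring(a, b)

  • a: the index where your substring starts
  • b: the length of the substring

In your example you pass:

a as the start index of the ul

b as the end index of the ul // wrong: b must be a length, not an end index, so here it counts from the start of the whole string!

What you need to do is:

int c = b - a; // gives you the inner text length

_codeHtml = _codeHtml.Substring(a, c);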
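The correction above can be sketched as a small self-contained program. The names `UlSlicer` and `ExtractUl` are hypothetical. Unlike the two lines above, this sketch also adds the length of the closing tag so the result includes `</ul>`, and it guards against a missing tag.

```csharp
using System;

public static class UlSlicer
{
    // Returns the text from "<ul" up to and including "</ul>",
    // or an empty string when either tag is missing.
    public static string ExtractUl(string html)
    {
        int start = html.IndexOf("<ul", StringComparison.OrdinalIgnoreCase);
        if (start < 0)
        {
            return string.Empty;
        }

        // Search for the closing tag only after the opening tag.
        int end = html.IndexOf("</ul>", start, StringComparison.OrdinalIgnoreCase);
        if (end < 0)
        {
            return string.Empty;
        }

        // The second argument of Substring is a length, not an end index.
        int length = end - start + "</ul>".Length;
        return html.Substring(start, length);
    }

    public static void Main()
    {
        string html = "<div>before</div><ul class=\"ull\"><li>x</li></ul><p>after</p>";
        Console.WriteLine(ExtractUl(html));
    }
}
```

Passing the end index as the length, as in the question, either returns too much text or throws an ArgumentOutOfRangeException when start plus length runs past the end of the string.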

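Answer 3 (score: 0)

Without any external libraries or tools, use the following statement (it needs using System.Linq;):

// The question's HTML uses double-quoted attributes, so split on href=" and cut at the closing ".
var hrefs = html.Split(new[] { "href=\"" }, StringSplitOptions.RemoveEmptyEntries)
                .Where(o => o.StartsWith("http"))
                .Select(o => o.Substring(0, o.IndexOf('"')));

This gives you a sequence containing all the hrefs, like this: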

http://bipardeh94.blogfa.com
http://avaejam.blogfa.com
http://prkangavar.blogfa.com
http://bordekhoun.blogfa.com
http://mahinvare.blogfa.com
http://zanjanuniversity.blogfa.com

Full example: this .NET fiddle
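A self-contained sketch of the same split-based approach is shown below, assuming double-quoted href attributes as in the question's HTML. The names `SplitHrefs` and `GetHrefs` are illustrative and are not taken from the fiddle.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitHrefs
{
    // Splits on href=" and keeps the part of each chunk before the closing quote.
    public static List<string> GetHrefs(string html)
    {
        return html
            .Split(new[] { "href=\"" }, StringSplitOptions.RemoveEmptyEntries)
            // Keep only chunks that begin with an absolute URL and have a closing quote.
            .Where(part => part.StartsWith("http", StringComparison.Ordinal)
                           && part.IndexOf('"') > 0)
            .Select(part => part.Substring(0, part.IndexOf('"')))
            .ToList();
    }

    public static void Main()
    {
        string html =
            "<ul class=\"ull\">" +
            "<li><a href=\"http://bipardeh94.blogfa.com\" target=\"_blank\">one</a></li>" +
            "<li><a href=\"http://avaejam.blogfa.com\" target=\"_blank\">two</a></li>" +
            "</ul>";

        foreach (string href in GetHrefs(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

This kind of string splitting is brittle: it misses single-quoted or unquoted attributes and relative links, so an HTML parser is the safer choice for anything beyond markup you control.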