While scraping

Time: 2017-06-25 04:04:25

Tags: python string scrapy

I'm currently trying to scrape a YouTube playlist. The scraping works, but I want to get only part of each title.

For example:

  • The video title is:

      

    'et si on mangeait la connaissance? | Idriss Aberkane | TEDxPanthéonSorbonne'

  • From the scrape, I only want to get:

      

    'et si on mangeait la connaissance?'

I want to remove all the characters after the first '|'.

Is that possible?

4 answers:

Answer 0 (score: 0)

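import re

# Lazily capture everything before the first " |" (raw string, so \| is a literal pipe)
p = re.compile(r"(.*?) \|.*")
m = p.search('Et si on mangeait la connaissance? | Idriss Aberkane | TEDxPanthéonSorbonne')

This gives you the string you want:

m[1]  # 'Et si on mangeait la connaissance?'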
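
Note that the pattern above assumes every title contains " |". When it does not, search() returns None and m[1] raises a TypeError. A minimal sketch of a more defensive variant follows, using re.sub so that titles without a pipe pass through unchanged. The helper name title_before_pipe and the constant PIPE_AND_REST are made up for illustration and are not part of the original answer.

```python
import re

# Matches optional whitespace, the first '|', and everything after it.
PIPE_AND_REST = re.compile(r"\s*\|.*", re.DOTALL)


def title_before_pipe(title):
    """Return the part of `title` that precedes the first '|'."""
    # sub() returns the string unchanged when the pattern finds no '|'.
    return PIPE_AND_REST.sub("", title, count=1)


if __name__ == "__main__":
    print(title_before_pipe(
        "Et si on mangeait la connaissance? | Idriss Aberkane | TEDxPanthéonSorbonne"
    ))
    print(title_before_pipe("A title without any separator"))
```

Deleting the unwanted tail instead of capturing the wanted head means there is no match object to check for None.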

Answer 1 (score: 0)

If you are sure the "|" character appears in every title, you can write something like this:

// Keep only the text before the first '|'
String title = "test title | about anything";
String result = "";
int pipe = title.indexOf("|");
if (pipe > -1)
    result = title.substring(0, pipe);

Answer 2 (score: 0)

If you want to remove everything from the first occurrence of '|' onward, you can write the following code:

# The scraped title; wrap it in str() if you need to be sure it is a string.
scrap_result = 'Et si on mangeait la connaissance? | Idriss Aberkane | TEDxPanthéonSorbonne'
# Keep the text before the first '|'; strip() removes the trailing space the slice leaves.
scrap_result = scrap_result[:scrap_result.find("|")].strip()
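
One caveat with find(): it returns -1 when the title contains no '|', so the slice silently drops the last character of the title. A sketch of an alternative uses str.partition, which returns the whole string as its first element when the separator is missing. The function name strip_after_pipe and the sample titles are illustrative assumptions only.

```python
def strip_after_pipe(title, sep="|"):
    """Return the text before the first `sep`, with surrounding whitespace removed."""
    # partition() yields (head, sep, tail); head is the full string if sep is absent.
    head, _, _ = title.partition(sep)
    return head.strip()


if __name__ == "__main__":
    titles = [
        "Et si on mangeait la connaissance? | Idriss Aberkane | TEDxPanthéonSorbonne",
        "A title without any separator",
    ]
    for cleaned in (strip_after_pipe(t) for t in titles):
        print(cleaned)
```

In a scraper, the same function could be applied to each extracted title string before it is stored.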

Answer 3 (score: -1)

Yes, you have two options. Slicing the string:

Replace:

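String = 'Et si on mangeait la connaissance? | Idriss Aberkane | TEDxPanthéonSorbonne'
x = String.index('|')  # position of the first '|'
String = String[:x].rstrip()  # keep the text before it, without the trailing space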