How to return text from HTML without tag using python and BeautifulSoup?

Time: 2017-08-30 20:28:45

Tags: python beautifulsoup urllib

I am stuck trying to return text from a website. I am trying to return ownerId and unitId from the following example. Any help is greatly appreciated.

<script>
h1.config.days = "7";
h1.config.hours = "24";
h1.config.color = "blue";
h1.config.ownerId = 7321;
h1.config.locationId = 1258;
h1.config.unitId = "164";
</script>

1 answer:

Answer 0 (score: 1)

You could use Beautiful Soup like so:

#!/usr/bin/env python
from bs4 import BeautifulSoup

html = '''
<script>
h1.config.days = "7";
h1.config.hours = "24";
h1.config.color = "blue";
h1.config.ownerId = 7321;
h1.config.locationId = 1258;
h1.config.unitId = "164";
</script>
'''

soup = BeautifulSoup(html, "html.parser")
jsinfo = soup.find("script")

# Map each variable name (without the h1.config. prefix) to its value.
d = {}
for line in jsinfo.text.split('\n'):
    try:
        d[line.split('=')[0].strip().replace('h1.config.', '')] = line.split('=')[1].lstrip().rstrip(';')
    except IndexError:
        # Lines without '=' (such as blank lines) are skipped.
        pass

print('OwnerId: {}'.format(d['ownerId']))
print('UnitId: {}'.format(d['unitId']))

This will produce the following result:

OwnerId: 7321
UnitId: "164"

In the same way you can access any other variable too, by doing d['variable'].
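Splitting each line on '=' keeps the quotes around string values and cuts off any value that itself contains an '='. A regular expression is one possible alternative. The following is only a minimal sketch, assuming every assignment has the form h1.config.name = value; as in the question; the names ASSIGNMENT and parse_config are made up for illustration, and the script text is passed in as a plain string.

```python
import re

# Matches lines such as: h1.config.ownerId = 7321;
# Group 1 is the variable name, group 2 the value with optional quotes removed.
ASSIGNMENT = re.compile(r'h1\.config\.(\w+)\s*=\s*"?([^";]*)"?\s*;')


def parse_config(script_text):
    """Return a dict mapping each variable name to its value as a string."""
    return dict(ASSIGNMENT.findall(script_text))


script_text = '''
h1.config.days = "7";
h1.config.ownerId = 7321;
h1.config.unitId = "164";
'''

config = parse_config(script_text)
print('OwnerId: {}'.format(config['ownerId']))
print('UnitId: {}'.format(config['unitId']))
```

With Beautiful Soup you would pass the text of the script tag to parse_config instead of the hard-coded string. Note that all values come back as strings, so convert them with int() where you need numbers.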

Now, in case you have to deal with multiple <script> tags, you can iterate through them by doing:

jsinfo = soup.find_all("script")

Now jsinfo is of type <class 'bs4.element.ResultSet'>, which you can iterate through like a normal list.

To extract lat and lon, you could loop over jsinfo and split each tag's text into lines in the same way. By appending more variable names to the ones you look up, you can access some other variables too if you like!
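As a rough sketch of looping over several <script> tags, the example below collects every assignment into one dict and then prints a chosen list of names. The second script tag, the lat and lon values, and the list named wanted are all assumptions made up for illustration; only find_all and the h1.config. prefix come from the thread.

```python
from bs4 import BeautifulSoup

# Hypothetical page with two <script> tags; the lat/lon values are made up.
html = '''
<script>
h1.config.ownerId = 7321;
h1.config.unitId = "164";
</script>
<script>
h1.config.lat = "51.5";
h1.config.lon = "-0.12";
</script>
'''

soup = BeautifulSoup(html, "html.parser")

d = {}
# find_all returns a ResultSet, which can be looped over like a list.
for script in soup.find_all("script"):
    # script.string is None for empty or external scripts, so fall back to ''.
    for line in (script.string or '').split('\n'):
        name, sep, value = line.partition('=')
        if not sep:
            continue  # skip lines without an assignment
        name = name.strip().replace('h1.config.', '')
        d[name] = value.strip().rstrip(';')

# Hypothetical list of names to report; append more to print other variables.
wanted = ['lat', 'lon']
for name in wanted:
    print('{}: {}'.format(name, d[name]))
```

Using partition instead of split keeps everything after the first '=' together, and string values keep their surrounding quotes just as in the single-tag version.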