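I'm having trouble using a regular expression to capture the HTML contained within a certain range.
I'm trying to get it to pick up "safeytrfyh is available!" on NameMC.com,
in order to make a quick checker that runs through a pre-specified list to see whether each username is available, instead of repeatedly typing in a username and clicking check.
An example page you can use is https://namemc.com/u/safeytrfyh
I am using cURL: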
<?php
// URLs to scrape.
$URLs = array();
$URLs[] = 'https://namemc.com/u/safeytrfyh';
$working = '';
// cURL scraper.
foreach ($URLs as $URL) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $URL);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $page = curl_exec($ch);
    $accounts = array();
    preg_match_all('#<div><span[^>]*>(.*?)</span></div>#', $page, $accounts);
    foreach ($accounts[0] as $account) {
        $working .= $account . PHP_EOL;
    }
}
// Append the scraped results to accounts.txt.
file_put_contents('accounts.txt', $working, FILE_APPEND);
?>
Answer 0 (score: 0)
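A simpler, though usually less efficient, approach is to traverse the HTML structure with a neat front end such as QueryPath, e.g. qp($html)->find(".alert-danger .alert-link")->text(). It doesn't actually look very reliable for this specific task, though.
Now, if for some reason you don't want to look at the HTML source, adapt the regex, or don't know how placeholders work, then a simpler option is to match against the raw text: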
$text = strip_tags($html);
preg_match_all("/(\w+) \s+ is \s+ available/x", $text, $matches);
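where \w+ matches word characters, \s+ matches whitespace, and the /x flag is there for readability: it makes the regex engine ignore the literal spaces in the pattern.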
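As a minimal sketch of the raw-text approach above, the snippet below wraps the two lines in a function and runs it on a made-up HTML fragment. The function name findAvailableNames and the sample markup are assumptions for illustration; the real NameMC page may be structured differently.

```php
<?php
// Hypothetical helper: returns every word that is followed by "is available" in the page text.
function findAvailableNames(string $html): array
{
    // Drop the markup so only the text content is left.
    $text = strip_tags($html);

    // /x ignores the literal spaces in the pattern; \s+ matches the real whitespace.
    preg_match_all('/(\w+) \s+ is \s+ available/x', $text, $matches);

    // $matches[1] holds the first capture group, i.e. the names.
    return $matches[1];
}

// Made-up sample markup (an assumption, not copied from the real page).
$sample = '<div class="alert"><a href="#">safeytrfyh</a> is available!</div>';
print_r(findAvailableNames($sample));
```

Note that strip_tags() removes tags without inserting any whitespace, so this only works when the name and "is available" are already separated by whitespace in the markup.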
Answer 1 (score: 0)
You can convert the page into a DOM object and then extract whatever you want, like this:
<?php
$url = "http://stackoverflow.com/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Disables certificate verification for https pages; only do this when testing on localhost.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Keep the response headers out of $page so that only the HTML is parsed.
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$page = curl_exec($ch); // echo $page to check what was fetched

// Parse the HTML; @ suppresses warnings about malformed markup.
$dom = new DOMDocument();
@$dom->loadHTML($page);
$xpath = new DOMXPath($dom);

// Select the question links on the front page.
$query21 = '//div[@id="question-mini-list"]//h3//a[@class="question-hyperlink"]';
$nodes21 = $xpath->query($query21);

// Write one question title per line.
$title = "questions.txt";
$file_title = fopen($title, 'w');
foreach ($nodes21 as $node21) {
    $tit = trim($node21->nodeValue); // heading text
    fwrite($file_title, $tit . "\r\n");
}
fclose($file_title);
?>
Output:
I have an araay in one file and i want to find the size of it in another file using “sizeof” , i dont want to use any extra variables?
No Activity found to handle Intent act=android.intent.action.VIEW when trying to play an audio file
Naming a variable dynamically in Ruby
uanble to use Bootstrap Notify with angular js in mvc application
How do I combine a bootstrap carousel with a sidebar menu?
Stop pausing when mouse hover -Full Slider
How to Let Recordset #2 in the Same Position as the Similar Recordset#1
Bash backup script. Read list of files. [OS X]
extracting multiple columns from mt0 in hspice simulations using awk command
Can't invoke *method=* type methods in instance_eval
Couldnt understand the Array behavior in ruby
Could not connect to sql server using msado15.dll in c++
Slick 3.0.0 AutoIncrement Composite Key
Swift Error type 'usersVC' does not conform to protocol 'UITableViewDataSource'
Installation Error Unknown Failure
Is it possible to 'emulate' a regular post that loads a new page in angularjs? or plain java as a backup?
puppet file protocol handle throws Could not evaluate
Hazard of load address in mips
how to post multipal files to a url from jscript?
How to organize the viewmodel of tableview with section in reactiveUI
CQRS with legacy MSSQL database
Should I use Blob storage or Azure VM storage for files?
Copy cell content from a column to another column in matlab
How do I debug a crash on iOS device from a crash log
Combobox in windows phone 8.1 not showing 4th and 5th element in emulator
How to add padding in printing table in F#?
I don't understand the SpriteAccessor class (Universal Tween Engine)
mule reliable pattern with file streaming and JMS
How to tell Faraday to preserve hashbang in site URL?
maven-license-plugin by mycila (replacing license header)
Customise `JOptionPane.YES_NO_OPTION`
AWS: Boto SQS writing isn't saving
Android expandable listview always scrolls down to bottom
Inconsistency in TypeConverter behavior?
Using function as prototype
Adjust width of inline buttons automatically based on parent width
GetWeek of Month, Week starts from Monday
Has anybody tried to recreate UITableViewController with static cells?
Why shows --“cannot pass objects of non-trivially-copyable type”?
Search and update a string in a text file in JAVA
What is Countdown Latch in Java MultiThreading?
Slim Framework with ORM (Eloquent) connect multiple db
Why isn't the frame centred in this GUI program when it is run?
Custom Logout Handler Not Working Grails
Response to post request to AWS “breaks the pipe”, cannot read
how to set focus to a SearchBox control in windows 8.1 store app?
Removing a word from after a string
need to generate css from scss file on windows 8.1 using gruntjs compass
Arduino YUN - complex JSON response
How to use expandable list view in the following scenario
Unique DB entry to the user
R : Save big objects to disk then only load parts of them
What is wrong based on these dbus system bus log files?
NLP Shift reduce parser is throwing null pointer Exception for Sentiment calculation
Excel VBA - Combine Rows with duplicate values, merge cells if different
what's TransactionID and RowID and Roll Point size in InnoDB
File associations in vscode
Difference Between IEnumerable Model and Model
efficient way of passing Data between Matlab functions
Open new Form in same window silverlight app via c#?
Hibernate configuration to create hbm and POJO
FTP Client gives “ECONNREFUSED - Connection refused by server”
Timer in Selective Repeat ARQ
Can TXL be used for code clone detection
MATLAB - Callback after reparenting
Asynchronous execution with datastax mapper
Stopping gobbler threads in blocking reads on Process InputStream
how to get gabor filter image using opencv?
WebView shows source html with loadDataWithBaseURL, not rendered view
git merge forked repo to local repo
Scrapy (Python): Iterating over 'next' page without multiple functions
android:uiOption=“SplitActionBarWhenNarrow” does not work
md5 hash a large file incrementally?
Instagram relationship request endpoint registration issue
cuda calc distance of two points
How to share contents of ListView row on facebook in Android?
how will the socket act when the receiving speed is larger than process speed
cannot see particle (cocos2d-x 3.5 with Particle Designer2)
Couldn't find FoodObject without an ID
CardView and RecyclerView divider behaviour
Verification google play purchase from server side
dyld: Symbol not found: _iconv when using javac to compile on MacOS
R not producing a figure in jupyter (IPython notebook)
Entity Framework 6 update a table and insert into foreign key related tables
I have integrated CLIPS with VC++(MFC), why there are some function does't execute,such as “strcmp”
Using SelectBoxIt in AngularJS Directive
Where is the Google Information Rights Management API?
Open Graph in Laravel 5
CodeIgniter 3 Unable to locate the model you have specified
how to have a static url for shopify oauth?
Use AnnotationReader under namespace
No such .h file or directory(Android, Cocos2d-x, NDK)
Getting total sum of rows and adding and removing rows using knockoutjs
Dynamic default value for Kendo Grid
Ruby's class expression---how is it different from `Class.new`?
socket.emit is not working in mobile chrome (but it works in incognito mode)
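The same DOMXPath technique could be pointed at the question's page instead. The sketch below is only an illustration: it assumes the markup implied by the first answer's selector (.alert-danger .alert-link), uses a made-up HTML fragment, and the helper name textByNestedClass is hypothetical. The real page would need to be inspected before relying on it.

```php
<?php
// Hypothetical helper: text of elements with class $inner nested inside elements with class $outer.
// Assumes simple class names without quotes or spaces.
function textByNestedClass(string $html, string $outer, string $inner): array
{
    $dom = new DOMDocument();
    // Collect parser warnings about sloppy HTML internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // XPath 1.0 has no class selector, so match whole class tokens by padding with spaces.
    $hasClass = "contains(concat(' ', normalize-space(@class), ' '), ' %s ')";
    $query = sprintf("//*[$hasClass]//*[$hasClass]", $outer, $inner);

    $xpath = new DOMXPath($dom);
    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Made-up sample markup (an assumption, not copied from the real page).
$sample = '<div class="alert alert-danger"><a class="alert-link" href="#">safeytrfyh</a> is available!</div>';
print_r(textByNestedClass($sample, 'alert-danger', 'alert-link'));
```

Matching on class tokens rather than on the exact class attribute string keeps the query working when an element carries several classes.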