preg_match_all regular expression

Asked: 2015-05-24 03:33:02

Tags: php regex

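I'm having trouble using a regular expression to capture the HTML contained within a certain range. I'm trying to get it to pick up "safeytrfyh is available!" on NameMC.com, so I can build a quick checker that runs through a predefined list and reports whether each username is available, instead of repeatedly typing in a username and clicking check.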

An example page you can use is https://namemc.com/u/safeytrfyh. I am using cURL:

<?php
//Urls to scrape from.
$URLs = array();
$URLs[] = 'https://namemc.com/u/safeytrfyh';
$working = '';

//Curl scraper.
foreach($URLs as $URL){
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $URL);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    $page = curl_exec($ch);
    $accounts = array();
    preg_match_all('#<div><span[^>]*>(.*?)</span></div>#',$page,$accounts);
    foreach($accounts[0] as $account){
        $working .= ''.$account.''. PHP_EOL . '';
    }
}

//Put the scraped check into the new .txt file.
file_put_contents('accounts.txt', $working, FILE_APPEND);
?>

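2 answers:

Answer 0 (score: 0)

The usually simpler, if less efficient, approach is to traverse the HTML structure with a neat frontend such as QueryPath, e.g. qp($html)->find(".alert-danger .alert-link")->text(). For this specific task, though, that doesn't actually look very reliable.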
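For readers without QueryPath installed, the same kind of traversal can be sketched with PHP's bundled DOM extension. This is only an illustration: the markup below is hypothetical (the real NameMC page may be structured differently), and the XPath expression is a hand-written stand-in for the CSS selector .alert-danger .alert-link.

```php
<?php
// Hypothetical markup: the real NameMC page structure may differ.
$html = '<div class="alert alert-danger">'
      . '<a class="alert-link" href="/u/safeytrfyh">safeytrfyh</a> is available!'
      . '</div>';

// Collect parser warnings internally instead of printing them,
// since real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// XPath stand-in for ".alert-danger .alert-link": pad the class
// attribute with spaces so whole class tokens are matched.
$query = '//*[contains(concat(" ", normalize-space(@class), " "), " alert-danger ")]'
       . '//*[contains(concat(" ", normalize-space(@class), " "), " alert-link ")]';

$names = [];
foreach ($xpath->query($query) as $node) {
    $names[] = trim($node->textContent);
}

print_r($names);
```

Whether the username actually sits inside an element with those classes is an assumption; inspect the page source before relying on it.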

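Now, if for some reason you don't want to look at the HTML source and adjust your regex, or don't know how the placeholders work, then a simpler option is to match against the raw text: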

$text = strip_tags($html);
preg_match_all("/(\w+) \s+ is \s+ available/x", $text, $matches);

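Here \w+ matches word characters, \s+ matches whitespace, and the /x flag is only there for readability: it makes literal spaces in the pattern insignificant.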
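A minimal, self-contained sketch of that raw-text approach follows. The HTML fragment is made up for illustration (including the second, unavailable name), so the real page may need a different pattern.

```php
<?php
// Hypothetical page fragment; the real markup may differ.
$html = '<div><span class="name">safeytrfyh</span> is   available!</div>'
      . '<div><span class="name">notch</span> is unavailable</div>';

// Drop the tags so only the visible text is left to match against.
$text = strip_tags($html);

// With /x, literal spaces in the pattern are ignored, so whitespace
// in the subject has to be matched explicitly with \s+.
preg_match_all('/(\w+) \s+ is \s+ available/x', $text, $matches);

// $matches[0] holds the full matches, $matches[1] the captured names.
print_r($matches[1]);
```

Note that the loop in the question reads index 0 of the result array, which yields the whole matched string; the captured group is at index 1.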

Answer 1 (score: 0)

You can load the page into a DOM object and then extract whatever you want, for example:

    <?php
    $url = "http://stackoverflow.com/";
    $ch  = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); // Skips certificate checks for https pages; only use this on localhost
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_HEADER, 0); // Keep the response headers out of the HTML passed to loadHTML()
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);

    $page = curl_exec($ch); // Can echo to check the page
    curl_close($ch);

    $dom = new DOMDocument();
    @$dom->loadHTML($page); // @ silences warnings about malformed HTML
    $xpath = new DOMXPath($dom);
    $query21 = '//div[@id="question-mini-list"]//h3//a[@class="question-hyperlink"]';
    $nodes21 = $xpath->query($query21);

    $title = "questions.txt";
    $file_title = fopen($title, 'w');

    foreach ($nodes21 as $node21) {
        $tit = trim($node21->nodeValue); // Question heading
        fwrite($file_title, $tit . "\r\n");
    }
    fclose($file_title);
    ?>

Output:

    I have an araay in one file and i want to find the size of it in another file using “sizeof” , i dont want to use any extra variables?
    No Activity found to handle Intent act=android.intent.action.VIEW when trying to play an audio file
    Naming a variable dynamically in Ruby
    uanble to use Bootstrap Notify with angular js in mvc application
    How do I combine a bootstrap carousel with a sidebar menu?
    Stop pausing when mouse hover -Full Slider
    How to Let Recordset #2 in the Same Position as the Similar Recordset#1
    Bash backup script. Read list of files. [OS X]
    extracting multiple columns from mt0 in hspice simulations using awk command
    Can't invoke *method=* type methods in instance_eval
    Couldnt understand the Array behavior in ruby
    Could not connect to sql server using msado15.dll in c++
    Slick 3.0.0 AutoIncrement Composite Key
    Swift Error type 'usersVC' does not conform to protocol 'UITableViewDataSource'
    Installation Error Unknown Failure
    Is it possible to 'emulate' a regular post that loads a new page in angularjs? or plain java as a backup?
    puppet file protocol handle throws Could not evaluate
    Hazard of load address in mips
    how to post multipal files to a url from jscript?
    How to organize the viewmodel of tableview with section in reactiveUI
    CQRS with legacy MSSQL database
    Should I use Blob storage or Azure VM storage for files?
    Copy cell content from a column to another column in matlab
    How do I debug a crash on iOS device from a crash log
    Combobox in windows phone 8.1 not showing 4th and 5th element in emulator
    How to add padding in printing table in F#?
    I don't understand the SpriteAccessor class (Universal Tween Engine)
    mule reliable pattern with file streaming and JMS
    How to tell Faraday to preserve hashbang in site URL?
    maven-license-plugin by mycila (replacing license header)
    Customise `JOptionPane.YES_NO_OPTION`
    AWS: Boto SQS writing isn't saving
    Android expandable listview always scrolls down to bottom
    Inconsistency in TypeConverter behavior?
    Using function as prototype
    Adjust width of inline buttons automatically based on parent width
    GetWeek of Month, Week starts from Monday
    Has anybody tried to recreate UITableViewController with static cells?
    Why shows --“cannot pass objects of non-trivially-copyable type”?
    Search and update a string in a text file in JAVA
    What is Countdown Latch in Java MultiThreading?
    Slim Framework with ORM (Eloquent) connect multiple db
    Why isn't the frame centred in this GUI program when it is run?
    Custom Logout Handler Not Working Grails
    Response to post request to AWS “breaks the pipe”, cannot read
    how to set focus to a SearchBox control in windows 8.1 store app?
    Removing a word from after a string
    need to generate css from scss file on windows 8.1 using gruntjs compass
    Arduino YUN - complex JSON response
    How to use expandable list view in the following scenario
    Unique DB entry to the user
    R : Save big objects to disk then only load parts of them
    What is wrong based on these dbus system bus log files?
    NLP Shift reduce parser is throwing null pointer Exception for Sentiment calculation
    Excel VBA - Combine Rows with duplicate values, merge cells if different
    what's TransactionID and RowID and Roll Point size in InnoDB
    File associations in vscode
    Difference Between IEnumerable Model and Model
    efficient way of passing Data between Matlab functions
    Open new Form in same window silverlight app via c#?
    Hibernate configuration to create hbm and POJO
    FTP Client gives “ECONNREFUSED - Connection refused by server”
    Timer in Selective Repeat ARQ
    Can TXL be used for code clone detection
    MATLAB - Callback after reparenting
    Asynchronous execution with datastax mapper
    Stopping gobbler threads in blocking reads on Process InputStream
    how to get gabor filter image using opencv?
    WebView shows source html with loadDataWithBaseURL, not rendered view
    git merge forked repo to local repo
    Scrapy (Python): Iterating over 'next' page without multiple functions
    android:uiOption=“SplitActionBarWhenNarrow” does not work
    md5 hash a large file incrementally?
    Instagram relationship request endpoint registration issue
    cuda calc distance of two points
    How to share contents of ListView row on facebook in Android?
    how will the socket act when the receiving speed is larger than process speed
    cannot see particle (cocos2d-x 3.5 with Particle Designer2)
    Couldn't find FoodObject without an ID
    CardView and RecyclerView divider behaviour
    Verification google play purchase from server side
    dyld: Symbol not found: _iconv when using javac to compile on MacOS
    R not producing a figure in jupyter (IPython notebook)
    Entity Framework 6 update a table and insert into foreign key related tables
    I have integrated CLIPS with VC++(MFC), why there are some function does't execute,such as “strcmp”
    Using SelectBoxIt in AngularJS Directive
    Where is the Google Information Rights Management API?
    Open Graph in Laravel 5
    CodeIgniter 3 Unable to locate the model you have specified
    how to have a static url for shopify oauth?
    Use AnnotationReader under namespace
    No such .h file or directory(Android, Cocos2d-x, NDK)
    Getting total sum of rows and adding and removing rows using knockoutjs
    Dynamic default value for Kendo Grid
    Ruby's class expression---how is it different from `Class.new`?
    socket.emit is not working in mobile chrome (but it works in incognito mode)