Zebra cURL - a high performance cURL PHP library allowing the running of multiple asynchronous requests at once

The number of seconds to wait between processing batches of requests.

If the value of this property is greater than 0, the library will process as many requests as defined by the threads property and then wait for pause_interval seconds before processing the next batch of requests.

Default is 0 (the library will keep as many parallel threads as defined by threads running at all times until there are no more requests to process).

Tags

since:

1.3.0

The number of parallel, asynchronous requests to be processed by the library, at once.

// process 30 simultaneous requests at once
$curl->threads = 30;

Note that unless pause_interval is set to a value greater than 0, the library will process a constant number of requests at all times; it does this by starting a new request as soon as another one finishes.

If pause_interval is set to a value greater than 0, the library will process as many requests as set by the threads property and then wait for pause_interval seconds before processing the next batch of requests.

Default is 10
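
To make the interaction between threads and pause_interval concrete, here is a minimal sketch of how a queue of requests would be grouped into batches when pause_interval is greater than 0. It is an illustration only and not the library's internal code: plan_batches is a hypothetical helper, the URLs are placeholders, and it assumes the pause happens between batches but not after the last one.

```php
<?php
// Illustration only: this is NOT Zebra_cURL's internal code.
// It models how queued URLs would be grouped into batches
// when pause_interval is greater than 0.

/**
 * Split a queue of URLs into batches of at most $threads items.
 * (plan_batches is a hypothetical helper name.)
 */
function plan_batches(array $urls, int $threads): array
{
    // array_chunk() requires a chunk size of at least 1
    return array_chunk($urls, max(1, $threads));
}

// build 7 placeholder URLs
$urls = array();
for ($i = 1; $i <= 7; $i++) {
    $urls[] = 'https://example.com/page-' . $i;
}

$threads        = 3; // requests per batch
$pause_interval = 2; // seconds to wait between batches

$batches = plan_batches($urls, $threads);

foreach ($batches as $index => $batch) {
    echo 'batch ' . ($index + 1) . ': ' . count($batch) . " request(s)\n";
}

// assumption: a pause separates batches, so there is none after the last one
$total_pause = (count($batches) - 1) * $pause_interval;
echo 'total pause: ' . $total_pause . " second(s)\n";
```

With 7 requests and threads set to 3, this yields batches of 3, 3 and 1 requests. With pause_interval left at 0 there are no batches at all, as a new request starts whenever another one finishes.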

void __construct ( [ boolean $htmlentities = true ] )

Constructor of the class.

Below is the list of default options set by the library when instantiated. Various methods of the library may overwrite some of these options when called (see delete, download, ftp_download, get, header, post, put). The value of any of these options may also be changed with the option method. For a full list of available options and their description, consult the PHP documentation.

CURLINFO_HEADER_OUT - the last request string sent
default: TRUE

CURLOPT_AUTOREFERER - TRUE to automatically set the "Referer:" field in requests where it follows a "Location:" redirect
default: TRUE

CURLOPT_COOKIEFILE - the name of the file containing the cookie data. the cookie file can be in Netscape format, or just plain HTTP-style headers dumped into a file. if the name is an empty string, no cookies are loaded, but cookie handling is still enabled
default: an empty string

CURLOPT_CONNECTTIMEOUT - the number of seconds to wait while trying to connect
default: 10 (use 0 to wait indefinitely)

CURLOPT_ENCODING - the contents of the "Accept-Encoding: " header. this enables decoding of the response. supported encodings are identity, deflate, and gzip. if an empty string is set, a header containing all supported encoding types is sent
default: gzip,deflate

CURLOPT_FOLLOWLOCATION - TRUE to follow any "Location:" header that the server sends as part of the HTTP header (note this is recursive, PHP will follow as many "Location:" headers as it is sent, unless CURLOPT_MAXREDIRS is set - see below)
default: TRUE

CURLOPT_HEADER - TRUE to include the header in the output
default: TRUE

CURLOPT_MAXREDIRS - the maximum number of HTTP redirects to follow. use this option alongside CURLOPT_FOLLOWLOCATION - see above
default: 50

CURLOPT_RETURNTRANSFER - TRUE to return the transfer's body as a string instead of outputting it directly
default: TRUE

CURLOPT_SSL_VERIFYHOST - 1 to check the existence of a common name in the SSL peer certificate. 2 to check the existence of a common name and also verify that it matches the hostname provided. 0 to not check the names
see the ssl method for more info
default: TRUE

CURLOPT_SSL_VERIFYPEER - FALSE to stop cURL from verifying the peer's certificate
see the ssl method for more info
default: TRUE

CURLOPT_TIMEOUT - the maximum number of seconds to allow cURL functions to execute
default: 10

CURLOPT_USERAGENT - a (slightly) random user agent (Internet Explorer 9 or 10, on Windows Vista, 7 or 8, with other extra strings). Some web services will not respond unless a valid user-agent string is provided

Arguments

boolean

$htmlentities

Optional Instructs the script whether the response body returned by the get and post methods should be run through PHP's htmlentities function.

Default is TRUE

void cache ( mixed $path , [ integer $lifetime = 3600 ] , [ boolean $compress = true ] , [ integer $chmod = 0755 ] )

Enables caching of request results.

Note that, in the case of downloads, only the actual request is cached and not the associated downloads.

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// cache results in the "cache" folder and for 86400 seconds (24 hours)
$curl->cache('cache', 86400);
// fetch the RSS feeds of some popular tech-related websites
// and execute a callback function for each request, as soon as it finishes
$curl->get(array(
    'https://alistapart.com/main/feed/',
    'https://www.smashingmagazine.com/feed/',
    'https://code.tutsplus.com/posts.atom',
// the callback function receives as argument an object with 4 properties
// (info, headers, body and response)
), function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

Arguments

mixed	$path	Path where cache files are to be stored. Setting this to `FALSE` will disable caching. If set to a non-existing path, the library will try to create the folder and will trigger an error if, for whatever reason, it is unable to do so. If the folder can be created, its permissions will be set to the value of the $chmod argument.
integer	$lifetime	Optional The number of seconds after which the cache will be considered expired. Default is `3600` (one hour).
boolean	$compress	Optional If set to `TRUE`, cache files will be gzcompress-ed so that they occupy less disk space. Default is `TRUE`.
integer	$chmod	Optional The file system permissions to be set for newly created cache files. I suggest using the value `0755` but, if you know what you are doing, here is how you can calculate the permission levels:

400 Owner Read
200 Owner Write
100 Owner Execute
40 Group Read
20 Group Write
10 Group Execute
4 Global Read
2 Global Write
1 Global Execute

Default is `0755`.
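
The permission levels listed for $chmod are octal values that are added together. As a small sketch (the variable names are only illustrative), this is how the suggested value of 0755 is obtained, and why the leading 0 matters when writing it in PHP:

```php
<?php
// Each permission is a single bit; the leading 0 makes these octal literals.
$owner  = 0400 | 0200 | 0100; // read + write + execute
$group  = 0040 | 0010;        // read + execute
$global = 0004 | 0001;        // read + execute

$chmod = $owner | $group | $global;

// print the result in octal notation
echo sprintf('%04o', $chmod), "\n"; // prints 0755

// without the leading 0, PHP reads 755 as a decimal number
var_dump(755 === 0755); // prints bool(false)
```

Passing 755 instead of 0755 would therefore result in a different, most likely unintended, set of permissions.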

void cookies ( string $path )

Sets the path and name of the file to save cookies to / retrieve cookies from. All cookie data will be stored in this file on a per-domain basis. This is important when cookies need to be stored/restored to maintain the status/session of requests made to the same domains.

This method will automatically set the CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE options.

Arguments

string

$path

The path to a file to save cookies to / retrieve cookies from.

If the file does not exist, the library will attempt to create it and, if it is unable to do so, it will trigger an error.

void delete ( mixed $urls , [ mixed $callback = '' ] )

Performs an HTTP DELETE request to one or more URLs with optional POST data, and executes the callback function specified by the $callback argument for each and every request, as soon as the request finishes.

This method will automatically set the following options:

CURLINFO_HEADER_OUT = TRUE
CURLOPT_CUSTOMREQUEST = DELETE
CURLOPT_HEADER = TRUE
CURLOPT_NOBODY = FALSE
CURLOPT_POST = FALSE
CURLOPT_POSTFIELDS = the POST data

...and will unset the following options:

CURLOPT_HTTPGET
CURLOPT_FILE

For PHP < 5.1.2 CURLOPT_BINARYTRANSFER is also unset.
For newer versions of PHP this option is not used as it has no effect and has been deprecated starting with PHP 8.4.

Multiple requests are processed asynchronously, in parallel, and the callback function is called for each and every request as soon as the request finishes. The number of parallel requests to be constantly processed, at all times, is set through the threads property. See also pause_interval.

Because requests are done asynchronously, when initiating multiple requests at once, these may not finish in the order in which they were initiated!

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// do a DELETE request
// and execute a callback function for each request, as soon as it finishes
$curl->delete(array(
    'https://www.somewebsite.com' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    ),
// the callback function receives as argument an object with 4 properties
// (info, headers, body and response)
), function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

Arguments

mixed

$urls

URL(s) to send the request(s) to.

Read full description of the argument at the post method.

mixed

$callback

Optional Callback function to be called as soon as the request finishes.

Read full description of the argument at the get method.

Tags

since:

1.3.3

void download ( mixed $urls , string $path , [ mixed $callback = '' ] )

Downloads one or more files from one or more URLs, saves the downloaded files to the path specified by the $path argument, and executes the callback function specified by the $callback argument for each and every request, as soon as the request finishes.

If the path you are downloading from refers to a file, the file's original name will be preserved but, if you are downloading a file generated by a script (e.g. https://foo.com/bar.php?w=1200&h=800), the downloaded file's name will be randomly generated. The downloaded file's name is available in the downloaded_filename entry of the result's info property - see the example below.

If you are downloading multiple files with the same name the later ones will overwrite the previous ones.
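
The two notes above can be illustrated with plain PHP. The sketch below is not the library's own naming logic; guess_filename is a hypothetical helper that only shows what a file name taken from a URL's path looks like, and the URLs are placeholders.

```php
<?php
// Illustration only: NOT the library's own logic.
// guess_filename() is a hypothetical helper that takes the last
// segment of a URL's path, ignoring the query string.
function guess_filename(string $url): string
{
    $path = parse_url($url, PHP_URL_PATH);

    return is_string($path) ? basename($path) : '';
}

// the URL points to a file: the path ends with the file's name
echo guess_filename('https://www.somewebsite.com/images/alpha.jpg'), "\n"; // prints alpha.jpg

// the URL points to a script: the path names the script,
// not the file that the script generates
echo guess_filename('https://foo.com/bar.php?w=1200&h=800'), "\n"; // prints bar.php

// two different hosts, same file name: both map to the same target name
$first  = guess_filename('https://a.example.com/logo.png');
$second = guess_filename('https://b.example.com/logo.png');
var_dump($first === $second); // prints bool(true)
```

For script-generated files the path alone is not a reliable source for a file name, and files sharing a name end up at the same destination, which is why the later download replaces the earlier one.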

Downloads are streamed (bytes downloaded are directly written to disk) removing the unnecessary strain from your server of reading files into memory first, and then writing them to disk.

This method will automatically set the following options:

CURLINFO_HEADER_OUT = TRUE
CURLOPT_HEADER = TRUE
CURLOPT_FILE

...and will unset the following options:

CURLOPT_CUSTOMREQUEST
CURLOPT_HTTPGET
CURLOPT_NOBODY
CURLOPT_POST
CURLOPT_POSTFIELDS

For PHP < 5.1.2 CURLOPT_BINARYTRANSFER will also be set to TRUE.
For newer versions of PHP this option is not used as it has no effect and has been deprecated starting with PHP 8.4.

Multiple requests are processed asynchronously, in parallel, and the callback function is called for each and every request as soon as the request finishes. The number of parallel requests to be constantly processed, at all times, is set through the threads property. See also pause_interval.

Because requests are done asynchronously, when initiating multiple requests at once, these may not finish in the order in which they were initiated!

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// download 2 images from 2 different websites
// and execute a callback function for each request, as soon as it finishes
$curl->download(array(
    'https://www.somewebsite.com/images/alpha.jpg',
    'https://www.otherwebsite.com/images/omega.jpg',
// the callback function receives as argument an object with 4 properties
// (info, headers, body and response)
), 'destination/path/', function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
            // get the downloaded file's path
            $result->info['downloaded_filename'];
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

Arguments

mixed

$urls

URL(s) to send the request(s) to.

Can be any of the following:

// a string
$curl->download('https://address.com/file.foo', 'path', 'callback');
// an array, for multiple requests
$curl->download(array(
    'https://address1.com/file1.foo',
    'https://address2.com/file2.bar',
), 'path', 'callback');

If custom options need to be set for each request, use the following format:

// this can also be an array of arrays, for multiple requests
$curl->download(array(
    // mandatory!
    'url' => 'https://address.com/file.foo',
    // optional, used to set any cURL option
    // in the same way you would set with the option() method
    'options' => array(
        CURLOPT_USERAGENT => 'Dummy scraper 1.0',
    ),
), 'path', 'callback');

string

$path

The path where the downloaded file(s) will be saved.

If the path does not point to a directory, or the directory is not writable, the library will trigger an error.

mixed

$callback

Optional Callback function to be called as soon as the request finishes.

Read full description of the argument at the get method.

void ftp_download ( mixed $urls , string $path , [ string $username = '' ] , [ string $password = '' ] , [ mixed $callback = '' ] )

Works exactly like the download method but downloads are made from an FTP server.

Downloads one or more files from an FTP server, to which the connection is made using the given $username and $password arguments, saves the downloaded files (with their original name) to the path specified by the $path argument, and executes the callback function specified by the $callback argument for each and every request, as soon as the request finishes.

Downloads are streamed (bytes downloaded are directly written to disk) removing the unnecessary strain from your server of reading files into memory first, and then writing them to disk.

This method will automatically set the following options:

CURLINFO_HEADER_OUT = TRUE
CURLOPT_HEADER = TRUE
CURLOPT_FILE

...and will unset the following options:

CURLOPT_CUSTOMREQUEST
CURLOPT_HTTPGET
CURLOPT_NOBODY
CURLOPT_POST
CURLOPT_POSTFIELDS

For PHP < 5.1.2 CURLOPT_BINARYTRANSFER will also be set to TRUE.
For newer versions of PHP this option is not used as it has no effect and has been deprecated starting with PHP 8.4.

If you are downloading multiple files with the same name the later ones will overwrite the previous ones.

Multiple requests are processed asynchronously, in parallel, and the callback function is called for each and every request as soon as the request finishes. The number of parallel requests to be constantly processed, at all times, is set through the threads property. See also pause_interval.

Because requests are done asynchronously, when initiating multiple requests at once, these may not finish in the order in which they were initiated!

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// connect to the FTP server using the given credentials, download a file to a given location
// and execute a callback function for each request, as soon as it finishes
$curl->ftp_download(
    'ftp://somefile.ext',
    'destination/path',
    'username',
    'password',
    // the callback function receives as argument an object with 4 properties
    // (info, headers, body and response)
    function($result) {
        // everything went well at cURL level
        if ($result->response[1] == CURLE_OK) {
            // if server responded with code 200 (meaning that everything went well)
            // see https://httpstatus.es/ for a list of possible response codes
            if ($result->info['http_code'] == 200) {
                // see all the returned data
                print_r('<pre>');
                print_r($result);
            // show the server's response code
            } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
        // something went wrong
        // ($result still contains all data that could be gathered)
        } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
    }
);

Arguments

mixed

$urls

URL(s) to send the request(s) to.

Can be any of the following:

// a string
$curl->ftp_download(
    'ftp://address.com/file.foo',
    'destination/path',
    'username',
    'password',
    'callback'
);
// an array, for multiple requests
$curl->ftp_download(array(
    'ftp://address1.com/file1.foo',
    'ftp://address2.com/file2.bar',
), 'destination/path', 'username', 'password', 'callback');

If custom options need to be set for each request, use the following format:

// this can also be an array of arrays, for multiple requests
$curl->ftp_download(array(
    // mandatory!
    'url' => 'ftp://address.com/file.foo',
    // optional, used to set any cURL option
    // in the same way you would set with the option() method
    'options' => array(
        CURLOPT_USERAGENT => 'Dummy scraper 1.0',
    ),
), 'destination/path', 'username', 'password', 'callback');

Note that in all the examples above, you are downloading files from a single FTP server. To make requests to multiple FTP servers, set the CURLOPT_USERPWD option yourself. The $username and $password arguments will be overwritten by the values set like this.

$curl->ftp_download(array(
    array(
        'url' => 'ftp://address1.com/file1.foo',
        'options' => array(
            CURLOPT_USERPWD => 'username1:password1',
        ),
    ),
    array(
        'url' => 'ftp://address2.com/file2.foo',
        'options' => array(
            CURLOPT_USERPWD => 'username2:password2',
        ),
    ),
), 'destination/path', '', '', 'callback');

string

$path

The path where the downloaded file(s) will be saved.

If the path does not point to a directory, or the directory is not writable, the library will trigger an error.

string

$username

Optional The username to be used to connect to the FTP server (if required).

string

$password

Optional The password to be used to connect to the FTP server (if required).

mixed

$callback

Optional Callback function to be called as soon as the request finishes.

Read full description of the argument at the get method.

void get ( mixed $urls , [ mixed $callback = '' ] )

Performs an HTTP GET request to one or more URLs and executes the callback function specified by the $callback argument for each and every request, as soon as the request finishes.

This method will automatically set the following options:

CURLINFO_HEADER_OUT = TRUE
CURLOPT_HEADER = TRUE
CURLOPT_HTTPGET = TRUE
CURLOPT_NOBODY = FALSE

...and will unset the following options:

CURLOPT_BINARYTRANSFER
CURLOPT_CUSTOMREQUEST
CURLOPT_FILE
CURLOPT_POST
CURLOPT_POSTFIELDS

Multiple requests are processed asynchronously, in parallel, and the callback function is called for each and every request as soon as the request finishes. The number of parallel requests to be constantly processed, at all times, is set through the threads property. See also pause_interval.

Because requests are done asynchronously, when initiating multiple requests at once, these may not finish in the order in which they were initiated!

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// cache results in the "cache" folder and for 3600 seconds (one hour)
$curl->cache('cache', 3600);
// let's fetch the RSS feeds of some popular websites
// execute the callback function for each request, as soon as it finishes
$curl->get(array(
    'https://alistapart.com/main/feed/',
    'https://www.smashingmagazine.com/feed/',
    'https://code.tutsplus.com/posts.atom',
// the callback function receives as argument an object with 4 properties
// (info, headers, body and response)
), function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

Arguments

mixed

$urls

URL(s) to send the request(s) to.

Can be any of the following:

// a string
$curl->get('https://address.com/', 'callback');
// an array, for multiple requests
$curl->get(array(
    'https://address1.com/',
    'https://address2.com/',
), 'callback');

If custom options need to be set for each request, use the following format:

// this can also be an array of arrays, for multiple requests
$curl->get(array(
    // mandatory!
    'url' => 'https://address.com/',
    // optional, used to set any cURL option
    // in the same way you would set with the option() method
    'options' => array(
        CURLOPT_USERAGENT => 'Dummy scraper 1.0',
    ),
    // optional, you can pass arguments this way also
    'data' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    ),
), 'callback');

mixed

$callback

Optional Callback function to be called as soon as the request finishes.

May be given as a string representing the name of an existing function, or as an anonymous function.

The callback function receives as first argument an object with 4 properties as described below. Any extra arguments passed to the get method will be passed as extra arguments to the callback function:

info - an associative array containing information about the request that just finished, as returned by PHP's curl_getinfo() function

headers - an associative array with 2 items:

- last_request - an array with a single entry containing the request headers generated by the last request; therefore, when redirects are involved, only information from the last request will be available. If explicitly disabled by setting CURLINFO_HEADER_OUT to 0 or FALSE through the option method, this will be an empty string

- responses - an empty string, as it is not available for this method

body - the response of the request (the content of the page at the URL).

Unless disabled via the constructor, all applicable characters will be converted to HTML entities via PHP's htmlentities function, so remember to use PHP's html_entity_decode function in case you need the decoded values.

If the body is explicitly disabled by setting CURLOPT_NOBODY to 1 or TRUE through the option method, this will be an empty string.

response - the response given by the cURL library as an array with 2 items:

- the textual representation of the result's code (e.g. CURLE_OK)

- the result's code (e.g. 0)

If the callback function returns FALSE while caching is enabled, the library will not cache the respective request, making it easy to retry failed requests without having to clear all cache.
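
As noted for the body property, the response is run through htmlentities unless this was disabled via the constructor. The following sketch uses a made-up response body and PHP's default flags (the flags used by the library itself are not stated here, so treat this as an approximation) to show what the encoded value looks like and how html_entity_decode restores it:

```php
<?php
// a made-up response body, standing in for $result->body before encoding
$raw = '<a href="/feed">Tom & Jerry</a>';

// what a callback would receive when htmlentities is applied
$encoded = htmlentities($raw);
echo $encoded, "\n";
// prints &lt;a href=&quot;/feed&quot;&gt;Tom &amp; Jerry&lt;/a&gt;

// decode it when the original markup is needed (e.g. for parsing)
$decoded = html_entity_decode($encoded);
var_dump($decoded === $raw); // prints bool(true)
```

Decoding is only needed when the markup itself is processed; when the body is merely echoed into an HTML page for inspection, the encoded form is the safer one to output.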

void header ( mixed $urls , [ mixed $callback = '' ] )

Works exactly like the get method, the only difference being that this method will only return the headers, without body.

This method will automatically set the following options:

CURLINFO_HEADER_OUT = TRUE
CURLOPT_HEADER = TRUE
CURLOPT_HTTPGET = TRUE
CURLOPT_NOBODY = TRUE

...and will unset the following options:

CURLOPT_BINARYTRANSFER
CURLOPT_CUSTOMREQUEST
CURLOPT_FILE
CURLOPT_POST
CURLOPT_POSTFIELDS

Multiple requests are processed asynchronously, in parallel, and the callback function is called for each and every request as soon as the request finishes. The number of parallel requests to be constantly processed, at all times, is set through the threads property. See also pause_interval.

Because requests are done asynchronously, when initiating multiple requests at once, these may not finish in the order in which they were initiated!

// instantiate the class
$curl = new Zebra_cURL();
// process given URLs
// and execute a callback function for each request, as soon as it finishes
// the callback function receives as argument an object with 4 properties
// (info, headers, body and response)
$curl->header('https://www.somewebsite.com', function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

Arguments

mixed

$urls

URL(s) to send the request(s) to.

Read full description of the argument at the get method.

mixed

$callback

Optional Callback function to be called as soon as the request finishes.

Read full description of the argument at the get method.

void http_authentication ( [ string $username = '' ] , [ string $password = '' ] , [ integer $type = CURLAUTH_ANY ] )

Use this method to make requests to pages that require prior HTTP authentication.

// instantiate the class
$curl = new Zebra_cURL();
// prepare user name and password
$curl->http_authentication('username', 'password');
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// get content from a page that requires prior HTTP authentication
// the callback function receives as argument an object with 4 properties
// (info, headers, body and response)
$curl->get('https://www.some-page-requiring-prior-http-authentication.com', function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

If you have to unset previously set values, call the method without arguments:

$curl->http_authentication();

Arguments

string $username User name to be used for authentication.

string $password Password to be used for authentication.

integer

$type

Optional The HTTP authentication method(s) to use. The options are:

CURLAUTH_BASIC
CURLAUTH_DIGEST
CURLAUTH_GSSNEGOTIATE
CURLAUTH_NTLM
CURLAUTH_ANY
CURLAUTH_ANYSAFE

The bitwise | (or) operator can be used to combine more than one method. If this is done, cURL will poll the server to see what methods it supports and pick the best one.

CURLAUTH_ANY is an alias for
CURLAUTH_BASIC | CURLAUTH_DIGEST | CURLAUTH_GSSNEGOTIATE | CURLAUTH_NTLM

CURLAUTH_ANYSAFE is an alias for
CURLAUTH_DIGEST | CURLAUTH_GSSNEGOTIATE | CURLAUTH_NTLM

Default is CURLAUTH_ANY

void option ( mixed $option , [ mixed $value = '' ] )

Allows the setting of one or more cURL options.

// instantiate the class
$curl = new Zebra_cURL();
// setting a single option
$curl->option(CURLOPT_CONNECTTIMEOUT, 10);
// setting multiple options at once
$curl->option(array(
    CURLOPT_TIMEOUT => 10,
    CURLOPT_CONNECTTIMEOUT => 10,
));
// requests are made here...

Arguments

mixed

$option

A single option for which to set a value, or an associative array in the form of option => value.

Setting a value to null will unset that option.

mixed

$value

Optional If the $option argument is not an array, then this argument represents the value to be set for the respective option. If the $option argument is an array, the value of this argument will be ignored.

Setting a value to null will unset that option.
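
The rule that null unsets an option can be pictured with a small sketch. This is not the library's implementation: apply_options is a hypothetical helper, and the option names are plain strings so that the example runs without the cURL extension.

```php
<?php
// Illustration only: NOT the library's implementation.
// apply_options() is a hypothetical helper; option names are plain
// strings here so the example does not depend on the cURL extension.
function apply_options(array $current, array $changes): array
{
    foreach ($changes as $option => $value) {
        if ($value === null) {
            // a null value removes the option altogether
            unset($current[$option]);
        } else {
            // any other value sets or overwrites the option
            $current[$option] = $value;
        }
    }

    return $current;
}

$options = array(
    'CURLOPT_TIMEOUT'        => 10,
    'CURLOPT_CONNECTTIMEOUT' => 10,
);

// overwrite one option and unset the other
$options = apply_options($options, array(
    'CURLOPT_TIMEOUT'        => 30,
    'CURLOPT_CONNECTTIMEOUT' => null,
));

print_r($options); // only CURLOPT_TIMEOUT (now 30) remains
```

With the real method, the equivalent calls would be $curl->option(CURLOPT_TIMEOUT, 30) and $curl->option(CURLOPT_CONNECTTIMEOUT, null).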

void patch ( mixed $urls , [ mixed $callback = '' ] )

Performs an HTTP PATCH request to one or more URLs and executes the callback function specified by the $callback argument for each and every request, as soon as the request finishes.

This method will automatically set the following options:

CURLINFO_HEADER_OUT = TRUE
CURLOPT_CUSTOMREQUEST = PATCH
CURLOPT_HEADER = TRUE
CURLOPT_NOBODY = FALSE
CURLOPT_POST = FALSE
CURLOPT_POSTFIELDS = the POST data

...and will unset the following options:

CURLOPT_BINARYTRANSFER
CURLOPT_HTTPGET
CURLOPT_FILE

Multiple requests are processed asynchronously, in parallel, and the callback function is called for each and every request as soon as the request finishes. The number of parallel requests to be constantly processed, at all times, is set through the threads property. See also pause_interval.

Because requests are done asynchronously, when initiating multiple requests at once, these may not finish in the order in which they were initiated!

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// do a PATCH request and execute a callback function for each request, as soon as it finishes
$curl->patch(array(
    'https://www.somewebsite.com' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    ),
// the callback function receives as argument an object with 4 properties
// (info, headers, body and response)
), function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

Arguments

mixed

$urls

URL(s) to send the request(s) to.

Read full description of the argument at the post method.

mixed

$callback

Optional Callback function to be called as soon as the request finishes.

Read full description of the argument at the get method.

Tags

since:

1.6.0

void post ( mixed $urls , [ mixed $callback = '' ] )

Performs an HTTP POST request to one or more URLs and executes the callback function specified by the $callback argument for each and every request, as soon as the request finishes.

This method will automatically set the following options:

CURLINFO_HEADER_OUT = TRUE
CURLOPT_HEADER = TRUE
CURLOPT_NOBODY = FALSE
CURLOPT_POST = TRUE
CURLOPT_POSTFIELDS = the POST data

...and will unset the following options:

CURLOPT_BINARYTRANSFER
CURLOPT_CUSTOMREQUEST
CURLOPT_HTTPGET
CURLOPT_FILE

Multiple requests are processed asynchronously, in parallel, and the callback function is called for each and every request as soon as the request finishes. The number of parallel requests to be constantly processed, at all times, is set through the threads property. See also pause_interval.

Because requests are done asynchronously, when initiating multiple requests at once, these may not finish in the order in which they were initiated!

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// do a POST request and execute a callback function for each request, as soon as it finishes
$curl->post(array(
    'https://www.somewebsite.com' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    ),
// the callback function receives as argument an object with 4 properties
// (info, header, body and response)
), function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});
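
Because requests may finish in a different order than they were initiated, it can help to collect the results and put them back in request order afterwards. The sketch below shows one way to do that. The function results_in_request_order is hypothetical, and it assumes that each result exposes the requested URL as $result->info['url'], as data returned by PHP's curl_getinfo would. After a redirect that entry may differ from the URL originally requested.

```php
<?php
// Hypothetical helper (not part of Zebra_cURL): returns the collected
// results re-ordered to match the order of $urls.
// Assumes each result carries its URL in $result->info['url'].
function results_in_request_order(array $urls, array $results)
{
    // index the results by URL, whatever order they arrived in
    $by_url = array();
    foreach ($results as $result) {
        $by_url[$result->info['url']] = $result;
    }
    // walk the original list; a URL without a result yields null
    $ordered = array();
    foreach ($urls as $url) {
        $ordered[] = isset($by_url[$url]) ? $by_url[$url] : null;
    }
    return $ordered;
}

// stand-in results that "finished" in reverse order
$first = new stdClass();
$first->info = array('url' => 'https://address1.com');
$second = new stdClass();
$second->info = array('url' => 'https://address2.com');
$ordered = results_in_request_order(
    array('https://address1.com', 'https://address2.com'),
    array($second, $first)
);
echo $ordered[0]->info['url'], "\n";
```

In real use the callback would append each $result to an array, and the helper would be called once all requests have finished.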

When uploading a file, we need to prefix the file name with @:

$curl->post(array(
    'https://www.somewebsite.com' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
        'data_3' => '@absolute/path/to/file.ext',
    ),
), 'mycallback');

Arguments

mixed

$urls

URL(s) to send the request(s) to.

Can be any of the following:

// a string (no POST values sent)
$curl->post('https://address.com');
// an array, for multiple requests (no POST values sent)
$curl->post(array(
    'https://address1.com',
    'https://address2.com',
));
// an associative array in the form of Array(url => post-data),
// where "post-data" is an associative array in the form of
// Array(name => value) and represents the value(s) to be set for
// CURLOPT_POSTFIELDS;
// "post-data" can also be an arbitrary string - useful if you
// want to send raw data (like a JSON)
$curl->post(array('https://address.com' => array(
    'data_1' => 'value 1',
    'data_2' => 'value 2',
)));
// just like above but an *array* of associative arrays, for
// multiple requests
$curl->post(array(
    array('https://address1.com' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    )),
    array('https://address2.com' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    )),
));

If custom options need to be set for each request, use the following format:

// this can also be an array of arrays, for multiple requests
$curl->post(array(
    // mandatory!
    'url' => 'https://address.com',
    // optional, used to set any cURL option
    // in the same way you would set with the options() method
    'options' => array(
        CURLOPT_USERAGENT => 'Dummy scraper 1.0',
    ),
    // optional, if you need to pass any arguments
    // (equivalent of setting CURLOPT_POSTFIELDS using
    // the "options" entry above)
    'data' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    ),
));
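
When the same options and data apply to several URLs, the array-of-arrays form of this format can be generated rather than written by hand. The following is a sketch only: build_requests is a hypothetical helper, not a library function, and it produces nothing beyond the url, options and data entries described above.

```php
<?php
// Hypothetical helper (not part of Zebra_cURL): builds one
// array('url' => ..., 'options' => ..., 'data' => ...) entry per URL.
function build_requests(array $urls, array $data = array(), array $options = array())
{
    $requests = array();
    foreach ($urls as $url) {
        // 'url' is the only mandatory entry
        $request = array('url' => $url);
        // 'options' and 'data' are added only when non-empty
        if ($options) {
            $request['options'] = $options;
        }
        if ($data) {
            $request['data'] = $data;
        }
        $requests[] = $request;
    }
    return $requests;
}

$requests = build_requests(
    array('https://address1.com', 'https://address2.com'),
    array('data_1' => 'value 1')
);
print_r($requests);
// the result could then be passed on, for example:
// $curl->post($requests, 'mycallback');
```

Options such as CURLOPT_USERAGENT would go in the third argument, keyed by the cURL constant exactly as in the example above.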

To post a file, prepend the filename with @ and use the full server path.

For PHP 5.5+ files are uploaded using CURLFile and CURLOPT_SAFE_UPLOAD will be set to TRUE.

For lower PHP versions, files will be uploaded the old way and the file's mime type should be explicitly specified by following the filename with the type in the format ';type=mimetype', as cURL will often send the wrong mime type otherwise.

$curl->post(array('https://address.com' => array(
    'data_1' => 'value 1',
    'data_2' => 'value 2',
    'data_3' => '@absolute/path/to/file.ext',
)));

If any data is sent, the "Content-Type" header will be set to "multipart/form-data".
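
The @ prefix and the optional ';type=mimetype' suffix described above are plain string conventions, so the field value can be assembled by a small function. This is a sketch under that assumption only: upload_field is a hypothetical name, and the path and mime type shown are placeholders.

```php
<?php
// Hypothetical helper (not part of Zebra_cURL): builds the value of a
// file upload field in the '@path' or '@path;type=mimetype' format.
function upload_field($path, $mime_type = null)
{
    // the @ prefix marks the value as a file to upload
    $field = '@' . $path;
    // append the mime type only when one was given
    if ($mime_type !== null && $mime_type !== '') {
        $field .= ';type=' . $mime_type;
    }
    return $field;
}

echo upload_field('/var/www/uploads/report.pdf', 'application/pdf'), "\n";
// prints: @/var/www/uploads/report.pdf;type=application/pdf
```

The returned string would be used as the value of an entry in the post data, in place of the literal '@absolute/path/to/file.ext' shown above.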

mixed

$callback

Optional. Callback function to be called as soon as the request finishes.

Read full description of the argument at the get method.

void proxy ( string $proxy , [ integer $port = 80 ] , [ string $username = '' ] , [ string $password = '' ] )

Instructs the library to tunnel all requests through a proxy server.

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// connect to a proxy server
// (that's a random one I got from https://www.proxynova.com/proxy-server-list/)
$curl->proxy('91.221.252.18', 8080);
// fetch a page and execute a callback function when done
// the callback function receives as argument an object with 4 properties
// (info, header, body and response)
$curl->get('https://www.somewebsite.com/', function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

Arguments

string	$proxy	The HTTP proxy to tunnel requests through. Can be a URL or an IP address. This option can also be set using the option method and setting `CURLOPT_PROXY` to the desired value. Setting this argument to `FALSE` will unset all the proxy-related options.
integer	$port	Optional. The port number of the proxy to connect to. Default is `80`. This option can also be set using the option method and setting `CURLOPT_PROXYPORT` to the desired value.
string	$username	Optional. The username to be used for the connection to the proxy (if required by the proxy). Default is `""` (an empty string). The username and the password can also be set using the option method and setting `CURLOPT_PROXYUSERPWD` to the desired value formatted like `[username]:[password]`.
string	$password	Optional. The password to be used for the connection to the proxy (if required by the proxy). Default is `""` (an empty string). The username and the password can also be set using the option method and setting `CURLOPT_PROXYUSERPWD` to the desired value formatted like `[username]:[password]`.

void put ( mixed $urls , [ mixed $callback = '' ] )

Performs an HTTP PUT request to one or more URLs and executes the callback function specified by the $callback argument for each and every request, as soon as the request finishes.

This method will automatically set the following options:

CURLINFO_HEADER_OUT - TRUE
CURLOPT_CUSTOMREQUEST - PUT
CURLOPT_HEADER - TRUE
CURLOPT_NOBODY - FALSE
CURLOPT_POST - FALSE
CURLOPT_POSTFIELDS - the POST data

...and will unset the following options:

CURLOPT_BINARYTRANSFER
CURLOPT_HTTPGET
CURLOPT_FILE

Multiple requests are processed asynchronously, in parallel, and the callback function is called for each and every request as soon as the request finishes. The number of parallel requests to be constantly processed, at all times, is set through the threads property. See also pause_interval.

Because requests are done asynchronously, when initiating multiple requests at once, these may not finish in the order in which they were initiated!

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// do a PUT request and execute a callback function for each request, as soon as it finishes
$curl->put(array(
    'https://www.somewebsite.com' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    ),
// the callback function receives as argument an object with 4 properties
// (info, header, body and response)
), function($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
});

Arguments

mixed

$urls

URL(s) to send the request(s) to.

Read full description of the argument at the post method.

mixed

$callback

Optional. Callback function to be called as soon as the request finishes.

Read full description of the argument at the get method.

Tags

since:

1.3.3

void queue ()

Instructs the library to queue requests rather than processing them right away. Useful for grouping different types of requests and treating them as a single request.

Until the start method is called, all calls to the delete, download, ftp_download, get, header, post and put methods will queue up rather than being executed right away. Once the start method is called, all queued requests will be processed, and the values of the threads and pause_interval properties will still apply.

// the callback function to be executed for each and every
// request, as soon as the request finishes
// the callback function receives as argument an object with 4 properties
// (info, header, body and response)
function mycallback($result) {
    // everything went well at cURL level
    if ($result->response[1] == CURLE_OK) {
        // if server responded with code 200 (meaning that everything went well)
        // see https://httpstatus.es/ for a list of possible response codes
        if ($result->info['http_code'] == 200) {
            // see all the returned data
            print_r('<pre>');
            print_r($result);
        // show the server's response code
        } else trigger_error('Server responded with code ' . $result->info['http_code'], E_USER_ERROR);
    // something went wrong
    // ($result still contains all data that could be gathered)
    } else trigger_error('cURL responded with: ' . $result->response[0], E_USER_ERROR);
}
// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// queue requests - useful for grouping different types of requests
// in this example, when the "start" method is called, we'll execute
// the "get" and the "post" requests asynchronously
$curl->queue();
// do a POST and execute the callback function when done
$curl->post(array(
    'https://www.somewebsite.com' => array(
        'data_1' => 'value 1',
        'data_2' => 'value 2',
    ),
), 'mycallback');
// fetch the RSS feeds of some popular websites
// and execute the callback function for each request, as soon as it finishes
$curl->get(array(
    'https://alistapart.com/main/feed/',
    'https://www.smashingmagazine.com/feed/',
    'https://code.tutsplus.com/posts.atom',
), 'mycallback');
// execute queued requests
$curl->start();
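
Since threads and pause_interval still apply to queued requests, a rough estimate of how long the pauses alone will take can be useful when sizing a queue. The arithmetic below is a sketch, not library behaviour: minimum_pause_seconds is a hypothetical function, and it assumes the library pauses only between batches and not after the last one.

```php
<?php
// Hypothetical helper (not part of Zebra_cURL): estimates the total time
// spent pausing, ignoring the time taken by the requests themselves.
// Assumption: a pause happens between batches, not after the last batch.
function minimum_pause_seconds($request_count, $threads, $pause_interval)
{
    // with pause_interval at 0 there is no batching and so no pausing
    if ($request_count <= 0 || $threads <= 0 || $pause_interval <= 0) {
        return 0;
    }
    // number of batches of at most $threads requests each
    $batches = (int) ceil($request_count / $threads);
    return ($batches - 1) * $pause_interval;
}

// 95 requests, 30 at a time, 2 seconds between batches:
// 4 batches, so 3 pauses of 2 seconds
echo minimum_pause_seconds(95, 30, 2), "\n";
// prints: 6
```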

Tags

since:

1.3.0

mixed scrap ( mixed $url , [ boolean $body_only = true ] )

Same as the scrape method, under its original, incorrect name.

Kept for backward compatibility purposes.

Arguments

mixed

$url

A URL to fetch.

Note that this method only supports a single URL. For processing multiple URLs at once, see the get method.

boolean

$body_only

Optional. When set to TRUE, will instruct the method to return only the page's content, without info, headers, responses, etc.

When set to FALSE, will instruct the method to return everything it can about the scraped page, as an object with properties as described for the $callback argument of the get method.

Default is TRUE.

Tags

return: Returns the scraped page's content, when $body_only is set to TRUE, or an object with properties as described for the $callback argument of the get method.

mixed scrape ( mixed $url , [ boolean $body_only = true ] )

A shorthand for making a single get request without the need of a callback function.

// instantiate the class
$curl = new Zebra_cURL();
// if making requests over HTTPS we need to load a CA bundle
// so we don't get CURLE_SSL_CACERT response from cURL
// you can get this bundle from https://curl.se/docs/caextract.html
$curl->ssl(true, 2, 'path/to/cacert.pem');
// get page's content only
$content = $curl->scrape('https://www.somewebsite.com/');
// print that to screen
echo $content;
// also get extra information about the page
$content = $curl->scrape('https://www.somewebsite.com/', false);
// print that to screen
print_r('<pre>');
print_r($content);

Prior to 1.6.3 this method was incorrectly named scrap. That name is still available for backward compatibility, but its use is highly discouraged as it will be removed in the future.

Arguments

mixed

$url

A URL to fetch.

Note that this method only supports a single URL. For processing multiple URLs at once, see the get method.

boolean

$body_only

Optional. When set to TRUE, will instruct the method to return only the page's content, without info, headers, responses, etc.

When set to FALSE, will instruct the method to return everything it can about the scraped page, as an object with properties as described for the $callback argument of the get method.

Default is TRUE.

Tags

return:	Returns the scraped page's content, when $body_only is set to `TRUE`, or an object with properties as described for the $callback argument of the get method.
since:	1.3.3
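
Because the return value is either a string or an object depending on $body_only, code that accepts both can normalize it first. The sketch below shows one way to do that. The function body_of is hypothetical, and returning an empty string for any other value is an assumption made here, not documented library behaviour.

```php
<?php
// Hypothetical helper (not part of Zebra_cURL): returns the page content
// whether it is given the plain string or the full result object.
function body_of($scraped)
{
    // full result object: the content is in its "body" property
    if (is_object($scraped) && isset($scraped->body)) {
        return $scraped->body;
    }
    // content-only result, or an empty string for anything else (assumption)
    return is_string($scraped) ? $scraped : '';
}

// stand-in for the object returned when $body_only is FALSE
$full = new stdClass();
$full->body = '<html>example</html>';
echo body_of($full), "\n";
echo body_of('<html>example</html>'), "\n";
```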

void ssl ( [ boolean $verify_peer = true ] , [ integer $verify_host = 2 ] , [ mixed $file = false ] , [ mixed $path = false ] )

Requests made over HTTPS usually require additional configuration, depending on the server. Most of the time the defaults set by the library will get you through but, if the defaults are not working, you can set specific options using this method.

// instantiate the class
$curl = new Zebra_cURL();
// instruct the library to skip verifying peer's SSL certificate
// (ignored if request is not made through HTTPS)
$curl->ssl(false);
// fetch a page
$curl->get('https://www.somewebsite.com/', function($result) {
    print_r("<pre>");
    print_r($result);
});

Arguments

boolean	$verify_peer	Optional. Should the peer's certificate be verified by cURL? Default is `TRUE`. This option can also be set using the option method and setting `CURLOPT_SSL_VERIFYPEER` to the desired value. When communicating over HTTPS (or any other protocol that uses TLS), cURL will, by default, verify that the server's certificate is signed by a trusted Certificate Authority (CA), and this check will most likely fail. When it does fail, instead of disabling the check, it is better to download the CA bundle from Mozilla and reference it through the $file argument below.
integer	$verify_host	Optional. Specifies whether to check the existence of a common name in the SSL peer certificate and that it matches the provided hostname. Use `1` to check the existence of a common name in the SSL peer certificate; use `2` to check the existence of a common name and also verify that it matches the hostname provided. In production environments the value of this option should be kept at `2`. Default is `2`. Support for value `1` was removed in cURL 7.28.1. This option can also be set using the option method and setting `CURLOPT_SSL_VERIFYHOST` to the desired value.
mixed	$file	Optional. An absolute path to a file holding the certificates to verify the peer with. This only makes sense if `CURLOPT_SSL_VERIFYPEER` is set to `TRUE`. Default is `FALSE`. This option can also be set using the option method and setting `CURLOPT_CAINFO` to the desired value.
mixed	$path	Optional. An absolute path to a directory that holds multiple CA certificates. This only makes sense if `CURLOPT_SSL_VERIFYPEER` is set to `TRUE`. Default is `FALSE`. This option can also be set using the option method and setting `CURLOPT_CAPATH` to the desired value.
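
Since $file is expected to be an absolute path, it can be worth resolving and checking the CA bundle location before handing it to this method. The following is a minimal sketch of such a check. The function first_readable_file and the candidate paths are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper (not part of Zebra_cURL): returns the absolute path
// of the first candidate that is an existing, readable file, or FALSE.
function first_readable_file(array $candidates)
{
    foreach ($candidates as $candidate) {
        // realpath() resolves relative paths and returns FALSE if missing
        $path = realpath($candidate);
        if ($path !== false && is_file($path) && is_readable($path)) {
            return $path;
        }
    }
    return false;
}

// candidate locations are placeholders, not paths the library looks in
$bundle = first_readable_file(array(
    'path/to/cacert.pem',
    __DIR__ . '/cacert.pem',
));
var_dump($bundle);
// when a bundle was found it could be passed on, for example:
// $curl->ssl(true, 2, $bundle);
```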

void start ()

Executes queued requests.

See queue method.

Tags

since:

1.3.0

Class: Zebra_cURL

source file: /Zebra_cURL.php
