PHP Classes

Link Checker: Find broken links in a Web site


by Stephen Johns - 7 years ago (2017-08-28)

Find broken links in a Web site


I need a way to find broken links in a Web site.

  • 1 Clarification request
  • 1. by Alekos Psimikakis - 7 years ago (2017-09-25)

    The description of your need is ambiguous. I believe you mean 'dead links'. OK, but how do you want to use this 'finder'?

    If you just need to check if a link exists, google <php check if a link exists> (w/o quotes). There are plenty of examples. Use the following, it's good: "How can I check if a URL exists via PHP? - Stack Overflow" stackoverflow.com/questions/2280394/how-can-i-check-if-a-url-exists-via-php


    5 Recommendations

    PHP Get HTTP Status Code from URL: Access a page and return the HTTP status code

    This class can access a page and return the HTTP status code.

    It sends an HTTP request to a page with a given URL and retrieves the response.

    The class returns the server response status code number so it is possible to determine if the page is available or not.

    by Jason Olson (package author, reputation 110) - 6 years ago (2019-03-05)

    This class can be used to take a given URL and return the HTTP status code of the page, for example 404 for page not found, or 200 for found, or 301 for redirect, etc. It's not certain if you're looking to simply test a database/list of specific URLs or if you're looking to crawl a page/site looking for bad links. If you're looking to crawl it would be helpful to also know if you're looking for internal bad links or external links.
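
    A minimal sketch of the underlying technique, using plain curl (an illustration, not necessarily this package's actual API): request only the headers of a URL and read the status code back out of the handle.

        <?php
        // Request only the headers of a URL and return the HTTP status code.
        // Returns 0 if the server could not be reached at all.
        function httpStatusCode(string $url): int
        {
            $ch = curl_init($url);
            curl_setopt_array($ch, [
                CURLOPT_NOBODY         => true,  // HEAD-style request: no body
                CURLOPT_RETURNTRANSFER => true,
                CURLOPT_FOLLOWLOCATION => false, // report 301/302 instead of following
                CURLOPT_TIMEOUT        => 10,
            ]);
            curl_exec($ch);
            $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
            curl_close($ch);
            return $code;
        }

        echo httpStatusCode('https://example.com/'); // e.g. 200, 404 or 301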


    Very simple page details: Parse and extract Web page information details

    This class can parse and extract Web page information details.

    It can retrieve a Web page from a given URL and parse it to extract details like:

    - Page title
    - Page head and body
    - Meta tags
    - Character set
    - Links expanded to full path
    - Images
    - Page headers from H1 through H6
    - Internal and external links, with a check for whether they are broken
    - Page elements by class or id value

    by zinsou A.A.E.Moïse (package author, reputation 6835) - 7 years ago (2017-09-16)

    You can try this one... it has a static method to check whether any given URL is a broken link, and three other methods to get all broken internal links, all broken external links, or simply all broken links (internal and external) of a given Web page or local file. The package has many other methods to get more details about a given page. A sketch of the idea appears after this thread.

    • 1 Comment
    • 5. by Mutale Mulenga - 4 years ago (2021-01-20) in reply to comment 4 by zinsou A.A.E.Moïse

      Your code has given me a very big leap in my efforts to add services to my clients. Thank you very much.
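
    The function below is a hypothetical illustration, not the package's real API, but it sketches the internal/external split zinsou describes: given the page's host and a list of absolute link URLs, sort the dead ones into the two groups.

        <?php
        // Sort dead links into internal and external, keyed by whether the
        // link's host matches the host of the page being checked.
        function sortBrokenLinks(string $pageHost, array $links): array
        {
            $broken = ['internal' => [], 'external' => []];
            foreach ($links as $url) {
                $headers = @get_headers($url); // false if there was no response
                // Naive check: a redirect's first status line is not 200,
                // so redirects also land in the broken list here.
                if ($headers !== false && strpos($headers[0], ' 200') !== false) {
                    continue;                  // link is alive
                }
                $kind = (parse_url($url, PHP_URL_HOST) === $pageHost)
                    ? 'internal' : 'external';
                $broken[$kind][] = $url;
            }
            return $broken;
        }

        print_r(sortBrokenLinks('example.com',
            ['https://example.com/a', 'https://other.org/b']));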


    PHP Link Checker: Extract and check links on a page

    This class can extract and check links on a page.

    It can retrieve the contents of a page with a given URL and extract the links it contains.

    The class can check if the links the page contains point to valid pages.

    The results are output to a given output stream.

    by Maik Greubel (package author, reputation 185) - 7 years ago (2017-09-16)

    You can try this package; it will check all anchor links on a given site for existence (HTTP status 200).
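
    What Maik describes can be approximated in a few lines of plain PHP (a rough sketch, not this package's code; it assumes the page's links are already absolute URLs):

        <?php
        // Collect every href on a page, then HEAD-request each one and
        // report anything that does not come back with status 200.
        $page = 'https://example.com/';   // assumed page to scan
        preg_match_all('/href="([^"#]+)"/i', file_get_contents($page), $m);

        $ch = curl_init();                // reuse one handle for all checks
        curl_setopt_array($ch, [
            CURLOPT_NOBODY         => true,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT        => 10,
        ]);
        foreach (array_unique($m[1]) as $link) {
            curl_setopt($ch, CURLOPT_URL, $link);
            curl_exec($ch);
            $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
            if ($code !== 200) {
                echo "$link -> $code\n";  // broken, redirected or unreachable
            }
        }
        curl_close($ch);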


    PHP CURL Component: Compose and execute HTTP requests with Curl

    This package can compose and execute HTTP requests with Curl.

    It provides a fluent interface to define several parameters of an HTTP request to be sent to a given URL using the Curl library.

    Currently it provides means to define the request URL, the request method (POST, GET, DELETE, PATCH and PUT), request parameter values, and timeout values.

    Other calls can tell the package to execute the request and retrieve the response.

    by Fernando (reputation 70) - 7 years ago (2017-08-30)

    I do not think there is a package to handle all of that. It is basically sending a request for each link and analyzing the response; use Curl to accomplish that.
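
    What Fernando suggests looks roughly like this in raw curl (a sketch; the URL is made up, and the package's own fluent interface wraps options like these):

        <?php
        // Compose a request (URL, method, timeouts), execute it, and
        // analyze the response code to decide whether the link is good.
        $ch = curl_init();
        curl_setopt_array($ch, [
            CURLOPT_URL            => 'https://example.com/page', // assumed link to test
            CURLOPT_CUSTOMREQUEST  => 'GET',
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_CONNECTTIMEOUT => 5,
            CURLOPT_TIMEOUT        => 10,
        ]);
        $body = curl_exec($ch);
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        echo $code > 0 && $code < 400 ? "link OK ($code)\n" : "broken link ($code)\n";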


    PHP HTTP protocol client: HTTP client to access Web site pages

    Class that implements requests to Web resources using the HTTP protocol.

    It features:

    - May submit HTTP requests with any method, to any page, to any server, connecting to any port.
    - Provides support to setup connection and request arguments from a given URL.
    - May submit requests via a proxy server with support for authentication if necessary.
    - May establish connections via a SOCKS server.
    - Supports HTTP direct access or proxy based authentication mechanisms via SASL class library like HTTP Basic, HTTP Digest or NTLM (Windows or Samba).
    - Supports secure connections (https) via the Curl library with SSL support, or at least PHP 4.3.0 with OpenSSL support, or via a non-SSL HTTP proxy server.
    - Supports accessing secure pages using SSL certificates and private keys via the Curl library.
    - Supports user defined request headers.
    - Supports POST requests with a user defined array of form values.
    - Supports POST requests with user defined request bodies, for instance for making requests to SOAP services.
    - Supports streaming requests that upload large amounts of data of undefined length in small chunks, to avoid exceeding PHP memory limits.
    - Supports requests to sites hosting virtual Web servers.
    - Retrieves the HTTP response headers and body data separately.
    - Supports HTTP 1.1 chunked content encoding.
    - Supports session and persistent cookies.
    - Provides optional handling of redirected pages.
    - Supports defining connection and data transfer timeout values.
    - Can output connection debug information in plain text or formatted as HTML.
    - An add-on class is provided to log in to Yahoo sites and perform actions on behalf of the logged-in user, like exporting the user's address book or sending invitations to a group.

    by Dave Smith (reputation 7620) - 7 years ago (2017-08-28)

    It is a multi-part process. First you need to scrape the website and retrieve the links, which is fairly easy. Then you can use this class to send HTTP requests to the linked sites and capture the responses to check if they return a good response.
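
    The second step Dave describes (checking what a link returns) can also be sketched with PHP's built-in get_headers(), fetching headers only; the URL below is an assumption:

        <?php
        // Ask for headers only and inspect the status line of the response.
        stream_context_set_default(['http' => ['method' => 'HEAD', 'timeout' => 10]]);
        $headers = @get_headers('https://example.com/some-page'); // assumed link
        if ($headers === false) {
            echo "no response at all\n";
        } else {
            echo $headers[0], "\n"; // e.g. "HTTP/1.1 200 OK" or "HTTP/1.1 404 Not Found"
        }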

    • 3 Comments
    • 1. by Melanie Wehowski - 7 years ago (2017-08-30)

      I agree with Dave Smith's recommendation of https://www.phpclasses.org/package/3-PHP-HTTP-client-to-access-Web-site-pages.html for testing the HTTP response code: you can fetch only the headers and check the response code. For the first task, fetching the links, I would recommend:

      • either php.net/manual/de/class.domdocument.php
      • or (handling invalid HTML) simplehtmldom.sourceforge.net/
      • or just a simple REGEX:

        $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
        preg_match_all("/$regexp/siU", $this->content, $matches);

    • 2. by Melanie Wehowski - 7 years ago (2017-08-30) in reply to comment 1 by Melanie Wehowski

      Somehow the regex in my answer was broken by the site, here it is as gist gist.github.com/wehowski/afc811cb4eb727e97e2a75b1b9d3e3c6
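
      Her first suggestion, DOMDocument, avoids the regex-escaping problem entirely; a minimal sketch of extracting all link targets:

        <?php
        // Parse the page as HTML and list every anchor's href attribute.
        $doc = new DOMDocument();
        @$doc->loadHTML(file_get_contents('https://example.com/')); // @ silences warnings on invalid HTML
        foreach ($doc->getElementsByTagName('a') as $anchor) {
            echo $anchor->getAttribute('href'), "\n";
        }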

    • 3. by Axel Hahn - 7 years ago (2017-10-06)

      I agree with this too :-)

      For a single webpage you can fetch it (with curl), then parse it (with DOM or regex) to get all links (can be in tags a, iframe, img, link, style, source, ...) and then check these.

      To check a complete website you need a bit more, because you want to check each link only once and keep all results in a database. This cannot (and should not) be done by a single class.

      I am currently writing my own crawler and resource checker with a Web browser interface, but it is still in beta (and not linked in my phpclasses projects yet).
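
      The deduplication Axel mentions might look like this (a sketch: an in-memory visited set stands in for his database, and only same-host absolute links are followed):

        <?php
        // Breadth-first crawl that checks each URL only once.
        $queue   = ['https://example.com/']; // assumed start page
        $visited = [];

        while ($queue) {
            $url = array_shift($queue);
            if (isset($visited[$url])) {
                continue;                    // already checked this one
            }
            $visited[$url] = true;

            $html = @file_get_contents($url);
            if ($html === false) {
                echo "BROKEN: $url\n";
                continue;
            }
            // Queue same-host absolute links found on this page.
            preg_match_all('/href="(https?:\/\/[^"#]+)"/i', $html, $m);
            foreach ($m[1] as $link) {
                if (parse_url($link, PHP_URL_HOST) === parse_url($url, PHP_URL_HOST)) {
                    $queue[] = $link;
                }
            }
        }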

