PHP Classes

PHP Sitemap XML Parser: Parse a sitemap to get the URLs of the site pages

Ratings: Not yet rated by the users
Unique User Downloads: Total: 139, This week: 1
Download Rankings: All time: 9,215, This week: 560
Version: php-sitemap-crawler 1.0.0
License: GNU General Publi...
PHP version: 5
Categories: XML, PHP 5, SEO
Description 

This class can parse a sitemap to get the URLs of the site pages.

It can take as a parameter the URL of a given sitemap.

The class loads and parses the sitemap XML file to extract the URLs of the site pages and other listed resources.

If the sitemap points to other sitemap files, the class also loads and parses those sitemaps, so it can return all the URLs that are listed.
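
The loading and recursion described above can be sketched as follows. This is a minimal illustration, not the package's actual source: the function name `collectSitemapUrls` and the `$load` callable are hypothetical, and the code assumes PHP 7 or later with the SimpleXML extension enabled. The example URLs and documents are made up so the sketch runs without network access.

```php
<?php

/**
 * Collect page URLs from a sitemap, following nested sitemap index files.
 *
 * $load is a hypothetical callable that returns the XML text for a URL;
 * in a real tool it could wrap file_get_contents() or an HTTP client.
 */
function collectSitemapUrls(string $sitemapUrl, callable $load, array &$seen = []): array
{
    // Guard against sitemap files that reference each other in a loop.
    if (isset($seen[$sitemapUrl])) {
        return [];
    }
    $seen[$sitemapUrl] = true;

    // Collect XML errors internally instead of emitting PHP warnings.
    libxml_use_internal_errors(true);
    $xml = simplexml_load_string((string) $load($sitemapUrl));
    if ($xml === false) {
        return []; // Unparseable or empty document: skip this sitemap.
    }

    $urls = [];
    if ($xml->getName() === 'sitemapindex') {
        // A sitemap index lists other sitemaps in <sitemap><loc> elements.
        foreach ($xml->sitemap as $entry) {
            $child = trim((string) $entry->loc);
            $urls = array_merge($urls, collectSitemapUrls($child, $load, $seen));
        }
    } else {
        // A regular <urlset> lists pages in <url><loc> elements.
        foreach ($xml->url as $entry) {
            $urls[] = trim((string) $entry->loc);
        }
    }

    return $urls;
}

// Usage with in-memory documents (made-up URLs) instead of real HTTP requests.
$ns = 'http://www.sitemaps.org/schemas/sitemap/0.9';
$documents = [
    'https://example.com/sitemap.xml' =>
        "<sitemapindex xmlns=\"$ns\"><sitemap><loc>https://example.com/pages.xml</loc></sitemap></sitemapindex>",
    'https://example.com/pages.xml' =>
        "<urlset xmlns=\"$ns\"><url><loc>https://example.com/</loc></url>"
        . "<url><loc>https://example.com/about</loc></url></urlset>",
];
$load = function ($url) use ($documents) {
    return isset($documents[$url]) ? $documents[$url] : '';
};

foreach (collectSitemapUrls('https://example.com/sitemap.xml', $load) as $url) {
    echo "$url\n";
}
```

Passing the loader as a callable keeps the parsing logic separate from the network layer, which also makes the recursion easy to exercise offline.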

Innovation Award
PHP Programming Innovation award nominee
August 2021
Number 7
Sitemaps are special files that many sites publish to list the URLs of their pages and other relevant site resources.

Sitemaps may be helpful to share the list of site pages with search engines like Google.

Search engines can use a sitemap to get the list of all the site's pages. This possibility may help a site to notify Google faster about newly published pages.

Sitemaps may also be useful for tools that can crawl the site pages to verify any errors.

This package can crawl a sitemap to retrieve the list of all the pages of a site. The package can be helpful to develop tools that need to crawl the site pages.
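
As an illustration of such a tool, the sketch below takes a list of URLs (for example, the array returned by `getUrls()` in the package's example script) and reports the pages that respond with an error. The function names `httpStatus` and `findBrokenPages` are hypothetical and not part of the package; the code assumes PHP 7 or later.

```php
<?php

/**
 * Return the HTTP status code for a URL, or 0 if the request failed.
 * Uses PHP's built-in get_headers(); for redirected URLs this is the
 * status of the first response.
 */
function httpStatus(string $url): int
{
    $headers = @get_headers($url);
    if ($headers === false || !preg_match('#^HTTP/\S+\s+(\d{3})#', $headers[0], $m)) {
        return 0;
    }
    return (int) $m[1];
}

/**
 * Map each failing URL to its status code.
 * $statusOf is a hypothetical callable, so the check can be swapped out
 * or tested without network access.
 */
function findBrokenPages(array $urls, callable $statusOf): array
{
    $broken = [];
    foreach ($urls as $url) {
        $status = $statusOf($url);
        // Treat failed requests (0) and 4xx/5xx responses as broken.
        if ($status === 0 || $status >= 400) {
            $broken[$url] = $status;
        }
    }
    return $broken;
}
```

With the crawler from the example script, a call could look like `findBrokenPages($crawler->getUrls(), 'httpStatus')`, which returns an array keyed by each failing URL.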

Manuel Lemos
Name: Juraj Puchký (available for paid consulting)
Classes: 17 packages
Country: Czech Republic
Age: 41
All time rank: 109511 in Czech Republic
Week rank: 51 (up 1) in Czech Republic
Innovation award nominee: 6x

Example

<?php

require_once __DIR__ . '/../vendor/autoload.php';

// Expect exactly one argument: the URL of the sitemap to crawl.
if ($argc == 2) {
    $crawler = new \BABA\Utilities\SitemapCrawler();
    $crawler->crawleit($argv[1]);

    // Print every URL found in the sitemap, one per line.
    foreach ($crawler->getUrls() as $url) {
        echo "$url\n";
    }
} else {
    echo "crawleit.php <url of your sitemap>\n";
}


Details

PHP Sitemap Crawler

Scrape a list of URLs from a sitemap

The main purpose of this library is to scrape the list of URLs from a sitemap file.

Install

git clone https://github.com/sjurajpuchky/php-sitemap-crawler.git
cd php-sitemap-crawler
composer install

Examples

The samples folder contains some basic usage examples of the library.


# License
GPL-2.0-only

# Authors
Juraj Puchký - BABA Tumise s.r.o. <info@baba.bj>

https://www.seoihned.cz - SEO optimization

https://www.baba.bj - Website development

https://www.webtrace.cz - Custom development of portals and B2B/B2C e-commerce sites (online shops)

# Log
1.0.0 - first release


# Copyright
© 2021 BABA Tumise s.r.o.

Files
File              Role  Description     Access
samples (1 file)
src (1 file)
composer.json     Data  Auxiliary data  Accessible without login
LICENSE           Lic.  License text    Accessible without login
README.md         Doc.  Documentation   Accessible without login

Files / samples
File          Role     Description     Access
crawleit.php  Example  Example script  Accessible without login

Files / src
File                Role   Description
SitemapCrawler.php  Class  Class source
