Dependency of nagios-plugins-nwc-health-8.0-1.1.src.rpm
Name: perl-WWW-RobotRules
Project: openSUSE_13.2
Repository: oss
Title: database of robots.txt-derived permissions
Description:
This module parses _/robots.txt_ files as specified in "A Standard for
Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>.
Webmasters can use the _/robots.txt_ file to forbid conforming robots
from accessing parts of their web site.
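For instance, a _/robots.txt_ file like the following (an illustrative
example, not from this package) forbids all conforming robots from two
areas of a site:

```
User-agent: *
Disallow: /cyberworld/map/
Disallow: /tmp/
```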
The parsed files are kept in a WWW::RobotRules object, and this object
provides methods to check if access to a given URL is prohibited. The same
WWW::RobotRules object can be used for one or more parsed _/robots.txt_
files on any number of hosts.
The following methods are provided; a brief usage sketch follows the list:
* $rules = WWW::RobotRules->new($robot_name)
This is the constructor for WWW::RobotRules objects. The first argument
given to new() is the name of the robot.
* $rules->parse($robot_txt_url, $content, $fresh_until)
The parse() method takes as arguments the URL that was used to retrieve
the _/robots.txt_ file and the contents of the file. The optional
$fresh_until argument gives the time until which the parsed rules are
considered fresh.
* $rules->allowed($uri)
Returns TRUE if this robot is allowed to retrieve this URL.
* $rules->agent([$name])
Get/set the agent name. NOTE: Changing the agent name will clear the
robots.txt rules and expire times out of the cache.
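A minimal usage sketch tying these methods together (the robot name and
URLs are hypothetical; LWP::Simple is assumed available, as it ships in
perl-libwww-perl alongside this module):

```perl
use strict;
use warnings;
use WWW::RobotRules;
use LWP::Simple qw(get);

# Name the robot; this is matched against User-agent lines in robots.txt.
my $rules = WWW::RobotRules->new('ExampleBot/1.0');

# Fetch and parse a site's robots.txt (URLs here are illustrative).
my $robots_url = 'http://www.example.com/robots.txt';
my $robots_txt = get($robots_url);
$rules->parse($robots_url, $robots_txt) if defined $robots_txt;

# Check a URL on that host before retrieving it.
my $page_url = 'http://www.example.com/private/index.html';
if ($rules->allowed($page_url)) {
    my $content = get($page_url);
    # ... process $content ...
}
```

The same $rules object can be fed the parsed _/robots.txt_ files of
further hosts and queried for URLs on any of them.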
Version: 6.02
Release: 8.1.3
Architecture: noarch
Size: 17.1 KB
Build Time: 2014-10-06 16:42:43 +0200
Provides
| Symbol | Required by |
|---|---|
| perl(WWW::RobotRules) = 6.02 | perl-libwww-perl |
| perl(WWW::RobotRules::AnyDB... | |
| perl(WWW::RobotRules::InCore) | |
| perl-WWW-RobotRules = 6.02-... | package-lists-openSUSE-X11-cd, package-lists-openSUSE-KDE-cd, package-lists-openSUSE-GNOME-cd, package-lists-openSUSE-images |