NAME
    LaBrea::Tarpit::Get

SYNOPSIS
      use LaBrea::Tarpit::Get;

      ($rv,$host,$port,$path)=parse_http_URL($url);
      ($handle,$host,$port,$path)=open_http(*S,$url);
      $rv=parse_http_response(\$buffer,\%response);
      $rv=short_response($url,\%response,\%content,$timeout);
      $line = make_line($url,$err,\%content);
      $rv = not_hour($file);
      $rv = not_day($file);
      $rv=auto_update($url,$file,$cur_ver,$timeout);

DESCRIPTION - LaBrea::Tarpit::Get
    This module connects to a web site running
    LaBrea::Tarpit::Report::html_report.plx and retrieves a short_report as
    described in LaBrea::Tarpit::Report.

    Run "examples/web_scan.pl" from a cron job hourly or daily to update the
    statistics from all known sites running LaBrea::Tarpit. A report can
    then be generated showing the activity worldwide.

      # MIN HOUR DAY MONTH DAYOFWEEK   COMMAND
      30 * * * * ./web_scan.pl ./other_sites.txt ./tmp/site_stats

    See: LaBrea::Tarpit::Report::other_sites

    ($handle,$host,$port,$path)= parse_http_URL($url);
        Separate an http URL into its components

          input:      URL of the form
                      http://www.foo.com[:8080]/file.html
                      https:// service is not supported

          returns:    (undef, error message)
             or       (file_handle,hostname,port,path)
                      where port and path may be empty

    ($handle,$host,$port,$path)=open_http(*S,$url);
        Open connection to http target

          input:      *S,$url [default port = 80]

          returns:    (undef, error) on error
                      (file_handle, hostname, port, path) on success

    $rv=parse_http_response(\$buffer,\%response);
        Parse an http server response into a hash of headers.

          i.e.
          (representative, will vary)

            rc              => 200
            msg             => OK
            date            => Wed, 24 Apr 2002 21:46:30 GMT
            server          => Apache/1.3.22
            protocol        => HTTP/1.1
            content-type    => text/plain
            content-length  => 92
            last-modified   => Wed, 24 Apr 2002 21:46:34 GMT
            expires         => Wed, 24 Apr 2002 21:47:04 GMT
            connection      => close
            content         => (complete text buffer)

          input:      \$text_in, \%response

          returns:    true on success, %response filled
                      false on failure

        NOTE: $response{rc} (server response code) and $response{msg}
        (server message) are ALWAYS filled with something. In the case of
        server failure, the cause of the failure will be inserted into
        $response{msg} and undef returned.

    $rv=short_response($url,\%response,\%content,$timeout);
        Fetch the short report from "$url" and place the headers in
        "%response" and the parsed content in "%content". "$timeout" is
        optional; the default is 60 seconds.

          %response contains http headers
          %content contains key => value pairs

            LaBrea      => version
            Tarpit      => version
            Report      => version
            Util        => version
            now         => seconds since epoch (local)
            tz          => time zone (e.g. -0700)
            threads     => number of threads
            total_IPs   => total IPs
            bw          => bandwidth

          input:      URL,        # complete url
                                  # i.e. www.foo.com/html_report.plx
                      \%response,
                      \%content,

          returns:    false on success
                      error message on failure

    $line = make_line($url,$err,\%content);
        Make a line of text summarizing the short report, where "$err" is
        the return value from "short_response".

          Format:
            url threads total_IPs bw time tz version:nn:nn:nn
          or
            url error message

    $rv = not_hour($file);
        Check if the file has been accessed this hour.

          input:      path/to/file

          returns:    true, not accessed this hour
                      false if accessed this hour
                      or non-existent or not readable

    $rv = not_day($file);
        Check if the file has been accessed this day.

          input:      path/to/file

          returns:    true, not accessed this day
                      false if accessed this day
                      or non-existent or not readable

    $rv=auto_update($url,$file,$cur_ver,$timeout);
        Update the 'other_sites.txt' file from $url on a daily basis only.

          input:      url,      # complete url to 'other_sites.txt'
                                # http://scans.bizsystems.net/other_sites.txt
                      file,     # path to your 'other_sites.txt'
                      cur_ver   # optional current version
                                # the current file will be opened and scanned
                                # if this is not supplied
                      timeout   # wait for http response
                                # default 60 seconds

          returns:    false on success or no update needed
                      error msg on failure

EXPORT_OK
            parse_http_URL
            open_http
            parse_http_response
            short_response
            make_line
            not_hour
            not_day
            auto_update

COPYRIGHT
    Copyright 2002 - 2004, Michael Robinton & BizSystems

    This program is free software; you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by the
    Free Software Foundation; either version 2 of the License, or (at your
    option) any later version.

    This program is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
    Public License for more details.

    You should have received a copy of the GNU General Public License along
    with this program; if not, write to the Free Software Foundation, Inc.,
    59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

AUTHOR
    Michael Robinton, michael@bizsystems.com

SEE ALSO
    perl(1), LaBrea::Tarpit(3), LaBrea::Codes(3),
    LaBrea::Tarpit::Report(3), LaBrea::Tarpit::Util(3),
    LaBrea::Tarpit::DShield(3)
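
EXAMPLE
    The hash layout described under parse_http_response can be illustrated
    with a small stand-alone sketch. This is NOT the module's code:
    "sketch_parse_response" is a hypothetical helper written only for this
    example, and treating every reply other than 200 as a failure is an
    assumption. It may help to show how "rc" and "msg" can always be
    filled, even when parsing fails.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of LaBrea::Tarpit::Get. It mimics the
# documented hash layout: rc, msg, protocol, lower-cased header names,
# and content.
sub sketch_parse_response {
    my ($bufref, $resp) = @_;
    %$resp = (rc => 0, msg => 'no response');   # rc and msg are always set

    # The header block ends at the first blank line; the rest is the body.
    my ($head, $body) = split /\r?\n\r?\n/, $$bufref, 2;
    unless (defined $head && length $head) {
        $resp->{msg} = 'empty response';
        return;
    }

    my @lines  = split /\r?\n/, $head;
    my $status = shift @lines;
    $status = '' unless defined $status;

    # Status line, e.g. "HTTP/1.1 200 OK"
    my ($proto, $rc, $msg) =
        $status =~ m{^(HTTP/\d+\.\d+)\s+(\d{3})\s*(.*)$};
    unless (defined $rc) {
        $resp->{msg} = 'malformed status line';
        return;
    }
    @{$resp}{qw(protocol rc msg)} = ($proto, $rc, $msg);

    # Remaining lines are "Name: value" headers; keys are lower-cased.
    for my $line (@lines) {
        next unless $line =~ /^([^:\s]+):\s*(.*)$/;
        $resp->{lc $1} = $2;
    }
    $resp->{content} = defined $body ? $body : '';

    # Assumption: anything other than 200 counts as a failure.
    return unless $resp->{rc} == 200;
    return 1;
}

# Usage: parse a canned reply and list the resulting keys.
my $buffer = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\n\r\nthreads 12\n";
my %response;
if (sketch_parse_response(\$buffer, \%response)) {
    print "$_ => $response{$_}\n" for sort keys %response;
}
```

    On failure the sketch returns an empty value and leaves the reason in
    $response{msg}, so a caller can report it without a separate error
    channel.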