Article 5883 of comp.lang.perl: Xref: feenix.metronet.com news.software.nntp:2420 comp.sources.testers:102 alt.sources:1879 comp.lang.perl:5883 Path: feenix.metronet.com!news.utdallas.edu!hermes.chpc.utexas.edu!cs.utexas.edu!uunet!spool.mu.edu!sgiblab!idiom.berkeley.ca.us!idiom.berkeley.ca.us!not-for-mail From: muir@idiom.berkeley.ca.us (David Muir Sharnoff) Newsgroups: news.software.nntp,comp.sources.testers,alt.sources,comp.lang.perl Subject: fetch-news a personal news transfer engine -- testers neeeded. Followup-To: news.software.nntp Date: 15 Sep 1993 00:03:33 -0700 Organization: Idiom Consulting / Berkeley, CA USA Lines: 1426 Distribution: world Message-ID: <276eo5$n9g@idiom.berkeley.ca.us> NNTP-Posting-Host: idiom.berkeley.ca.us Okay, I've been using the current version my fetch-news to get my newsfeed for all of 30 minutes now! Anyway, I think it works (I've been pounding on it hard for the last few days). IF having a newsfeed of only the news that you actually read appeals to you (think, you can run with a 20-day expire and not eat a gigabyte!) AND if you are on the internet AND if you are willing to use beta software THEN please give fetch-news a spin! I won't send say anything about fetch-news again until testing is complete -- if you are interested, act now and SAVE! -Dave FETCH-NEWS fetch-news - pull the news that will be read via nntp SYNOPSIS fetch-news [-force days] [-debug] [-verbose] [-config config_file] [PARAMETER=value]... DESCRIPTION fetch-news will use the NNTP to transfer news between two news servers. It only transfers the news that it believes people will want to read. It knows what people want to read by looking at their fetched. Unlike some other similar programs, fetch-news will keep track of article numbers rather than using the newnews com- mand. This is important because with most news implementa- tions (INN in particular) the newnews command thrahes the server. OPTIONS There are only a few command line options to fetch-news, but there are many configuration options. Since fetch-news is a perl program distributed in source, the bulk of the confi- guration options are specified in the program text itself just after the copyright notice. You do not need to change there! One of the parmateters, $CONFIG, is the location of a file to require before doing much else. This file will be read before much else is evaluated. Set your overrides there. The name of this file can be specified on the com- mand line with -config filename. The other options are -force days which will override the touch files and cause days worth of news to be transfered. You can get more verbose output by specifying -verbose, or -debug. Finally, any of the configuration parameters (see the source) can be overriden on the command line with an assig- ment: command=value. NEWS SERVER SETUP fetch-news is asymetrical in how it talks to news servers. It talks to the news server that it is transfering news from as if it were a news reader. It talks to the server that it is transfering news to as if it were another news server. This difference can make configuration annoying. However, it's easy to test: if you can read news with a news reader, you can get the news with fetch-news (if you don't have per- mission, just use a packet forwarder). One implication is that with fetch-news, you tend get your news from one server and send your outgoing news to another. USE WITH SLOW LINKS I built fetch-news mostly to optimize use of my slow (SLIP) internet connection. It saves my line by only transfering the news that is likely to be read. In addition, I use a small tcp socket forwarder process to introduce delay. In my forwarded process, I have it use a small (128-byte) socket send buffer so that a round-trip delay is required for every packet. To get my socket forwarder, ftp to idiom.berkeley.ca.us and grab pub/muir-programs/tcp_nest.c. SIMILAR SYSTEMS There are quite a few news transfer systems that fill a similar role to fetch-news. None of them do exactly the same thing. Here are some pointers to a few... Newsfeed by Vernon C. Hoxie , does about what fetch-news does, but it does with batches. Gup (newsGroup Update Program) by Mark Delany and Andrew Herbert provides a way to update a news subscription file by mail. SLNR (Simple Local News Reader) by Philippe Goujard is an off-line newsreader. It grabs your news all at once so that you can read it later. And there is one more system, recently posted, that is very similar to fetch-news except that is uses the newnews com- mand to transfer news. I apologize for forgetting the name. AVAILABILITY The latest version of fetch-news is available through anonymous ftp to idiom.berkeley.ca.us. There is a small mailing list for discussing fetch-news. Send mail to fetch-news-list-request@idiom.berkeley.ca.us to join it. AUTHOR David Muir Sharnoff ------------------- cut here --------------------------------- #!/bin/sh # shar: Shell Archiver (v1.22) # # Run the following text with /bin/sh to create: # fetch-news # sed 's/^X//' << 'SHAR_EOF' > fetch-news && X#!/usr/local/bin/perl X'di'; X'ig00'; X# THIS PROGRAM IS ITS OWN MANUAL PAGE. INSTALL IN man & bin. X X# $Revision: 1.4 $ X# $Author: muir $ X X# TODO: X# distributions line X# qerry server for newsgroup requests X# option to insert news with inews X# rethink group removal X X############################################################################# X# X# Copyright (c) 1993 David Muir Sharnoff X# All rights reserved. X# X# Redistribution and use in source and binary forms, with or without X# modification, are permitted provided that the following conditions X# are met: X# 1. Redistributions of source code must retain the above copyright X# notice, this list of conditions and the following disclaimer. X# 2. Redistributions in binary form must reproduce the above copyright X# notice, this list of conditions and the following disclaimer in the X# documentation and/or other materials provided with the distribution. X# 3. All advertising materials mentioning features or use of this software X# must display the following acknowledgement: X# This product includes software developed by the David Muir Sharnoff. X# 4. The name of David Sharnoff may not be used to endorse or promote products X# derived from this software without specific prior written permission. X# X# THIS SOFTWARE IS PROVIDED BY THE DAVID MUIR SHARNOFF ``AS IS'' AND X# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE X# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE X# ARE DISCLAIMED. IN NO EVENT SHALL DAVID MUIR SHARNOFF BE LIABLE X# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL X# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS X# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) X# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT X# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY X# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF X# SUCH DAMAGE. X# X# This copyright notice derrived from material copyrighted by the Regents X# of the University of California. X# X############################################################################# X X# X# This is fetch-news. X# X# It transfers news by pretending to read it and then shoving what X# it gets to another server. X# X# It does all the news transfering with NNTP. X# X# On systems that support XHDR, it is fairly efficient. X# X# If you have any questions, send mail to X# fetch-news-mail@idiom.berkeley.ca.us X# X# It creates and deletes news groups using ctlinnd, so it works X# best on systems that use INN. X# X# X# The development of fetch-news has been paid for by David Muir X# Sharnoff and Berkeley Research and Trading. Contributions are X# welcome. X# X X$debug = 0; X$verbose = 1; X X# These can be dug out of /usr/include X$AF_UNIX = 1; X$AF_INET = 2; X$SOCK_STREAM = 1; X$SOCKADDR_IP = 'S n a4 x8'; X$SOCKADDR_UN = 'S a108'; X X# Socket addresses to grab transfer articles X# to and from. X# /xyz is a unix domain socket X# foo.bar.baz is an inet domain socket to nntp X# foo.bar.baz/p is an inet domain socet to port p X# X$TO = "/usr/local/news/innd/nntpin"; X$FROM = "prep.mit.edu/3340"; X X# Groups will only be created & destroyed if INN is being used. X# If INN is being used, then there should be a ctlinnd program. X$CTLINND = "ctlinnd"; X X# article spool X$SPOOL = "/usr/local/news/spool/news"; X X# The "active" file from INN. This can be set to "nntp" X# if the active file should be queried over nntp. X$ACTIVE = "/usr/local/news/active"; X X# list of groups not to fetch X$IGNORE = "/usr/local/news/fetch.ignore"; X X# list of groups to always fetch X$SPECIAL = "/usr/local/news/fetch.special"; X X# A command to run after finishing each newsgroup. It X# takes a newsgroup name as an argument. Optional. X$MTHREADS = "/usr/local/news/trn/mthreads"; X X# List of all newsgroups. This can be set to "nntp" if X# the list of all newsgroups should be queried over nntp. X$MASTER_ACTIVE = "/usr/local/news/active.from.uunet"; X X# days to pre-feed a new group. "everything" if -1 X$BACK_FEED = 6.0; X X# where to store the fetch counters X$FETCHSPOOL = $SPOOL; X# file extension name for fetch counters X$FETCHFILE = ".nfn"; X X# store the fetch counters in a directory tree (as X# apposed to flat, all in one directroy)? X$FETCHDIRS = 1; X X# uses timestamps or article numbers? If article numbers X# are used (much more efficient 'cause then it doesn't use X# newnews much) then the "xhdr" command must be supported. X$TIMESTAMP = 0; X X# Maximum number of new newsgroups that a single user can get. X$MAX_NEWGROUPS = 20; X X# Maximum number of newsgroups that a single user can get. X$MAX_USER = 300; X X# Maximum number of articles to fetch that might not be read X# on behalf of a single user X$MAX_USER_ARTICLES = 100; X X# The login of the account that runs the news software X$NEWS_USER = "news"; X X# If nothing has happened for this many seconds, exit X# If waiting for a lock that hasn't been touched in this long, X# take it. X$MAX_IDLE_TIME = 3000; X X# Make sure to touch your lock withing this time... X$MIN_IDLE_TIME = 300; X X# Amount of time to sleep after killing off another fetch-news process X$SLEEP_AFTER_KILL = 40; X X# Command to run before grabbing news X$START_TUNNEL = undef; X X# Ignore .newsrc files that haven't been modifed in this X# many days. X$IGNORE_UNTOUCHED_NEWSRC = 20.0; X X# Get rid of groups that nobody even lists in their .newsrc files X$REMOVE_UNTAKEN_GROUPS = 0; X X# "require" this file -- a place to store configuration information. X$CONFIG = "/usr/local/news/fn.config"; X X# The master lock file. Set this to only have one copy of X# fetch news running at a time. X$MASTER_LOCK = "/usr/local/news/locks/fn.master"; X X# Which newsgroups should be ignored. Wildcard is *, negate is !, terms X# are evaluated left to right with the last match winning. X# Thus "*,!alt.*,alt.sex.*" means look at all groups except alt groups X# other than alt.sex. X#$RESTRICTION = '!*.answers,!*pictures*,!*binaries*'; X$RESTRICTION = ''; X X# Maximum expire time. Any touch files older than this will be X# discarded. X$MAX_EXPIRE = 20; X X###################################################################### X X$o0 = "$0 @ARGV"; X$av0 = $0; X$usage = "Usage: $av0 [-force days] [-debug] [-verbose] [-config CONFIG_FILE] [PARAMETER=value]"; X Xwhile(@ARGV) { X $a = shift(@ARGV); X if ($a eq "-force") { X $force = 1; X $force_days = shift(@ARGV); X next; X } X if ($a eq "-config") { X $CONFIG = shift(@ARGV); X require $CONFIG || die "require $CONFIG: $@"; X undef $CONFIG; # don't get it later X next; X } X if ($a eq "-debug") { X $debug = !$debug; X next; X } X if ($a eq "-verbose") { X $verbose = !$verbose; X next; X } X if ($a =~ m/^([A-Z_0-9]+)\=(.+)/) { X $x = $2; X print "eval \$$1 = \$x\n" if $debug; X eval "\$$1 = \$x"; X die "Eval \$$1 = \$x: $@" if $@; X next; X } X die $usage; X} X Xif ($CONFIG) { X local($0) = "$o0: require $CONFIG"; X require $CONFIG || die "require $CONFIG: $@"; X} X X$SIG{USR1} = 'check_alive'; X X$0 = "$o0: get lock on $MASTER_LOCK"; Xif ($MASTER_LOCK) { X die "could not get lock $MASTER_LOCK" X unless &lock($MASTER_LOCK); X} X X$0 = "$o0: lookup homedir of $NEWS_USER"; X($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell) = getpwnam($NEWS_USER); X Xchdir($dir) || die "chdir $dir: $!"; X X$ENV{'PATH'} = ".:$dir/bin:$dir/etc:$dir/local/bin:/bin:/usr/bin:/etc:/usr/local/bin:/usr/ucb"; X X$SIG{USR2} = IGNORE; X X$0 = "$o0: `$START_TUNNEL`"; X&system($START_TUNNEL) if $START_TUNNEL; X&still_alive(); X X$0 = "$o0: building restriction routine"; X&restrict_init if $RESTRICTION; X X$0 = "$o0: process ignore"; X%ignore = (); X$ignore{'junk'} = 1; Xif ($IGNORE && -e $IGNORE) { X open(IGNORE,$IGNORE) || die "open $IGNORE: $!"; X while() { X chop; X print STDERR "Ignore: $_\n" if $debug; X $ignore{$_} = 1; X } X &still_alive(); X} X X$0 = "$o0: process special"; X%special = (); X$special{'control'} = 1; Xif ($SPECIAL && -e $SPECIAL) { X open(SPECIAL,$SPECIAL) || die "open $SPECIAL: $!"; X while() { X chop; X print STDERR "Special: $_\n" if $debug; X $special{$_} = 1; X $who{$_} .= "news "; X $no_limit{$_} = 'special'; X } X &still_alive(); X} X%take = %special; X X$0 = "$o0: open sockets"; X&open_connections(); X X$0 = "$o0: read master active file"; X%moderation = (); Xif ($MASTER_ACTIVE eq 'nntp') { X local($line_count); X print STDERR "Fetching the master active file\n" if $debug; X &fs("list active"); X $resp = &getline(FS); X if ($resp =~ /^215/) { X while($resp = &getline(FS)) { X &chop($resp); X last if $resp =~ /^\.$/; X next unless m/(\S+)\s\d+\s\d+\s(.)/; X $moderation{$1} = $2; X &still_alive() if ($line_count++ % 200 == 0); X } X } X &still_alive(); X} elsif ($MASTER_ACTIVE) { X print STDERR "reading the master active file, $MASTER_ACTIVE\n" if $debug; X &system("$CTLINND reload active want_to_see_it") if $CTLINND; X open(MA,$MASTER_ACTIVE) X || die "could not open $MASTER_ACTIVE: $!"; X while() { X next unless m/(\S+)\s\d+\s\d+\s(.)/; X $moderation{$1} = $2; X } X close(MA); X &still_alive(); X} X X$0 = "$o0: read active file"; Xprint STDERR "Reading the active file\n" if $debug; X%weget = (); X%mina = (); X%maxa = (); X@weget = (); Xif ($ACTIVE eq 'nntp') { X &ts("list active"); X local($line_count); X $resp = &getline(TS); X if ($resp =~ /^215/) { X while($resp = &getline(TS)) { X &chop($resp); X last if $resp =~ /^\.$/; X next unless m/(\S+)\s(\d+)\s(\d+)/; X $weget{$1} = 1; X $maxa{$1} = $2; X $mina{$1} = $3; X $current{$1} = ($2+1 != $3); X push(@weget,$1); X &still_alive() if ($line_count++ % 200 == 0); X } X } X &still_alive(); X} else { X open(ACTIVE,$ACTIVE) || die "could not open $ACTIVE: $!"; X while() { X next unless m/(\S+)\s(\d+)\s(\d+)/; X $weget{$1} = 1; X $maxa{$1} = $2; X $mina{$1} = $3; X $current{$1} = ($2 != $3); X push(@weget,$1); X } X close(ACTIVE); X &still_alive(); X} X X$0 = "$o0: go through the accounts"; Xprint STDERR "Running through the accounts\n" if $debug; X%mentioned = (); Xwhile (($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell) = getpwent) { X X $0 = "$o0: look at ${name}'s directory"; X X $newcount = 0; X $taken = 0; X $report = 0; X $mentioned = 0; X X if (-e "$dir/.rnfetchspecial" && open(FETCHSPECIAL,"$dir/.rnfetchspecial")) { X while() { X chop; X next if /^\S+\s\d+\s\d+\sn/; X next unless /^([^:!]+)/; X $group = $1; X $mentioned{$1} += 1; X $mentioned++; X next if /^[^:!]+!/; X unless (&touch_file_okay($group) || $current{$group}) { X $newcount++; X } X $take{$group} += 1; X $who{$group} .= "$name "; X $taken++; X print STDERR "Fetchspecail($name) $group\n" if $debug; X } X close(FETCHSPECIAL); X $report = 1; X &still_alive(); X } X X if (-e "$dir/.newsrc" && (-M "$dir/.newsrc" < $IGNORE_UNTOUCHED_NEWSRC)) { X unless (open(NEWSRC,"$dir/.newsrc")) { X print STDERR "Could not open $dir/.newsrc: $!"; X if ($! =~ /ermission/) { X # send mail to $name ... X } X } else { X %un = (); X while() { X next unless /^(.+)([:!])\s*(.*)/; X ($group,$read,$on) = ($1,$3,$2); X X $RESTRICTION && &restrict($group) && next; X X $thisnew = 0; X $thisold = 0; X X $mentioned{$group} += 1; X $mentioned++; X X next unless ($on eq ":"); X X if ($read) { X if ($read =~ /(\d+)$/) { X $lasta = $1; Xprint STDERR "$group .. read = $read lasta = $1 current = $current{$group} weget = $weget{$group} mina = $mina{$group} maxa = $maxa{$group}\n" if $debug; X if ($current{$group} && ($weget{$group} && ($lasta+1 < $mina{$group} || $lasta > $maxa{$group}))) { X # this group isn't really read X next; X } X $thisold = 1; X } else { X $thisnew = 1; X } X } else { X # if the user hasn't read the group it might X # be because it hasn't been any news in it. X unless (&touch_file_okay($group) && ! $current{$group}) { X $thisnew = 1; X } X } X X $taken++; X $newcount++ if $thisnew; X X $no_limit{$group} = $name if $thisold; X X if ($newcount == $MAX_NEWGROUPS) { X print STDERR "Enough new groups for $name\n"; X } X X if ($taken == $MAX_USER) { X print STDERR "Enough groups for $name\n"; X } X X unless (($newcount > $MAX_NEWGROUPS) || ($taken > $MAX_USER)) { X $take{$group} += 1; X $who{$group} .= "$name "; X print "Take($name) $group\n" if $debug; X if ($thisnew) { X $is_new{$group} = $name; X $un{$group} = 1; X } X } X } X close(NEWSRC); X $report = 1; X X $taken{$name} = $taken; X $newcount{$name} = $newcount; X X if ($newcount && $MAX_USER_ARTICLES/$newcount < 30) { X $mpg = 1+int($MAX_USER_ARTICLES X /($newcount > $MAX_NEWGROUPS ? $MAX_NEWGROUPS : $newcount)); X X for $un (keys %un) { X $limit{$un} = $mpg if $mpg > $limit{$un}; X print STDERR "Limiting $un to $limit{$un} for $name.\n" X if $verbose; X } X if (($maxa{$un} - $mina{$un} > $limit{$un}*2) && ! $no_limit{$un}) { X $take{$group} -= 1; X $taken--; X } X } X } X &still_alive(); X } X print STDERR "$name takes $taken ($newcount new) out of $mentioned\n" if $report; X} X X$0 = "$o0: add new groups"; Xif ($CTLINND && $MASTER_ACTIVE) { X print STDERR "Adding new groups ...\n" if $debug; X foreach $group (sort keys %mentioned) { X unless ($weget{$group}) { X if ($moderation{$group}) { X $RESTRICTION && &restrict($group) && next; X print STDERR "New group: $group\n"; X ($gd = $group) =~ s!\.!/!g; X $pre = (-d "$SPOOL/$gd"); X &system("$CTLINND newgroup $group $moderation{$group} $name"); X &system("$CTLINND renumber $group") X if $pre; X $weget{$group} = 1; X } else { X print STDERR "Group $group not listed in master active file\n"; X } X } X &still_alive(); X } X} X X$0 = "$o0: remove old groups"; Xif ($REMOVE_UNTAKEN_GROUPS && $CTLINND) { X print STDERR "Removing old groups...\n" if $debug; X foreach $group (@weget) { X $RESTRICTION && &restrict($group) && next; X next if $take{$group}; X next if $ignore{$group}; X next if $special{$group}; X next if $mentioned{$group}; X print STDERR "rmgroup $group\n"; X &system("$CTLINND rmgroup $group"); X &still_alive(); X } X} X X# X# Free up memory space -- these arrays are no longer needed. X# X%mentioned = (); X%special = (); X%moderation = (); X%weget = (); X@weget = (); X%mina = (); X%maxa = (); X%current = (); X X$0 = "$o0: fetch articles"; X Xlocal($f_first, $f_last,*TF); Xlocal($time,$artnum); Xlocal($ts,$gd,$n,$tfe); Xlocal($sec,$min,$hour,$mday,$mon,$year,$wday,$isdst); Xlocal($id,%ids,$begin,$end,$range); Xlocal($attempted,$transfered); Xlocal($line_counter); X Xforeach $group (sort keys %take) { X $0 = "$o0: consider group $group"; X# next unless $group eq 'rec.humor.funny'; X $RESTRICTION && &restrict($group) && next; X next unless ($take{$group} > 0); X next if $ignore{$group}; X X $0 = "$o0: process group $group"; X X undef $limit; X if ($limit{$group} && ! $no_limit{$group}) { X $limit = $limit{$group}; X } X X print STDERR "$group ... $who{$group} ($limit)\n" if $verbose; X X $attempted = 0; X $transfered = 0; X X $tf = &touch_file($group); X X print STDERR "Touch (count) file: $tf\n" if $debug; X X unless (&lock($tf)) { X print STDERR "Locked: $group\n"; X next; X } X X GROUP: { X # X # figure out the set of articles to fetch. X # this can either be based on time (set $time) X # or on the remote article number (set $artnum) X # X X &still_alive(); X undef $time; X undef $artnum; X X if ($force) { X $time = time - ($force_days * 60*60*24); X } elsif (&touch_file_okay($group)) { X if ($TIMESTAMP) { X $time = (stat($tf))[9]; X } else { X $artnum = &readfile($tf); X } X } else { X if ($TIMESTAMP || $BACK_FEED != -1) { X $time = time - ($BACK_FEED * 60*60*24); X } else { X $artnum = 0; X } X } X X X print STDERR "time: $time artnum: $artnum\n" if $debug; X X # interaction: X # group news.answers X # 211 808 6976 9924 news.answers X # X X $0 = "$o0: ($time/$artnum/$limit) send 'group $group'"; X X &fs("group $group"); X $resp = &getline(FS); X if ($resp =~ /^211\s+\d+\s+(\d+)\s+(\d+)\s/) { X ($f_first, $f_last) = ($1, $2); X X %ids = (); X undef $range; X X if ($time) { X # interaction: X # newnews news.answers 930722 140733 GMT X # 230 New news since 850515 020000 follows X # <1772@foo.UUCP> X # . X print STDERR "Using newnews to backfeed...\n" X if $verbose && ! $TIMESTAMP; X ($sec,$min,$hour,$mday,$mon,$year,$wday,$isdst) = gmtime($time); X X $0 = "$o0: ($group/$time/$artnum/$limit) send newnews"; X &fs(sprintf("newnews %s %02d%02d%02d %02d%02d%02d GMT",$group,$year,$mon+1,$mday,$hour,$min,$sec)); X $resp = &getline(FS); X &still_alive(); X if ($resp =~ /^230\s/) { X $0 = "$o0: ($group/$time/$artnum/$limit) recv newnews"; X while($resp = &getline(FS)) { X &chop($resp); X last if $resp =~ /^\.$/; X $ids{$resp} = $resp; X &still_alive if ($line_counter++ % 200 == 0); X } X } else { X die "Illegal responce from server: $resp"; X } X &still_alive(); X } else { X # interaction: X # xhdr Message-ID 120716-120717 X # 221 Message-ID fields follow X # 120716 <1993Jul6.055613.21351@news.dkrz.de> X # 120717 <1993Jul6.055613.21352@news.dkrz.de> X # . X $artnum = 0 if $artnum > $f_last; # whoops! X $begin = ( $f_first > ($artnum+1) ? $f_first : ($artnum+1) ); X $end = $f_last; X X if ($end - $begin > 0) { X $range = "$begin-$end"; X } elsif ($end == $begin) { X $range = $end; X } elsif ($end - $begin == -1) { X $range = 0; # nothing to do... X } else { X # wheee, things is backwards! X die "Range of articles for $group is impossible"; X } X X print STDERR "artnum $artnum begin $begin end $end range $range\n" if $debug; X X if ($range) { X $0 = "$o0: ($group/$time/$artnum/$limit) send 'xhdr $range'"; X &still_alive(); X &fs("xhdr Message-ID $range"); X $resp = &getline(FS); X if ($resp =~ /^221\s/) { X while($resp = &getline(FS)) { X $0 = "$o0: ($group/$time/$artnum/$limit) recv xhdr"; X &chop($resp); X last if $resp =~ /^\.$/; X if ($resp =~ /^(\d+)\s+(\S+)/) { X $ids{$2} = $1; X } else { X die "Could not parse output from server: $resp"; X } X &still_alive if ($line_counter++ % 200 == 0); X } X } else { X die "Cannot use xhdr: $resp"; X } X &still_alive(); X } X } X X # X # Okay, now we've got the articles to present in %ids X # so let's run 'em by the TO server. X # X X # interaction: X # X # C: IHAVE <4105@ucbvax.ARPA> X # S: 435 Already seen that one, where you been? X # X # (client offers another article) X # C: IHAVE <4106@ucbvax.ARPA> X # S: 335 News to me! to end. X # C: (sends article) X # C: . X # S: 235 Article transferred successfully. Thanks. X # X # (or) X # X # S: 436 Transfer failed. X X # 235 article transferred ok X # 335 send article to be transferred. End with . X # 435 article not wanted - do not send it X # 436 transfer failed - try again later X # 437 article rejected - do not try again X Xprint "I am at ".__LINE__."\n" if $debug; X X %done = (); X $group_count = 0; X for $id (keys %ids) { X &still_alive(); X if ($limit && $group_count > $limit) { X print STDERR "Transfer stopped by group count limit\n" X if $verbose; X last; X } X $attempted += 1; X $0 = "$o0: ($group/$time/$artnum/$limit) send ihave $id"; X &ts("ihave $id"); X $resp = &getline(TS); X if ($resp =~ /^435\s/) { X # skip that one... X $done{$id} = 'duplicate'; X } elsif ($resp =~ /^335\s/) { X # is a winner!!! ding ding ding. X $0 = "$o0: ($group/$time/$artnum/$limit) send article $id"; X &fs("article $ids{$id}"); X $resp = &getline(FS); X if ($resp =~ /^220\s/) { X $0 = "$o0: ($group/$time/$artnum/$limit) xfer $id"; X &still_alive(); X $line_counter = 0; X while($resp = &getline(FS)) { X &chop($resp); X &ts($resp); X last if $resp =~ /^\.$/; X &still_alive if ($line_counter++ % 200 == 0); X } X $resp = &getline(TS); X if ($resp =~ /^235\s/) { X # way cool! X print "$id\n" if $verbose; X $group_count += 1; X $transfered += 1; X $done{$id} = 'transfered'; X } elsif ($resp =~ /^436\s/) { X # try again later... X print STDERR "Told to try again\n"; X &ts("quit"); X &fs("quit"); X exit(0); X } elsif ($resp =~ /^437\s/) { X $done{$id} = 'rejected'; X # rejected, sigh. X } else { X die "Unknown responce: $resp"; X } X &still_alive(); X } elsif ($resp =~ /^(423|430)\s/) { X # ooops, it's gone! X print STDERR "Article $id in $group disappeared on me\n"; X close(TS); X &openTS(); X &still_alive(); X } else { X die "Weird response: $resp"; X } X } else { X die "Unexpected code: $resp"; X } X } X X $good_art = 0; X X if ($time && ! $TIMESTAMP) { X # X # This should only be true if we just tried to backfeed X # a certain number of days. X # X # Now that we've got a partial set of articles, we want X # to note what they are so that we can properly set the X # touch file. This is kinda tricky. X # X if (%ids) { X $0 = "$o0: ($group/$time/$artnum/$limit) calculate seq number: send xhdr"; X &still_alive(); X $line_counter == 0; X &fs("xhdr Message-ID $f_first-$f_last"); X $resp = &getline(FS); X if ($resp =~ /^221\s/) { X $0 = "$o0: ($group/$time/$artnum/$limit) calculate seq number: recv xhdr"; X while($resp = &getline(FS)) { X &chop($resp); X last if $resp =~ /^\.$/; X if ($resp =~ /^(\d+)\s+(\S+)/) { X if ($done{$2}) { X $good_art = $1; X } X } else { X die "Could not parse output from server: $resp"; X } X &still_alive if ($line_counter++ % 200 == 0); X } X } else { X die "Cannot use xhdr: $resp"; X } X unless ($good_art) { X print STDERR "Did not manage to transfer antyhing, restting counter for $group!\n"; X } X print STDERR "Reviewing, we find we are up to $good_art\n" if $verbose; X } else { X $good_art = $f_last; X } X } X X &still_alive(); X X $tfe = -e $tf; X if (! $tfe || ! $time) { X open(TF,">$tf") || die "open >$tf: $!"; X $n = ($time ? ($good_art ? $good_art : 0) : $end); X $0 = "$o0: set $tf = $n"; X (print TF "$n\n") || die "write $tf: $!"; X } else { X open(TF,">>$tf") || die "open >$tf: $!"; X } X ($sec,$min,$hour,$mday,$mon,$year,$wday,$isdst) = localtime($time); X printf(TF "\n%d/%d/%02d $hour:$min --",$mon+1,$mday,$year,$hour,$min); X print TF " $attempted attempted, $transfered suceeded\n"; X close(TF) || die "close $tf: $!"; X } elsif ($resp =~ /^411/) { X print STDERR "$FROM doesn't carry $group\n"; X open(TF,">$tf") || die "open >$tf: $!"; X (print TF "0\n") || die "write $tf: $!"; X (print TF "\n$FROM doesn't carry $group\n"); X close(TF) || die "close $tf: $!"; X } X } X X print STDERR "... $attempted offered, $transfered transfered\n" X if ($attempted || $transfered); X $0 = "$o0: ($n) run $MTHREADS on $group"; X &system("$MTHREADS $group") X if ($MTHREADS && $transfered); X &still_alive(); X X &unlock($tf); X} X X&unlock($MASTER_LOCK) X if $MASTER_LOCK; X Xexit(0); X Xsub open_connections X{ X # X # Internal notation: f - from, t - to. Indicating X # the flow of news. So, FS is the socket connected X # to the news server that articles are retrieved X # from and TS is connected to the server that articles X # are sent to. X # X X # X # Open a socket to the NNTP server that news will be X # transfered TO. X # X sub openTS { X &socket(TS,$TO); X local($0) = "$o0: get first line from $TO"; X $x = &getline(TS); X die "To Server: $x" unless $x =~ /^200\s/; X } X &openTS(); X X # X # Open a socket to the NNTP sever that news will be X # transfered FROM. X # X &socket(FS,$FROM); X local($0) = "$o0: get first line from $FROM"; X $x = &getline(FS); X die "To Server: $x" unless $x =~ /^20[01]\s/; X} X X# The lock-keeping mechenism only owrks if the processes are owned X# by the same user. Xsub lock X{ X local($tf) = @_; X local($rl,$rl2); X X local($0) = "$o0: get lock on $tf"; X X symlink("$$","$tf.lock"); X $rl = readlink("$tf.lock"); X if ($rl && $rl != $$) { X $0 = "$o0: get lock on $tf (no alive file, sleeping)"; X sleep(60) if (! -e "$tf.alive"); X if (-e "$tf.alive") { X if (time - (stat("$tf.alive"))[9] > $MAX_IDLE_TIME) { X $rl2 = readlink("$tf.lock"); X if ($rl2 == $rl) { X print STDERR "Breaking lock on $tf (was held by $rl)\n"; X $0 = "$o0: get lock on $tf (old alive, killing $rl)"; X kill(HUP,$rl); X sleep(60); X unlink("$tf.lock"); X unlink("$tf.alive"); X symlink("$$","$tf.lock"); X $rl = readlink("$tf.lock"); X } X } X } else { X $rl2 = readlink("$tf.lock"); X if ($rl2 == $rl) { X # $rl has been bad! X print STDERR "Process $rl has a lock, but no alive file"; X $0 = "$o0: get lock on $tf (no alive, killing $rl)"; X kill(HUP,$rl); X sleep(60); X unlink("$tf.lock"); X unlink("$tf.alive"); X symlink("$$","$tf.lock"); X $rl = readlink("$tf.lock"); X } X } X } X $locks{$tf} = 1; X if ($rl == $$) { X &touch("$tf.alive"); X } X &still_alive(); X return ($rl == $$); X} X Xsub unlock X{ X local($f) = @_; X local($rl); X $rl = readlink("$f.lock"); X if ($rl == $$) { X unlink("$f.lock"); X unlink("$f.alive"); X } X delete $locks{$tf}; X} X X# keep touching the .alive file to show that we are still doing stuff. X# if we ever stop, or doing do it for long enough, then we might be X# killed by another process or our lock may be stolen. Xsub still_alive X{ X local($t); X $t = time; X X if ($last_touch && ($t - $last_touch > $MAX_IDLE_TIME)) { X # oops, we've been idle for TOO LONG X die "Idle too long"; X } elsif ($t - $last_touch > $MIN_IDLE_TIME) { X local($l); X for $l (keys %locks) { X &touch("$l.alive"); X } X $last_touch = $t; X } X} X Xsub touch X{ X local($t) = @_; X local(*T); X open(T,">$t") || die "open $t: $!"; X (print T "FOOBAR!!\n") || die "write $t: $!"; X close(T) || die "close $t: $!"; X} X Xsub fs X{ X local($s) = @_; X local($o); X X if ($debug) { X ($o = $s) =~ s/(\S)/\1\b\1/g; X print STDERR "$o\n"; X } X (print FS "$s\r\n") || die "socket FS closed!"; X} X Xsub ts X{ X local($s) = @_; X local($u); X X if ($debug) { X ($u = $s) =~ s/(\S)/_\b\1/g; X print STDERR "$u\n"; X } X (print TS "$s\r\n") || die "socket TS closed!"; X} X Xsub touch_file_okay X{ X local($group) = @_; X X local($tf) = &touch_file($group); X X if (-e $tf && (-M $tf) < $MAX_EXPIRE) { X return 1; X } X return 0; X} X Xsub touch_file X{ X local($group) = @_; X X local($gd,$tf); X $gd = $group; X ($gd =~ s!\.!/!g) if $FETCHDIRS; X $tf = "$FETCHSPOOL/$gd/$FETCHFILE"; X return $tf; X} X Xsub getline X{ X local($S) = @_; X local($resp); X X $resp = <$S>; X die "connection to $S closed" if !defined($resp); X print STDERR "$S: $resp" if $debug; X return $resp; X} X Xsub socket X{ X local($S,$a) = @_; X X local($0) = "$o0: open socket to $a"; X print STDERR "Open $S: $a\n" if $debug; X if ($a =~ m,^/,) { X local($socket_path) = $a; X local($sa); X X $sa = pack($SOCKADDR_UN, $AF_UNIX, $socket_path); X socket($S, $AF_UNIX, $SOCK_STREAM, 0) X || die "socket: $!"; X connect($S, $sa) || die "connect: $!"; X } else { X local($host,$port); X if ($a =~ m,(.+)/(.+),) { X ($host, $port) = ($1, $2); X } else { X ($host, $port) = ($a, 'nntp'); X } X local($name,$aliases,$proto,$type,$len); X local($this,$that,$thisaddr,$thataddr); X X chop($hostname = `hostname`) X unless $hostname; X die "Could not get hostname" X unless $hostname; X ($name,$aliases,$proto) = getprotobyname('tcp'); X ($name,$aliases,$port) = getservbyname($port,'tcp') X unless $port =~ /^\d+/; X local($0) = "$o0: open socket to $a (gethostbyname)"; X ($name,$aliases,$type,$len,$thisaddr) = gethostbyname($hostname); X $this = pack($SOCKADDR_IP, $AF_INET, 0, $thisaddr); X ($name,$aliases,$type,$len,$thataddr) = gethostbyname($host); X die "Could not get host addr for $host" X unless $thataddr; X local($0) = "$o0: open socket to $a (connect)"; X $that = pack($SOCKADDR_IP, $AF_INET, $port, $thataddr); X socket($S, $AF_INET, $SOCK_STREAM, $proto) X || die "socket: $!"; X bind($S, $this) X || die "bind $hostname,0: $!"; X connect($S, $that) X || die "connect $host: $!"; X } X select((select($S),$| = 1)[0]); # don't buffer output to S X print STDERR "$S is open\n" if $debug; X} X Xsub readfile X{ X local($file) = @_; X local(@r,*F); X X open(F,$file) || die "open $file: $!"; X @r = (wantarray ? : scalar()); X close(F); X return @r; X} X Xsub system X{ X print STDERR "+ @_\n" if $verbose; X system @_; X} X Xsub chop X{ X local($x); X X $x = chop($_[0]); X if ($x ne "\n") { X $_[0] .= $x; X return; X } X $x = chop($_[0]); X X return if $x eq "\r"; X X $_[0] .= $x; X} X X# Build the &restrict($group) subroutine. X# &restrict should return 1 for groups that should be skipped. Xsub restrict_init X{ X local($t,$rev,$revrev); X X local($e); X X $e = "sub restrict\n"; X $e .= "{\n"; X $e .= "\tlocal(\$g) = \@_;\n"; X $e .= "\n"; X X for $t (reverse split(',',$RESTRICTION)) { X $rev = ($t =~ s/^\!//); X X $t =~ s/\./\\./g; X $t =~ s/\*/.*/g; X X $e .= "\treturn $rev if \$g =~ /$t/;\n"; X } X X $revrev = !$rev; X $e .= "\treturn $revrev;\n"; X X $e .= "}\n"; X $e .= "1;\n"; X X print "RESTRICT:\n$e" if $debug; X eval $e; X die "Eval:\n$e --- $@" if $@; X} X X# for_perl_dash_w X{ X ; X $USR1; X $USR2; X &check_alive; X} X X################### BEGIN PERL/TROFF TRANSITION X.00; X X'di \\ " finish diversion--previous line must be blank X.nr nl 0-1 \\ " fake up transition to first page again X.nr % 0 \\ " start at page 1 X'; __END__ X.\" ############### END PERL/TROFF TRANSITION X.TH FETCH-NEWS 1 "March 11, 1993" X.AT 3 X.SH FETCH-NEWS Xfetch-news \- pull the news that will be read via nntp X.SH SYNOPSIS X.B fetch-news X.RI [ -force X.BR days ] X.RI [ -debug ] X.RI [ -verbose ] X.RI [ -config X.BR config_file ] X\fR[\fIPARAMETER\fR=\fBvalue\fR]... X.SH DESCRIPTION X.B fetch-news Xwill use the NNTP to transfer news between two news servers. XIt only transfers the news that it believes people will want to Xread. It knows what people want to read by looking at their X.newsrc files. Only groups that are activly being read will be Xfetched. X.LP XUnlike some other similar programs, X.B fetch-news Xwill keep track of article numbers rather than using the X.B newnews Xcommand. This is important because with most news implementations X.RB ( INN Xin particular) the X.B newnews Xcommand thrahes the server. X.SH OPTIONS XThere are only a few command line options to X.BR fetch-news , Xbut there are many configuration options. Since X.B fetch-news Xis a perl program distributed in source, the bulk of the configuration Xoptions are specified in the program text itself just after the Xcopyright notice. You do not need to change there! One of the Xparmateters, X.IR $CONFIG , Xis the location of a file to X.B require Xbefore doing much else. This file will be read before much else Xis evaluated. Set your overrides there. The name of this file can Xbe specified on the command line with X.I -config X.BR filename . X.LP XThe other options are X.I -force X.B days Xwhich will override the touch files and cause X.B days Xworth of news to be transfered. XYou can get more verbose output by specifying X.IR -verbose , Xor X.IR -debug. X.LP XFinally, any of the configuration parameters (see the source) can be Xoverriden on the command line with an assigment: \fIcommand\fR=\fBvalue\fR. X.SH NEWS SERVER SETUP X.B fetch-news Xis asymetrical in how it talks to news servers. It talks to the news server Xthat it is transfering news from as if it were a news reader. It talks to Xthe server that it is transfering news to as if it were another news server. XThis difference can make configuration annoying. However, it's easy to Xtest: if you can read news with a news reader, you can get the news Xwith X.B fetch-news X(if you don't have permission, just use a packet forwarder). X.LP XOne implication is that with X.BR fetch-news , Xyou tend get your news from one server and send your outgoing news Xto another. X.SH USE WITH SLOW LINKS XI built X.B fetch-news Xmostly to optimize use of my slow (SLIP) internet connection. It Xsaves my line by only transfering the news that is likely to be read. XIn addition, I use a small tcp socket forwarder process to introduce Xdelay. In my forwarded process, I have it use a small (128-byte) Xsocket send buffer so that a round-trip delay is required for every Xpacket. To get my socket forwarder, ftp to X.I idiom.berkeley.ca.us Xand grab X.IR pub/muir-programs/tcp_nest.c . X.SH SIMILAR SYSTEMS XThere are quite a few news transfer systems that fill a similar role to X.BR fetch-news . XNone of them do exactly the same thing. Here are some pointers to a few... X.LP X.B Newsfeed Xby Vernon C. Hoxie X.RI < vern@zebra.alphacdc.COM >, Xdoes about what X.B fetch-news Xdoes, but it does with batches. X.LP X.B Gup X(newsGroup Update Program) Xby Mark Delany X.RI < markd@bushwire.apana.org.au > Xand XAndrew Herbert X.RI < andrew@werple.apana.org.au > Xprovides a way to update a news subscription file by mail. X.LP X.B SLNR X(Simple Local News Reader) Xby Philippe Goujard X.RI < pgoujard@infocom.co.uk > Xis an off-line newsreader. It grabs your news all at once Xso that you can read it later. X.LP XAnd there is one more system, recently posted, that is very similar to X.B fetch-news Xexcept that is uses the X.B newnews Xcommand to transfer news. I apologize for forgetting the name. X.SH AVAILABILITY XThe latest version of X.B fetch-news Xis available through anonymous ftp to X.IR idiom.berkeley.ca.us . XThere is a small mailing list for discussing X.BR fetch-news . XSend mail to X.I fetch-news-list-request@idiom.berkeley.ca.us Xto join it. X.SH AUTHOR X.I David Muir Sharnoff\ \ \ \ SHAR_EOF chmod 0755 fetch-news || echo "restore of fetch-news fails" exit 0