3 Basic considerations for setting up News at your site

3.2 Deciding which newsgroups to carry

If you are planning to set up a local news database and exchange news with other sites, you will need to decide which portions of the newsgroup hierarchies available to you will be carried at your site. If you choose to run News as an NNTP client only, this is not really an issue, since the NNTP server determines which groups it makes available to clients connecting from your site.

In making this decision, you should bear several factors in mind. On the one hand, in order to best serve your users, you may want to carry a broad spectrum of newsgroups, covering topics of personal as well as professional interest. On the other hand, as the size of your feed increases, so do the amount of space necessary to hold items, the CPU and I/O load associated with maintenance batch jobs, and the cost of transmitting and receiving items. At the time this was written (July 1994), a full feed of all hierarchies available to a typical site in the US included over 5000 newsgroups, required up to 7 Gbytes of disk space, and used over an hour per day of (off-hours) CPU time for maintenance. For example, here are two messages that one site manager posted to the net:

From: sloane@kuhub.cc.ukans.edu (Bob Sloane)
Newsgroups: news.software.anu-news
Subject: Re: newsskim efficiency
Date: 20 Sep 93 16:12:35 GMT
Organization: University of Kansas Academic Computing Services

Others have commented on various performance enhancements that you
might try. I am running NEWS on a VAX 9210 which is pretty overloaded
with scientific computing. Over the course of a month, there is NEVER
any idle cpu time. NEWS processing accounts for about 8-10 percent of
CPU utilization. I have NEWS_MANAGER, NEWS_ROOT and NEWS_DEVICE all
on separate drives. I have six different incoming feeds of more than
2000 newsgroups, and I provide about ten outgoing full feeds, as well
as twelve partial feeds of varying sizes and NNTP service for our
campus. A normal SKIM command (done every night) takes about 4-6
hours, and a full SKIM (done once a week) take 8-12 hours depending on
other load. I have no problems keeping up with the news flow, so it is
possible.

From: sloane@kuhub.cc.ukans.edu (Bob Sloane)
Subject: Re: disk init params
Date: 25 Jul 94 14:09:20 CDT
Organization: University of Kansas Academic Computing Services

In article <1994Jul20.191110.5023@govonca.gov.on.ca>,
newsmgr@[192.197.192.2] writes:
> 1. If I wanted all groups for about 30 days, what size disk would do the
> job.

I think about 6-7 Gigabytes of total space would be enough. You would
need about 150 Megabytes per day to store the articles, plus 2 blocks
overhead for each article (INDEXF.SYS entry + NEWS.ITEMS entry), plus
whatever was wasted due to the cluster factor of the disk. With a
cluster factor of 3, we are currently getting about 50,000 articles
per day, so about 300,000 for the articles, plus 100,000 for overhead,
plus 75,000 for cluster factor totals 475,000 blocks per day that you
want to keep should be enough, or about 4-5 days per GB. Of course,
news volume is increasing daily, so by the time you get this the
reqirements may be higher. :-)

Your decision, then, must be based in part on local resources and policy regarding access to and priority for usage of those resources.
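
The disk-space estimate in the second message above can be reproduced with a few lines of arithmetic. The following is only a sketch of that calculation, not part of News itself: the function names are hypothetical, the block size of 512 bytes and the binary gigabyte are assumptions, and the cluster-factor waste is modeled as half a cluster per article because that reproduces the quoted figure of 75,000 blocks.

```python
BLOCK_BYTES = 512   # assumption: 512-byte disk blocks
GB_BYTES = 2 ** 30  # assumption: binary gigabyte


def daily_blocks(article_blocks, articles, overhead_per_article=2, cluster_factor=3):
    """Blocks consumed per day of news kept (hypothetical helper)."""
    overhead = articles * overhead_per_article       # index and item-file entries
    cluster_waste = articles * cluster_factor // 2   # assumed: half a cluster lost per article
    return article_blocks + overhead + cluster_waste


def days_per_gb(blocks_per_day):
    """How many days of news fit in one gigabyte."""
    return GB_BYTES / BLOCK_BYTES / blocks_per_day


def gb_needed(blocks_per_day, days):
    """Gigabytes needed to keep the given number of days of news."""
    return blocks_per_day * days * BLOCK_BYTES / GB_BYTES


# Figures from the quoted message: about 300,000 blocks of article text
# and about 50,000 articles per day, on a disk with a cluster factor of 3.
per_day = daily_blocks(article_blocks=300_000, articles=50_000)
print(per_day)                            # 475000
print(round(days_per_gb(per_day), 1))     # 4.4
print(round(gb_needed(per_day, 30), 1))   # 6.8
```

Substituting your own feed volume, retention period, and cluster factor gives a rough first estimate; since news volume grows over time, treat the result as a lower bound.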

As you may already know from experience, newsgroups are organized according to topic into large collections called 'hierarchies'; the names of groups within a particular hierarchy generally begin with the same prefix (for example, news.software.anu-news belongs to the news hierarchy). Several hierarchies are commonly available on the net.

You'll probably want to choose a few of the possible hierarchies to start with, and then expand or set up local hierarchies as you become more familiar with News configuration and usage patterns at your site.
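
To illustrate the naming convention, the sketch below picks out the groups that belong to a chosen set of hierarchies by looking at the first component of each group name. It is an illustration only and does not show how News itself is configured: the function names, the list of available groups, and the set of carried hierarchies are all hypothetical.

```python
from collections import defaultdict


def hierarchy_of(group):
    """Return the top-level hierarchy, i.e. the text before the first dot."""
    return group.split(".", 1)[0]


def select_groups(available, carried_hierarchies):
    """Group the available newsgroups by hierarchy, keeping only carried ones."""
    selected = defaultdict(list)
    for group in available:
        top = hierarchy_of(group)
        if top in carried_hierarchies:
            selected[top].append(group)
    return dict(selected)


# Hypothetical example data: a handful of group names and a starting
# choice of two hierarchies to carry.
available = ["news.software.anu-news", "comp.os.vms", "rec.arts.books", "sci.physics"]
carried = {"news", "comp"}
print(select_groups(available, carried))
```

Starting with a small set such as this and adding hierarchies later matches the incremental approach suggested above.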
