SUMMARY: NSR parallelism and concurrent sessions

From: Adrian Ho <adrianho_at_rotiprata.nii.ncb.gov.sg>
Date: Wed, 24 Apr 1996 19:36:20 +0800 (SGT)

Hi DU Mgrs!

I asked last Sat about the above subject matter and why my recovers were
going so slowly (original message quoted at the end of this message).
Many thanks to the following people who assisted in my hour of need:

Harald Lundberg <hl_at_tekla.fi>
Hellebo Knut <Knut.Hellebo_at_nho.hydro.com>
"Jeffrey S. Jewett" <spider_at_umd5.umd.edu>
Paul Rockwell <rockwell_at_rch.dec.com>

The reasons were actually quite simple (correct me if I'm way off base):

[1] Several savesets can be backed up in parallel, but they will always be
restored sequentially, both in spatial (/usr vs. /usr1) and temporal (6
Apr vs. 7 Apr) terms (i.e. /usr @ 6 Apr, /usr1 @ 6 Apr, /usr @ 7 Apr,
/usr1 @ 7 Apr, sequentially in that order).

[2] Backups can use all tape devices simultaneously, while recovers will
only use one device at a time.
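
To make the two points above concrete, here is a toy model in Python. It is
only a sketch of the idea, not NSR's actual scheduling: the round-robin
assignment of savesets to drives, the chunk counts, the tape capacity and the
tape labels are all assumptions made up for illustration.

```python
def backup_layout(savesets, chunks_each, devices, sessions, tape_capacity):
    """Toy model: map each saveset to the ordered list of tapes holding its chunks."""
    layout = {name: [] for name in savesets}
    for dev in range(devices):
        mine = savesets[dev::devices]   # savesets streamed to this drive (assumed round-robin)
        written = 0                     # chunks written on this drive so far
        # The drive multiplexes up to `sessions` savesets at a time, chunk by chunk.
        for start in range(0, len(mine), sessions):
            group = mine[start:start + sessions]
            for _ in range(chunks_each):
                for name in group:
                    tape = f"dev{dev}-tape{written // tape_capacity + 1}"
                    layout[name].append(tape)
                    written += 1
    return layout


def mounts_for_restore(layout, order):
    """Count tape mounts when savesets are restored one at a time on a single drive."""
    mounted, mounts = None, 0
    for name in order:
        for tape in layout[name]:
            if tape != mounted:
                mounted, mounts = tape, mounts + 1
    return mounts


savesets = ["/usr", "/usr1", "/var", "/home"]   # hypothetical saveset names

# Two drives, two sessions per drive: savesets end up interleaved on tape.
interleaved = backup_layout(savesets, chunks_each=6, devices=2,
                            sessions=2, tape_capacity=8)
# One drive, one session: each saveset is written contiguously.
contiguous = backup_layout(savesets, chunks_each=6, devices=1,
                           sessions=1, tape_capacity=8)

print("mounts, interleaved backup:", mounts_for_restore(interleaved, savesets))
print("mounts, contiguous backup: ", mounts_for_restore(contiguous, savesets))
```

In this toy setup the interleaved layout needs 8 tape mounts for a sequential
restore and the contiguous one needs 3, because every interleaved saveset
sends the restore back to tapes that were already loaded for the previous one.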

So what can be done to reduce recovery time in the future?

[1] Buy a jukebox, preferably the DLT variety. It won't chop all that
much time off an equivalent autoloader configuration, but it allows the
operator to go home and rest in peace, rather than wait around to change
tapes. Hellebo Knut was particularly insistent about this point, and I'm
grateful to him for that. 8-)

[2] Shrink/split up the file domains. Since the minimum set to restore
in the event of a disk failure (or a multi-disk failure with RAID) is an
entire file domain, the smaller they are, the less you'll have to
restore. In retrospect, an 8GB file domain was a bit much. 8-}
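
A rough back-of-envelope sketch of why domain size matters, in Python. It
assumes restore time scales linearly with domain size and uses 1MB/min as
the rate, which is only the upper bound from my original message (the real
rate was lower); the function name is made up for illustration.

```python
def restore_hours(domain_gb, rate_mb_per_min):
    """Hours to restore one file domain at a constant rate (linear scaling assumed)."""
    return domain_gb * 1024 / rate_mb_per_min / 60


# 1.0 MB/min is an assumed rate, taken as the upper bound quoted in the post.
for size_gb in (8, 4, 2):
    print(f"{size_gb} GB domain: about {restore_hours(size_gb, 1.0):.0f} hours")
```

At that rate an 8GB domain works out to roughly 137 hours, against about 34
hours for a 2GB one.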

[3] Go RAID. This is personally embarrassing, as our 2100 already has an
entire StorageWorks RAID setup on-board. However, at the time of initial
installation, time and cost factors (mostly time) prompted me to set up
all drives as JBOD. Later on, as we started adding more drives, it became
"it ain't broke, and it's a helluva big task, so why fix it?" Of course,
if I had done it in the first place, I wouldn't be here now. 8-}

As for the parallelism/concurrency issue, reducing both to 1 would
certainly make the tape load sequence very predictable, and it might
reduce restore times by a significant amount, but it definitely would
lengthen backups, perhaps considerably.

All things said, I'll probably go with [1] and [2] above. [3]'s kinda
iffy for our situation -- I'll hafta think about it some more.

Again, thks much for all the help! I certainly learnt a lot more about
NSR than I thought I would.

- Adrian Ho
  adrianho_at_nii.ncb.gov.sg

PS. Two useful resources for NSR users, in case you don't already know:

http://www.legato.com/ -- Web site of Legato Systems (Networker's origin)
  Has lots of info about Networker. Doesn't say anything about DEC's OEM
  version, but the Technical Bulletins on clients and bugs should make
  interesting reading.

networker_at_iphase.com -- Networker for Unix users' mailing list
  To subscribe, send a message to "majordomo_at_iphase.com" with the body
  "subscribe networker"

---------- Begin Included Message ----------
On Sat, 20 Apr 1996, Adrian Ho wrote:
> Hi DU Mgrs!
>
> What sort of values do you use to configure the "Parallelism" and
> "Sessions per device" parameters of your NSR server? We use 8 and 4
> respectively (NSR 3.0a w/ two TZ06 tape drives), and we've been very
> happy with the backup performance.
>
> Full restores, on the other hand, are a catastrophe -- I'm averaging way
> under 1MB/min in restoring 7GB of data from a blown filedomain, mainly coz
> NSR keeps having me switch tapes back and forth seemingly at random during
> the restore. Worse, it keeps coming back to tapes I've loaded at an
> earlier time, so I have to keep on my toes.
>
> I presume that the combination of parallelism and device concurrency
> effectively scattered my backup files across several tapes with no
> apparent order (this server is one of 8 in our cluster). This whole
> process is driving me crazy (to say nothing of blowing my weekend and
> wrecking my sleep pattern -- I could be here for *days*).
>
> Q1: Am I correct in deducing that this abysmal situation is directly
> related to parallelism and device concurrency?
>
> Q2a: If so, will setting both values to 1 for future backups get me
> savesets striped nicely (ie. in sequential order) across my tapes? (At
> this point, I don't give a hoot about the resulting performance hit -- I
> just want to ensure that future full restores don't result in tape
> gymnastics and multi-day sessions.)
>
> Q2b: If not, what can I do to restore some "order" to this backup
> situation, so that I can reduce the need to swap tapes back and forth (and
> going back to tapes for the Nth time)?
>
> Thks much for any info. I'll summarize what I get.
>
> - Adrian Ho
> adrianho_at_nii.ncb.gov.sg
>
Received on Wed Apr 24 1996 - 14:46:21 NZST
