Hello,
I have a grab bag of questions related to NIS, NFS, and 3.2c issues. I
have two AlphaServer 2100s (a 4/200 and a 4/275), both running Digital
UNIX 3.2c.
1. A 3.2c vdump of root attempts to back up /proc. Why? I restored a 3.2
vdump and received the same results. I never noticed this prior to 3.2c.
Is this a /proc problem?
path : /
dev/fset : root_domain#root
type : advfs
advfs id : 0x2e5e15a7.000ecda0.1
vdump: Date of last level 0 dump: the start of the epoch
vdump: Dumping directories
vdump: unable to get info for file <./proc/02610>; [2] No such file or directory
vdump: unable to get info for file <./proc/02610>; [2] No such file or directory
vdump: Dumping 5202746737 bytes, 113 directories, 20847 files
vdump: Dumping regular files
vdump: unable to read file <./proc/00000>; [22] Invalid argument
vdump: unable to read file <./proc/00001>; [22] Invalid argument
vdump: unable to read file <./proc/00003>; [22] Invalid argument
... and so on, for pages of errors.
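For what it's worth, this is how I checked that /proc really is a
separately mounted procfs filesystem rather than just a directory in the
root fileset (standard commands; the exact output format may differ on
your system):

    mount | grep proc       # should show /proc mounted as type procfs
    ls /proc | head         # entries are just the PIDs of running processes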
2. One of the reasons I upgraded to 3.2c was for DE500 (PCI 10/100
Ethernet network card) support. Unfortunately, the update process does
not supply a genvmunix kernel -- I had to install the LSM/ATM subsets,
build the kernel, then remove the unwanted kernels. It took some time to
determine the required subsets. Does anyone know how DEC decides which
releases will ship with a genvmunix? I recommend that a genvmunix always
be supplied, since new hardware support is normally introduced with each
release.
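Roughly the procedure I ended up with, for reference (a sketch only --
the media mount point and subset names below are placeholders for
whatever your distribution actually uses):

    # see which of the kernel-option subsets are already installed
    setld -i | grep -i lsm
    setld -i | grep -i atm

    # load the missing subsets from the distribution media
    # ("/mnt" and <subset-names> are placeholders)
    setld -l /mnt <subset-names>

    # rebuild the kernel against the existing configuration file
    doconfig -c MASON2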
3. One of my 2100's has its system disk on a RAID 0+1 set via the
SWXCR-EB (aka KZESC-BA). I noticed that if I run doconfig and then
reboot the system, /sys/MASON2/vmunix becomes corrupted. Note that the
SWXCR board has its battery disabled and uses write-through cache. The
RAID device is eight RZ28-VAs with AdvFS as the filesystem. Here is my
evidence...
TEST /sys/MASON2/vmunix corruption.
BEFORE BOOT:
what /sys/MASON2/vmunix|wc -> 520 5675 43075
sum /sys/MASON2/vmunix -> 30520 7334 /sys/MASON2/vmunix
ls -l /sys/MASON2/vmunix ->
-rwxr-xr-x 3 root system 7509616 Aug 26 15:45 /sys/MASON2/vmunix
AFTER BOOT:
what /sys/MASON2/vmunix|wc -> 182 1978 14995
sum /sys/MASON2/vmunix -> 29430 7334 /sys/MASON2/vmunix
ls -l /sys/MASON2/vmunix ->
-rwxr-xr-x 3 root system 7509616 Aug 26 15:45 /sys/MASON2/vmunix
Note that I get the same results whether I 'shutdown -h' or
'shutdown -r'. SWXCR firmware is 2.16, the 2100 firmware is SRM 4.1, and
I use ECU 1.8 and RCU 3.11. I discovered this because /, /var, and /usr
are on the same filesystem and I USED to 'mv' /sys/MASON2/vmunix to
/vmunix. Now I must use 'cp' to get a good copy before the reboot. With
the copy, /vmunix is ok, but /sys/MASON2/[vmunix, vmunix.OSF1,
vmunix.swap] become corrupted after the reboot.
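The workaround I use now, spelled out (just a sketch of what I described
above):

    # copy the new kernel into place and verify it BEFORE rebooting;
    # the two checksums should match at this point
    cp /sys/MASON2/vmunix /vmunix
    sum /sys/MASON2/vmunix /vmunix

    # after the reboot, /vmunix stays good but the copy under
    # /sys/MASON2 no longer matches its pre-boot checksum
    sum /vmunix /sys/MASON2/vmunix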
4. I use the above-mentioned DE500 network card as a private (autobahn)
network for NFS between the two 2100's. I mount /pub, /var/spool/mail,
user space, and apps with rw,hard,intr. I also do NOT mount the NFS
filesystems directly in root; I use symbolic links for things like /pub.
That is, I NFS-mount /pub on /nfs/pub and have a symbolic link /pub
pointing to /nfs/pub on the NFS client. The NFS client system has its
own local /, /var, /usr, and swap (system space). I notice the NFS
client system hangs while the NFS server is unavailable. How do I
determine which process hangs the system and kill it? In general, how do
I determine which processes are hung on NFS? If intr gives me the chance
to kill/interrupt an NFS-related operation, I need to know which process
to kill.
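What I have been doing so far, for lack of anything better (a rough
sketch -- I am assuming the stuck processes show up in an
uninterruptible-wait state in the ps state column, which may be 'D' or
'U' depending on the system):

    # print the header plus any process whose state contains D or U
    # (the column position and state letters may differ slightly here)
    ps aux | awk 'NR == 1 || $8 ~ /D|U/'

    # since the mounts use intr, try interrupting the suspect first
    kill -INT <pid>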
5. Has anyone noticed that 3.2c replaces 'pseudo-device rpty nnn' with
'OPTION RPTY' in the kernel configuration file? I had rpty set to 512
because I have up to 400 concurrent sessions; I think the default was
255. Under 3.2c it appears the limit is 255 again -- my users received
'all network ports in use' after about 255 sessions or so. How do I
change this? I took a guess and modified the running kernel, via kdbx,
setting nptys=512. Is this correct? How do I specify this in the kernel
configuration? The 3.2c BOOKREADER documents don't discuss RPTY or
nptys. YIKES!
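For reference, the configuration-file lines in question, quoted as I see
them (whether and where a count can still be given alongside the new
option is exactly what I am asking):

    # what my 3.2 configuration file had:
    pseudo-device   rpty    512

    # what the 3.2c configuration shows instead (no count):
    OPTION RPTY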
6. I have 16,000+ users on the system and currently have 5 RZ28s mounted
as /usr/u1, /usr/u2, ..., /usr/u5, which contain their home areas. I
plan to RAID 0+1 several RZ29's and make one large /usr/home area. Will
I take a performance hit since the /usr/home directory will be large? Is
it better to have a RAID 0+1 logical drive partitioned into several
(say, 5) partitions, mounted on the /usr/u1, ..., /usr/u5 mount points,
instead of one large RAID 0+1 device? The main question is whether one
large directory would be a bottleneck. Of course, I assume a partitioned
RAID 0+1 device does not spread (i.e., stripe) the load as well as one
large RAID 0+1 device, so it appears I may have a trade-off.
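To make the two layouts I am weighing concrete, here is a sketch in
/etc/fstab terms (the domain and fileset names are made up; only the
AdvFS domain#fileset form is real):

    # Option A: one large RAID 0+1 device, one big home directory
    home_dmn#home   /usr/home   advfs   rw 0 2

    # Option B: the same RAID 0+1 set cut into 5 partitions,
    # keeping the existing /usr/u1 ... /usr/u5 layout
    u1_dmn#u1       /usr/u1     advfs   rw 0 2
    u2_dmn#u2       /usr/u2     advfs   rw 0 2
    ...
    u5_dmn#u5       /usr/u5     advfs   rw 0 2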
7. This question may be related to #6. I have noticed that 'pwd' in
/usr/u1/jblow takes about 10 seconds to complete on the NFS client.
However, 'pwd' in /, /usr, and /usr/u1 (/usr/u1 is NFS-mounted) takes
the usual fraction of a second. I have also noticed that 'pwd' on the
NFS server in /usr/u1/jblow takes a fraction of a second. Is the problem
a large directory that is NFS-exported? Note that 'ls' in /usr/u1/jblow,
on both the NFS client and the server, takes only a fraction of a second
to complete. Where is the bottleneck? Note: /usr/u1 has 3600+ entries
(user directories).
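For anyone who wants to reproduce the measurement, this is all I am
doing (using /bin/pwd rather than the shell built-in, in case that
matters; the timings are the ones quoted above):

    cd /usr/u1/jblow          # an NFS-mounted home directory, on the client
    time /bin/pwd             # ~10 seconds on the client, sub-second on the server
    time ls > /dev/null       # sub-second on both client and server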
8. Is anyone using NIS/C2 with a large user base (5000+)? If so, how
much of a performance hit did you take with NIS on? I currently have
only C2 enabled but want to use NIS in our pseudo-cluster environment
(RAID, DECsafe, HSZ40, NFS, NIS, home-grown software, etc.) -- at
least until DEC provides that capability. If anyone at DEC is reading
this, don't let the engineers working on this take sick and vacation
time. However, permit weekly conjugal visits. ;-)
Thank you for your consideration, time, and patience.
--
Regards,
Richard Jackson George Mason University
Computer Systems Engineer UCIS / ISO
Computer Systems Engineering