On Wed, 2 Jan 2002, Robert Davies wrote:
> serial console would help
<quote type="enemyOfTheState">It's already done</quote>
> especially if you can log the panics into a
> logfile on another machine linked by serial cable.
That's generally the idea. What's more important though is that these
things should happen at night when they won't affect normal activities.
I'm in the office now as it happens, running a few tests, etc.
> I've seen similar sounding problems, when a developer played with modperl on
> a production web server (I was away on holiday)
This is also running Apache with a few modules like mod_perl; however, the
point is, it's a VM bug, isn't it? I'm (excuse language) fucking pissed off
that they haven't sorted this out by now. Sure, I'm hardly one to talk,
because you don't see me doing it, but someone should have done it :-)
> basically the Linux VM folds under really heavy abuse
This I knew. I've turned off swap for tonight's tests to see if that
helps. Problem is getting more than 1.5GB of RAM into a machine which
can't take any more than it's already got :)
> which is why they consider a VM similar to FreeBSD's in 2.5/2.6.
Or just like 2.2. I can run said Java program and let it allocate numerous
GB of memory (said box has 1.5GB RAM and 14.5GB swap - yes I know that's
a little excessive) on its new 2.2.20 based production server just fine.
It doesn't need to run fast as it's only handling dishing out processing
blocks and piecing them back together but when we do a run of the Notting
Hill area then it's nice to actually process all the map data :)
> On those 3Com cards, I've seen problems in past
As have I, discounted as you say, because it was still responsive
initially and also said card was not likely to be under too much strain.
> where an interface drops out, and needs an ifconfig down & up
...like my old firewall in Hall last year, though that was eventually
partially attributed to Nottingham having given me a fucked port...
> But it doesn't sound like that one, from your ssh response.
It's not, IMO.
> Perhaps you'd consider using a remote logging server
That's "in progress".
> and have syslog over net, to get access to more info?
Yes, however this often happens very quickly, so I seriously doubt we'd
get many useful syslogs from it :(
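
For reference, the syslog-over-net suggestion quoted above could look
roughly like the sketch below. It assumes a syslog server is already
listening for UDP on the remote box; the function name, the logger name
"vmdebug" and the default port 514 are placeholders for illustration,
not anything from the actual setup. UDP is fire-and-forget, so messages
sent just before a hang may still be lost.

```python
import logging
import logging.handlers


def make_remote_logger(host, port=514, name="vmdebug"):
    """Return a logger that forwards records to a syslog server over UDP.

    host, port and name are hypothetical; point them at the real log box.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler sends over UDP by default when given a (host, port) tuple.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

A test script would call something like
make_remote_logger("loghost.example").info("starting run") before each
stage, where loghost.example stands in for the remote logging machine.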
> You could perhaps put in some cron jobs, doing things like ps auxwww to
> a log file every minute, and/or use logger(1) in your places scripts to
> record what's going on when the problems are triggered.
I had process accounting running and a few others, and there is a watchdog
too, but none of these really solve the problem :)
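
The per-minute "ps auxwww" snapshot suggested above might be sketched
like this, to be run from cron. The function name and the log path are
made up for illustration. The fsync call is there on the assumption that
the box may panic shortly after the write, so the entry should reach the
disk rather than sit in the page cache.

```python
import os
import subprocess
import time


def snapshot_processes(log_path, command=("ps", "auxwww")):
    """Append a timestamped listing from `command` to log_path.

    Returns the number of characters written.
    """
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    result = subprocess.run(list(command), capture_output=True,
                            text=True, check=False)
    entry = "=== {} ===\n{}\n".format(stamp, result.stdout)
    with open(log_path, "a") as fh:
        fh.write(entry)
        fh.flush()
        # Force the data out to disk in case the machine dies next.
        os.fsync(fh.fileno())
    return len(entry)


if __name__ == "__main__":
    # Hypothetical log location; adjust to taste.
    snapshot_processes("/var/log/ps-snapshots.log")
```

A crontab entry of the form "* * * * * python3 /path/to/script.py" would
run it every minute, with the path being whatever the script is saved as.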
> Have fun, but you know what you say to your lusers when they report
> problems, without copy/pasting error messages or showing you log files :)
Yeah yeah. No useful logs and as yet no successfully captured panic()s.
--jcm
This archive was generated by hypermail 2.1.3 : Thu 03 Jan 2002 - 01:03:28 GMT