I run a nightly backup at 2AM of our Fedora Core 1 server
using Mondo. About once every two weeks the system freezes up. It won’t
process email or serve web pages, and I can’t log into it using SSH. The
console screen is blank. We have to cold-start the server.
The easiest suggestion is to stop using Mondo, but I like
the fast recovery Mondo offers.
So we’re trying to figure out how to diagnose this
problem and which component to check first: memory, the hard drive
(which is new), the IDE controller, etc.
Does anyone have any ideas where to start?
Thanks,
--Bill Wesson
This is my script to run Mondo daily at 2AM:
mkdir -p /home/mondo/`date +%A`
mondoarchive -Oi -d /home/mondo/`date +%A` -E "/home/mondo"
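One way the script above could leave a trace of how far it got is to wrap the backup command so that a start and end marker are written to a log and flushed to disk. This is only a sketch: the function name `run_logged` and the log path are hypothetical, and calling `sync` makes it more likely, not certain, that the last entry survives a hard freeze.

```shell
#!/bin/sh
# Hypothetical wrapper; run_logged and the log path are illustrative names.
LOG="${BACKUP_LOG:-/var/log/mondo-nightly.log}"

run_logged() {
    # $1 = log file, remaining arguments = command to run
    log="$1"; shift
    echo "$(date '+%b %d %H:%M:%S') START: $*" >> "$log"
    sync    # flush so the START marker is on disk before the command runs
    status=0
    "$@" || status=$?
    echo "$(date '+%b %d %H:%M:%S') END (exit $status): $*" >> "$log"
    sync
    return "$status"
}

# Intended use with the same commands as the nightly script (commented out here):
# DEST="/home/mondo/$(date +%A)"
# mkdir -p "$DEST"
# run_logged "$LOG" mondoarchive -Oi -d "$DEST" -E "/home/mondo"
```

A START line with no matching END line after a freeze would suggest the hang happened while the backup was still running.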
Log files don’t provide any clues.
CRON log for Sunday-2AM
Oct 24 01:50:00 payson CROND[24793]: (root) CMD (/usr/local/bin/weblogs)
Oct 24 02:00:00 payson CROND[25052]: (root) CMD (run-parts /etc/cron.daily-2am)
Oct 24 02:00:00 payson CROND[25056]: (root) CMD (nice --adjustment=15 /usr/local/sbin/update_site_summary_cache)
Oct 24 02:00:00 payson CROND[25054]: (root) CMD (/usr/local/bin/weblogs)
Oct 24 02:01:00 payson CROND[25961]: (root) CMD (run-parts /etc/cron.hourly)
Oct 24 02:10:00 payson CROND[7866]: (root) CMD (/usr/local/bin/weblogs)
CRON log for Monday-2AM
Oct 25 01:50:00 payson CROND[29959]: (root) CMD (/usr/local/bin/weblogs)
Oct 25 02:00:00 payson CROND[30059]: (root) CMD (run-parts /etc/cron.daily-2am)
Oct 25 02:00:00 payson CROND[30063]: (root) CMD (nice --adjustment=15 /usr/local/sbin/update_site_summary_cache)
Oct 25 02:00:00 payson CROND[30061]: (root) CMD (/usr/local/bin/weblogs)
Oct 25 02:01:00 payson CROND[30953]: (root) CMD (run-parts /etc/cron.hourly)
Oct 25 07:28:59 payson crond[2982]: (CRON) STARTUP (fork ok)
MESSAGES log for Monday-2AM
Oct 25 01:50:08 payson logger: weblogs: (29959) done.
Oct 25 02:00:00 payson logger: weblogs: (30061) starting.
Oct 25 02:00:01 payson autofs: automount shutdown succeeded
Oct 25 02:00:09 payson logger: weblogs: (30061) done.
Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 68.157.222.155#53
Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 207.65.0.25#53
Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 68.157.222.155#53
Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 207.65.0.25#53
Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 68.157.222.155#53
Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 207.65.0.25#53
Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 68.157.222.155#53
Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 207.65.0.25#53
Oct 25 07:27:44 payson syslogd 1.4.1: restart.
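The messages log is silent between the last entry at 02:04:03 and the syslogd restart at 07:27:44. A possible way to narrow that gap is to append a small timestamped snapshot to a file every minute around the backup window. The sketch below is an assumption, not something already running on this server: the function name `snapshot` and the log path are hypothetical, and it would need to be called from cron or a loop.

```shell
#!/bin/sh
# Hypothetical helper; the function name and log path are illustrative.
snapshot() {
    # $1 = file to append the snapshot to
    {
        echo "=== $(date '+%b %d %H:%M:%S') ==="
        # uptime and free come from procps on Linux; skip quietly if missing
        if command -v uptime >/dev/null 2>&1; then uptime; fi
        if command -v free >/dev/null 2>&1; then free; fi
    } >> "$1"
    sync    # flush so the entry is more likely to survive a hard freeze
}

# Intended use (commented out here):
# snapshot /var/log/freeze-watch.log
```

The timestamp of the last snapshot in the file would then give a much tighter bound on when the machine stopped responding.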
1AM - Monday
             total       used       free     shared    buffers     cached
Mem:        773208     713100      60108          0      66540     522432
-/+ buffers/cache:     124128     649080
Swap:      2048248      90048    1958200
2AM - Monday
             total       used       free     shared    buffers     cached
Mem:        773208     768768       4440          0     102908     491276
-/+ buffers/cache:     174584     598624
Swap:      2048248      90048    1958200
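For anyone reading the numbers above: the "-/+ buffers/cache" line is derived from the "Mem:" row by taking buffers and cache out of "used" and adding them to "free". The short sketch below redoes that arithmetic for the 2AM snapshot; the variable names are illustrative. It only shows how the figures relate and is not meant as a diagnosis.

```shell
#!/bin/sh
# Values copied from the 2AM "Mem:" row above, in kB.
used=768768
free=4440
buffers=102908
cached=491276

# Memory held by applications, excluding buffers and page cache
app_used=$((used - buffers - cached))
# Memory that is free or reclaimable from buffers and page cache
reclaimable=$((free + buffers + cached))

echo "used by applications: ${app_used} kB"
echo "free plus buffers/cache: ${reclaimable} kB"
```

Both results match the "-/+ buffers/cache" line in the snapshot (174584 and 598624), and swap usage is unchanged between 1AM and 2AM.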
Thanks,
Bill Wesson, Network Administrator
Vision Engraving Systems
http://www.visionengravers.com
602-439-0600