WYAE - MoSSHe List Archive
Subject: Re: mosshe 1.2.1
From: Volker Tanger
Date: Wed, 10 Nov 2004 19:39:32 +0100
Greetings!
On Wed, 10 Nov 2004 09:25:51 -0300
Eduardo Grosclaude wrote:
> > > edited this to "fgrep -v grep" and then had to get rid of
> > > "restarted" messages by commenting out the mail sentence.
> >
> > Well, occasional "restarted" messages (whenever MOSSHE failed) are
> > intended to give an admin an indication that there has been
> > something wrong with MOSSHE so he can investigate. Or did I
> > misunderstand something in your comment?
> I was looking at it in a very different way. I presume the regular
> thing should be that the system is at rest whenever cron_mosshe is
> called to run.
No. The mosshe script itself contains an endless loop that runs through
all servers, waits the configured time (5 minutes by default) and then
starts over. This way there is no fixed check interval but a varying
one that grows with each added server. On the other hand, the MoSSHe
server cannot be overloaded this way, as only one process/check is
running at any given time.
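As an illustration of the loop described above, here is a minimal sketch. It is not the actual mosshe source: SERVERS, PAUSE, check_server and run_rounds are hypothetical names, and the optional round limit exists only so the sketch can be tried without running forever.

```shell
#!/bin/sh
# Sketch of a sequential monitoring loop; NOT the real mosshe script.
# SERVERS, PAUSE, check_server and run_rounds are hypothetical names.

SERVERS="alpha beta gamma"   # hypothetical host list
PAUSE=${PAUSE:-300}          # seconds to wait between rounds (5 minutes)

check_server () {
    # Placeholder for the real per-server checks.
    echo "checking $1"
}

run_rounds () {
    # Run $1 rounds; with no argument, loop forever.
    local rounds n
    rounds=$1
    n=0
    while [ -z "$rounds" ] || [ "$n" -lt "$rounds" ]; do
        for server in $SERVERS; do
            check_server "$server"   # strictly one check at a time
        done
        n=$((n + 1))
        sleep "$PAUSE"               # the pause is added on top of the check time
    done
}
```

Calling run_rounds without an argument gives the endless variant. Because the pause starts only after all servers have been checked, the time between two checks of the same server is the pause plus the duration of one full round, which is why it grows with every added server.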
The cron_mosshe script only checks whether this main process is still
running. If so, it exits without doing anything. If not (i.e. MoSSHe
broke down), it restarts the main process (loop) and sends the
"was restarted" mail, which indicates a problem.
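The watchdog behaviour described above could look roughly like the following sketch. It is not the actual cron_mosshe source: is_running and watchdog are hypothetical helpers, and the restart and notify commands are passed in as arguments rather than hard-coded.

```shell
#!/bin/sh
# Sketch of a cron-driven watchdog; NOT the real cron_mosshe script.
# is_running and watchdog are hypothetical helper names.

is_running () {
    # True if some process command line contains the pattern $1.
    # The "grep -v grep" stage keeps the grep process itself out of the match.
    ps ax | grep -F -- "$1" | grep -v grep >/dev/null
}

watchdog () {
    # $1: command that succeeds while the main loop is alive
    # $2: command that restarts the main loop
    # $3: command that notifies the admin
    if $1; then
        return 0        # main loop alive: exit quietly
    fi
    $2                  # main loop gone: start it again
    $3                  # and report that a restart was needed
}
```

A call such as watchdog "is_running mosshe" with suitable restart and mail commands stays silent as long as the main loop is alive, so a "was restarted" mail only ever arrives after a failure.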
> Another point I had forgotten. mosshe spawns from mosshe's own home
> but webinterface (with Apache running as user apache) needs at least
> r-x permissions on mosshe's home. Being somewhat sloppy on the
> security side, I just enabled r-x rights to everybody on /home/mosshe.
> Another way, only slightly more restrictive, would be chgrp apache
> /home/mosshe; chmod g+rx /home/mosshe. This is anyway a departure from
> the "safer" 0700-mode home directory. Any other solution, short of
> enabling "~user"-like directories on Apache?
Rrrright, something to document. I'm running the webinterface in a
different directory (e.g. /var/www/mosshewebinterface/), so not overly
many security problems with respect to direct web access.
You are right about the access rights, but I was assuming my own setup,
where the MoSSHe system runs on a server without interactive
users. So access rights are not as critical...
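For reference, the effect of the group-based variant from the quoted mail can be tried on a scratch directory. This is only a sketch: the scratch directory stands in for /home/mosshe, and the chgrp step is left as a comment because it needs an existing apache group and suitable privileges.

```shell
#!/bin/sh
# Sketch only: shows the mode bits on a scratch directory instead of
# touching /home/mosshe. Variable names are illustrative.

dir=$(mktemp -d)                 # scratch stand-in for /home/mosshe
chmod 700 "$dir"                 # the "safer" private home directory
before=$(ls -ld "$dir" | cut -c1-10)

# chgrp apache "$dir"            # would hand the group slot to Apache's group
chmod g+rx "$dir"                # group may now list and enter the directory
after=$(ls -ld "$dir" | cut -c1-10)

echo "$before -> $after"
rmdir "$dir"
```

Compared with opening the directory to everybody, only members of the directory's group gain access, and "other" users still get nothing.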
> > > 4. An error
> > > related to $SCRIPT variable appears at the end of manual mosshe
> > > execution (probably because I copied Alerts/* to a new home, which
> > > leaves out a dot-file I hadn't noticed before? How about some
> > > explanation about these hidden files in docs)
> > Sorry, can't reproduce this. The only place I use $SCRIPT is the
> > mail MOSSHE script. Anyway moving the CHECKS directory will probably
> > break the system at more than one point... :-(
> I'm still getting this :
>
> ------------------------------------------
> Evaluating findings - sending alerts:
> ./mosshe: line 106: `$SCRIPT': not a valid identifier
Huh? I'll dig into this.
Thanks for developing the MBMon interface!
Bye
Volker
PS - I CC:ed this mail to the mosshe@wyae.de mailing list.
> ------------------------------------------------------------------
> OK, this seems to be working! :) I am using an amazing little program
> called xmbmon/mbmon (http://freshmeat.net/projects/xmbmon) which comes
> with lots of hot facts about hardware.
>
>
> Setup on monitor
> 1. Edit servers.conf to perform a ssh check on your clients.
>
> Setup on clients
> 1. Download mbmon, READ THE DOCUMENTATION, this program can be
> dangerous to your hardware. Build, install as recommended. The
> information returned by mbmon means different things depending on your
> mobo and the way it is installed. You must understand your hardware
> setup and customize this script. Mbmon should be installed suid root.
>
> 2. localcheck should look like this:
>
> # MBMonCheck PARAM warnvalue alertvalue comment
> # PARAM is [TEMP0|TEMP1|TEMP2|FAN0|FAN1|FAN2|VC0|...]
> # warnvalue, alertvalue MUST be same type as value
> # i.e. decimal point for temps, integer for fan speed...
> MBMonCheck TEMP0 36.0 40.0 "Inlet temp"
> MBMonCheck TEMP1 40.0 46.0 "CPU temp"
> MBMonCheck FAN1 1300 1300 "CPU FAN"
>
> 3. localcheck.functions should include:
>
> #---------------------------------------------------------
> # Check for CPU temp / fan speed / MB voltage...
> # Uses mbmon program (http://freshmeat.net/projects/xmbmon)
> #---------------------------------------------------------
> function MBMonCheck () {
> PARAM=$1 # TEMP0, TEMP1...
> SWARNVAL=$2 # 38.0
> SALERTVAL=$3 # 40.0
> DESCRIP=$4 # "CPU temp"
>
>
> SVALUE=`/usr/local/bin/mbmon -c 1 -r | grep "$PARAM" | sed -e "s/$PARAM *: *//"`
> # expr prints the match position; discard it so only the result line is output
> if expr index "$SVALUE" . >/dev/null ; then
> typeset -i VALUE=${SVALUE/./}
> typeset -i ALERTVAL=${SALERTVAL/./}
> typeset -i WARNVAL=${SWARNVAL/./}
> else
> typeset -i VALUE=${SVALUE}
> typeset -i ALERTVAL=$SALERTVAL
> typeset -i WARNVAL=$SWARNVAL
> fi
>
>
> if [ "$VALUE" -gt $ALERTVAL ]; then
> STATUS="ALERT"
> MESSAGE="$DESCRIP: $SVALUE exceeds ALERT value $SALERTVAL"
> elif [ "$VALUE" -gt $WARNVAL ]; then
> STATUS="WARN"
> MESSAGE="$DESCRIP: $SVALUE exceeds WARN value $SWARNVAL"
> else
> STATUS="OK"
> MESSAGE="$DESCRIP: $SVALUE in range"
> fi
> echo "${DATIM};$MYNAME;$PARAM;$STATUS;$SVALUE;$MESSAGE"
> }
>
> # end of MBMonCheck
> #-------------------------------------------------------
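The function above compares decimals by deleting the decimal point, which is why thresholds and readings must be written with the same number of decimals. As a possible alternative, the following sketch normalizes every value to tenths first, so that "40" and "40.0" compare as equal. to_tenths and exceeds are hypothetical helpers, not part of MoSSHe, and the sketch assumes non-negative decimal numbers.

```shell
#!/bin/sh
# Sketch of a fixed-point comparison; to_tenths and exceeds are
# hypothetical helpers, not part of MoSSHe or mbmon.

to_tenths () {
    # Print $1 as an integer count of tenths: 36 -> 360, 36.0 -> 360, 36.54 -> 365.
    # Assumes a non-negative decimal number; digits beyond the first decimal are dropped.
    local whole frac
    case $1 in
        *.*) whole=${1%%.*}; frac=${1#*.} ;;
        *)   whole=$1;       frac=0 ;;
    esac
    whole=$(printf '%s' "$whole" | sed 's/^0*//')   # avoid octal interpretation
    frac=$(printf '%s' "$frac" | cut -c1)           # keep the first decimal only
    echo $(( ${whole:-0} * 10 + ${frac:-0} ))
}

exceeds () {
    # True if value $1 is strictly greater than limit $2.
    [ "$(to_tenths "$1")" -gt "$(to_tenths "$2")" ]
}
```

With these helpers, a test like exceeds "$SVALUE" "$SALERTVAL" would work for temperatures and fan speeds alike, at the cost of one decimal of precision.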
--
Volker Tanger volker.tanger@wyae.de
-===================================-
Research & Development Division, WYAE