-----------------------------------------------------------------------
MoSSHe v10.6.22
2003- by Volker Tanger
-----------------------------------------------------------------------

MoSSHe (MOnitoring in Simple SHell Environment) is a simple,
lightweight (both in size and system requirements) server monitoring
package designed for secure and in-depth monitoring of a handful of
typical/critical internet systems. For example, in a current setup I do
219 checks on 2 servers monitoring 6 systems - on average 36 checks for
each system.

MoSSHe supports email alerts and SLA monitoring out of the box - and
whatever you can script.

The system is programmed in plain (Bourne) SH and is meant to be
compatible with BASH and Busybox, so it can easily be deployed on
embedded systems.

Monitoring is designed to be distributed over multiple systems, usually
running locally. As no parameters are accepted from outside, checks
cannot be tampered with or misused from outside.

The system is designed to allow decentralized checks and evaluation as
well as classical agent-based checks with centralized data
accumulation. Agent data is transferred via HTTP, so available web
servers can be co-used for agent data transfer.

Additionally, each agent creates simple (static) HTML pages with full
and condensed status reports on each system, allowing simple local
checks.

Requirements for MoSSHe:
    * Unix Shell (Bourne-SH, BASH, Busybox)
    * standard Unix text tools (fgrep, cut, head, mail, time, date, ...)
    * "netcat" networking tool

for single checks, only if performed:
    * "dig" for DNS check
    * "free" memory display for memory check
    * "lpq" BSD(compatible) printing for printing check
    * "lynx" web browser for HTTP check
    * "mailq" if running the mail queue check
    * "mbmon" motherboard check for temp/fan check
    * "smbclient" for samba check
    * "snmp" networking tools (especially "snmpget") for SNMP check
    * /proc/mdstat for Linux MD0 SoftRAID checks

for web interface:
    * webserver

Hardware requirements: A difficult question.
As the checks are run and evaluated locally on each system, it is
nearly impossible to "overload" the server, as is the case with other
monitoring systems. The system is a shell script, so there are no big
components here, either. For a webserver any HTTPD is fine. No database
is needed - everything is plain text.

KNOWN ISSUES:
    - (please tell me)

FEATURE WISH LIST / ROADMAP:
    * RRD graphs for selected/all checks - for trend analysis
    * local checks:
        - check number of users (via "w" command)
    * network checks:
        - MySQL checks
    * alerting:
        - IM alerts via jabber
    * webviews:
        - RRD graphs
        - evaluation of SLA reports, giving total downtimes and
          average availability
        - grouping of servers (for large environments)

Updates will be available at
    http://www.wyae.de/software/mosshe/
Please check there for updates prior to submitting patches!

There is a user/developer mailing list available. To subscribe, send a
mail with "subscribe mosshe" as subject to
    minimalist@wyae.de

For bug reports and suggestions, or if you just want to talk to me,
please contact me at
    volker.tanger@wyae.de

-----------------------------------------------------------------------
Monitoring server Setup
-----------------------------------------------------------------------

Get and unzip the archive - usually in /usr/local/lib/mosshe.
Edit the MOSSHE file and set the environment

    MYNAME      name of this server
    WWWDIR      where the HTML reports and status file are saved to
    DATADIR     location of MOSSHE scripts (/usr/local/lib/mosshe)
    TEMPDIR     for temporary files (default: /tmp)

In the MOSSHE shell script file you can now configure the checks to be
run - usually you can set warning and alert trigger levels.

#=========================================================
#       Local Checks
#=========================================================

    HDCheck                 minimum free space on a filesystem
    LoadCheck               maximum load of a system
    MemCheck                minimum free RAM
    ProcessCheck            maximum processes running
    ZombieCheck             maximum zombie processes
    ShellCheck              maximum shells for root / other users
    NetworkErrorsCheck      percentage of errors on interface
    NetworkTrafficCheck     maximum kbit/s network throughput
    FileCheck               check file existing (check PIDs or named
                            pipes)
    ProcCheck               check for process existing
    MailqCheck              maximum number of mails in queue
    PrintCheck              maximum number of print jobs in queue
    MBMonCheck              Motherboard-checks: maximum temperature,
                            fan speeds
    RaidCheck               checks md0 RAID (WARN=syncing, ALERT=fail)
    LogEntryCheck           maximum number of message matches in
                            logfiles (used to check for bruteforcing,
                            see examples in MOSSHE)
    CheckFileChanges        compare current file to known-good copy
    CheckConfigChanges      compare config (command) to known-good copy

#=========================================================
#       Network Checks
#=========================================================

    PingPartner             maximum ping loss and avg. roundtrip
    SAMBA                   checks for Microsoft file server
                            (SMB/CIFS/Samba)
    HTTPheader              http server return code
    HTTPheadermatch         checks for named return code
                            (usually 302-Moved)
    HTTPcontentmatch        check for web site content
    FTPcheck                checks for FTP service
    POP3check               checks for POP3 service
    IMAPcheck               checks for IMAP service
    SMTPcheck               checks for SMTP mail service
    RBLcheckIP              checks whether an IP address is listed
                            on RBL
    RBLcheckFQDN            checks whether a named system is listed
                            on RBL
    DNSquery                checks whether a DNS response is given
    DNSmatch                checks a DNS response against expected
                            value

#=========================================================
#       Import agent data from other servers
#=========================================================

With this function you can establish one or multiple central servers by
including the data from other MoSSHe agents into the local one. Just be
careful not to do circular inclusions or your logfile size might
explode!

    ImportAgent             URL to the index.csv file, which can
                            include username and password as in
        http://user:passwd@remote.server.test/mosshe/index.csv

#=========================================================
#       Alerting, Logging
#=========================================================

    LogTo                   write full log to given filename
    LogToDaily              as above, but writes to file with date
                            appended
    AlertMailAlways         send alert whenever a service IS down
    AlertMailOnChange       send alert mail only if something changed
                            (whenever a service GOES up or down)
    AlertMailOnChangeFor    as above, but with pattern matching, e.g.
                            on server name to distinguish between
                            admins
    SLA_Eval                builds log extract for downtime
                            documentation

-----------------------------------------------------------------------
Distributed monitoring via clients (monitored system) Setup
-----------------------------------------------------------------------

It's easy: Select one system to be your central server.
Import monitoring agent data from the "slave" servers using the
ImportAgent command (see above) in its mosshe file. Done. ;-)

-----------------------------------------------------------------------
Usage
-----------------------------------------------------------------------

Adapt the "mosshe" script. Place the CRON.D_MOSSHE file into
/etc/cron.d or adapt it accordingly so mosshe is called periodically.

Via the web interface you can view the overall status - full and
abbreviated status. But you cannot modify anything - which makes it
quite safe even for non-admin multiuser use... ;-)

-----------------------------------------------------------------------
Known/common Problems and Maintenance
-----------------------------------------------------------------------

(none yet)

-----------------------------------------------------------------------
Customizing Checks & Writing your own
-----------------------------------------------------------------------

Writing your own:

A check must terminate within a given (short) timeframe regardless of
circumstances - so make sure there are timeouts built in or configured.
If not, your complete MoSSHe might hang when this check stops.

Scripts (better: shell functions) must write a status line to
    $TEMPDIR/tmp.$$.collected.tmp

A check *must* give back the results (and only these) to STDIO in the
form shown below. Only ONE LINE PER STATUS is allowed. A check script
can test any number of statuses, though. The SSH check, for example,
usually tests about a dozen parameters in one command.

    date ; time ; systemname ; check ; status ; numeric ; long

    DATE        in ISO format: yyyy-mm-dd
                with yyyy = 4digit year, mm = 2digit month,
                dd = 2digit day
    TIME        HH:MM:SS - 24hour time, all 2digit
                this is the time local to the MoSSHe server for all
                PING and service checks, but local time of the server
                checked when using imported checks
    SYSTEMNAME  Host name, FQDN or IP address of the system as
                configured in mosshe
    CHECK       (short) name of the check.
    STATUS      any status of: OK, WARN, ALERT, UNDEF
    NUMERIC     the numeric value of the test, e.g. LOAD number, free
                megabytes, etc. It must be a valid FLOAT or INT number
                to be displayed nicely.
    LONG        A longer text with details on the status. It should be
                short enough to fit into one line of the web display,
                though.

Here is an example of the output of a number of checks - the first
6 checks after PING are all from a single LOCALCHECK script, btw.

    2004-07-23;23:55:32;kali;ping;OK;1;host up
    2004-07-23;23:55:32;kali;/dev/hda1;OK;4054;Disk free
    2004-07-23;23:55:32;kali;/dev/hda2;OK;1395;Disk free
    2004-07-23;23:55:32;kali;/dev/hdb3;OK;2817;Disk free
    2004-07-23;23:55:32;kali;load;OK;0.80;Load: 0.80
    2004-07-23;23:55:32;kali;processes;OK;76;Total processes: 76
    2004-07-23;23:55:32;kali;zombies;OK;0;Zombie processes: 0 = ok
    2004-07-23;23:55:34;hermes;ping;OK;1;host up

Please keep in mind that MoSSHe is designed to be lean, small and
efficient. Thus, having to install a JSP/EJB server just to run one
single check is usually not considered adequate. Small, simple, secure
- that's the way we should go.

If you have a nice (free) check that could be of use to other people,
please send it to me so I can include it in the distribution.

-----------------------------------------------------------------------
Shortcut: Distributable under GPL
-----------------------------------------------------------------------

Copyright (C) 2003- Volker Tanger

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA, or see their
website at
    http://www.gnu.org/copyleft/gpl.html

-----------------------------------------------------------------------
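
Appendix: sketch of a custom check
-----------------------------------------------------------------------

To illustrate the status line format described under "Customizing
Checks & Writing your own", here is a minimal sketch of a user count
check (one of the wish list items). It is not part of the MoSSHe
distribution: the function names UserCountCheck, format_status and
classify_max, the use of "who" instead of "w", and the default values
for MYNAME and TEMPDIR are assumptions made for this example only.

```shell
#!/bin/sh
# Hypothetical custom check - NOT shipped with MoSSHe.
# All function names below are illustrative assumptions.

MYNAME="${MYNAME:-localhost}"   # normally set in the MOSSHE file
TEMPDIR="${TEMPDIR:-/tmp}"      # normally set in the MOSSHE file

# Print one status line in the form
#   date;time;systemname;check;status;numeric;long
# $1 = check name, $2 = status, $3 = numeric value, $4 = long text
format_status() {
    echo "$(date +%Y-%m-%d);$(date +%H:%M:%S);$MYNAME;$1;$2;$3;$4"
}

# Map an integer value to OK / WARN / ALERT for a "maximum" style check.
# $1 = value, $2 = warning limit, $3 = alert limit
classify_max() {
    if [ "$1" -ge "$3" ]; then
        echo ALERT
    elif [ "$1" -ge "$2" ]; then
        echo WARN
    else
        echo OK
    fi
}

# Count logged-in users and append exactly one status line to the
# collection file. "who" returns quickly, so no extra timeout is used.
# $1 = warning limit, $2 = alert limit
UserCountCheck() {
    users=$(who | wc -l | tr -d ' ')
    status=$(classify_max "$users" "$1" "$2")
    format_status users "$status" "$users" "Logged-in users: $users" \
        >> "$TEMPDIR/tmp.$$.collected.tmp"
}
```

Assuming the functions are defined in (or sourced by) the MOSSHE
script, a call such as "UserCountCheck 5 10" would warn at 5 and alert
at 10 logged-in users. Keeping the formatting and the threshold logic
in separate small functions makes both easy to reuse in other checks.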