Nms current design
Revision as of 13:07, 14 July 2023
Design Information
All systems are deliberately being designed on UNDERPOWERED machines.
- Frontend + Backend + Daemons running on a 2 core, 2GB RAM, 100GB Drive KVM Guest
- As of 07-15-23, ~80 hosts and several hundred checks use about 300MB of RAM with little to no IO wait
- CPU use is about 4.5% userland and 3.5% system
Backend
The backend is PHP 7.X using the Slim4 framework. Composer is used for additional functionality; however, most of the installed packages are not actually being used. These will have to be removed, leaving only the core packages needed, before the code is posted publicly.
- This is going to be migrated to PHP 8.3 as soon as it is reasonable to test
Frontend
The frontend is Bootstrap5 and PHP 7.x. No database connections will EVER be allowed from the frontend. Anything to do with the DB must use an API.
Database
Currently using MySQL 5.7. This is a deliberate choice to follow the KISS principle: the application does not depend on new features in one version that would make it mandatory to upgrade to version X every time there is an upgrade. It also makes it easier to transfer between MySQL, MariaDB, and Postgres, or really ANY database that follows a SQL format.
- Old Ubuntu 12.10 host: 4 core, 1GB RAM, 40GB Drive, KVM Guest (multiple other services running on this as well)
Metrics
- RRD files are stored locally on the API server. The API will generate and display the images for the frontend to consume. Raw data is NEVER sent to the frontend for rendering unless there is no other option. JavaScript will be the death of me. I suck at it so badly, and it really seems to make the site SO MUCH SLOWER to have client side work done for complex things.
- Graphite is running on a 4 core, 4GB RAM, 100GB drive KVM Guest with other apps running on it. Initially my thought was to use Graphite as the main graph engine, but thinking about small business, as well as being able to do future predictions, I felt RRD was the best way to start this, with Graphite supported but not the default graph engine.
- The Graphite support I have in place currently is going to have to be redone in the "template" style so that ad hoc stuff can be more easily added in the future. Right now it is a series of regexes and database hacks that I am not happy with.
Templates
These all live on the API host within the templates directory. These are going to have to be expanded much further. Additionally, the "templates" table is going to have to be leveraged more going forward. Currently templates exist to push metric data to RRD, Graphite, and the database. Additional templates are in place for rendering RRD files. I prefer this route so that we can change templates more easily going forward.
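To make the template idea concrete, here is a minimal sketch of one metric being rendered for two backends. It is written in Python purely for illustration (the application itself is PHP), and the template names, field names, and the `nms` prefix are all assumptions, not the layout of the real templates directory. The two output shapes are the standard ones: Graphite's plaintext protocol takes `path value timestamp` per line, and `rrdtool update` takes a `timestamp:value` argument.

```python
import string

# Hypothetical templates keyed by backend name. The real templates
# directory and "templates" table may be organized quite differently.
TEMPLATES = {
    # Graphite plaintext protocol: "<path> <value> <timestamp>\n"
    "graphite": string.Template("$prefix.$host.$metric $value $timestamp\n"),
    # Argument passed to "rrdtool update <file>": "<timestamp>:<value>"
    "rrd": string.Template("$timestamp:$value"),
}


def render(backend, **fields):
    """Fill in the template for one backend with the given metric fields."""
    return TEMPLATES[backend].substitute(**fields)


if __name__ == "__main__":
    sample = dict(prefix="nms", host="sw1", metric="ifInOctets",
                  value=42, timestamp=1689340020)
    print(render("graphite", **sample), end="")
    print(render("rrd", **sample))
```

Adding another backend in this style would only mean adding another entry, which is the kind of change the template route is meant to keep cheap.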
Daemons
Daemon controls have all been written in PHP. This is likely not the best language to write a daemon in, but I wanted consistency with what the rest of the application is written in. If I ever get additional support, I will likely have this rewritten in Python.
SNMPTRAP
snmptrapd forwards all SNMP traps to a PHP script, which parses them and forwards the resulting events into the system.
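As a rough illustration of the parsing step, the sketch below reads the text that snmptrapd writes to a traphandle program's standard input: the sender's hostname on the first line, the transport address on the second, then one varbind per line with the OID up to the first space and the value after it. It is Python for illustration only (the real handler is PHP), the function name `parse_trap` is hypothetical, and the sample trap is made up.

```python
def parse_trap(text):
    """Parse traphandle input into host, transport, and (oid, value) pairs."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 2:
        raise ValueError("expected at least hostname and transport lines")
    host, transport = lines[0], lines[1]
    varbinds = []
    for ln in lines[2:]:
        # OID is everything up to the first space; the rest is the value.
        oid, _, value = ln.partition(" ")
        varbinds.append((oid, value.strip()))
    return {"host": host, "transport": transport, "varbinds": varbinds}


if __name__ == "__main__":
    # Made-up sample input, not captured from a real device.
    sample = (
        "sw1.example.com\n"
        "UDP: [192.0.2.10]:161->[192.0.2.1]:162\n"
        "SNMPv2-MIB::sysUpTime.0 0:1:02:03.45\n"
        "SNMPv2-MIB::snmpTrapOID.0 IF-MIB::linkDown\n"
    )
    print(parse_trap(sample))
```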
NRPE
Leverages the Nagios plugin system for ad hoc or external checks to add functionality. It currently supports basic metric ingestion into RRD, but does not have a good rendering tie-in yet.
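For context on what "basic metric ingestion" has to deal with, here is a sketch of pulling metrics out of a Nagios plugin result. Plugins report state through their exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and may append performance data after a `|` as `label=value[UOM];warn;crit;min;max`. This is Python for illustration, not the PHP the system uses; `parse_plugin_output` is a hypothetical name, and only the first output line is handled.

```python
import re
import shlex

STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}
# Leading number, then whatever unit of measure follows it (may be empty).
VALUE_RE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")


def parse_plugin_output(line, exit_code):
    """Split one plugin output line into state, text, and perfdata metrics."""
    text, _, perf = line.partition("|")
    metrics = {}
    # shlex handles single-quoted labels that contain spaces.
    for token in shlex.split(perf):
        label, _, rest = token.partition("=")
        match = VALUE_RE.match(rest.split(";")[0])
        if match:
            metrics[label] = (float(match.group(1)), match.group(2))
    return {
        "state": STATES.get(exit_code, "UNKNOWN"),
        "text": text.strip(),
        "metrics": metrics,
    }


if __name__ == "__main__":
    out = "DISK OK - free space: / 3326 MB (56%) | /=2643MB;5948;5958;0;5968"
    print(parse_plugin_output(out, 0))
```

Each `(value, unit)` pair is the piece that would then be handed on to an RRD update.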
SNMP
Currently using the net-snmp package via shell exec. I wanted more control over how SNMP responded than I could get with SNMP classes or Composer-installed support applications. Also, I FIRMLY believe that if a machine is doing something, a human MUST be able to duplicate the behavior. Using shell commands makes this easier to work with.
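The "a human must be able to duplicate it" point can be sketched as follows: build the net-snmp command as an argument list, log the exact command line so it can be pasted into a terminal, then run it. This is Python for illustration rather than the PHP shell exec the system uses; the function names, the `public` community, and the target address are assumptions. The flags are standard net-snmp ones (`-v` version, `-c` community, `-Oqv` to print only the value).

```python
import shlex
import subprocess


def build_snmpget(host, community, oid, version="2c"):
    """Return the snmpget argument list for a single OID."""
    return ["snmpget", "-v", version, "-c", community, "-Oqv", host, oid]


def snmp_get(host, community, oid, timeout=10):
    """Run snmpget and return its output; requires net-snmp to be installed."""
    cmd = build_snmpget(host, community, oid)
    # Log the exact command so a person can rerun it by hand.
    print("running:", shlex.join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True,
                            timeout=timeout, check=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Only shows the command; nothing is sent to the placeholder address.
    print(shlex.join(build_snmpget("192.0.2.10", "public",
                                   "SNMPv2-MIB::sysName.0")))
```

Passing the arguments as a list rather than one shell string also avoids quoting problems when a community string or OID contains odd characters.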