Today I discovered a “ghost” client showing up in my Xymon systems monitoring software. I’ve been lucky enough not to have to deal with this issue in the 5+ years I have been administering and operating Xymon, so I got frustrated when I was still unable to fix it after 10 minutes of troubleshooting. Some Googling was in order.
A little background on Xymon. Xymon is a systems monitoring tool that can be installed from the Ubuntu package repositories, and it has a long and storied history. Its lineage goes back to Big Brother, and over the years there have been several ports, add-ons, and open source releases derived from that code base under names such as Hobbit, bbgen toolkit, BBWin, DevMon, hobbit-perl-cl, and of course Xymon. There are definitely others, but those are the ones I can quickly recall from memory.
Xymon monitors servers and the services running on them, keeps track of CPU usage and available disk space, and keeps a historical log of issues as they arise and are cleared. Thresholds are set so you can be alerted if disk space runs low, a web server goes dark, or, god forbid, a server becomes inaccessible. Xymon has robust methods of alerting responsible persons: cell phone calls, SMS/RCS messaging, pager calls (remember those?), emails, and just about any modern method of getting a message out. In the worst cases, an alert might make its way to your boss.
So, a “ghost” client. A ghost client is a client that shows up on Xymon’s ghost client page. It denotes a machine that is sending status messages to the Xymon server but is not a host you have configured Xymon to monitor. In Xymon there is a file known as “/etc/xymon/hosts.cfg”. It works similarly to an “/etc/hosts” file, which maps machine names to IP addresses. The /etc/xymon/hosts.cfg file performs a similar task, but it also sets the suite of tests a host is to be monitored for, the services or servers you specifically do not want monitored, and other details such as client nicknames and options relating to DNS and client services.
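To make the definition concrete, here is a minimal Python sketch of the idea, not how Xymon itself implements it. It assumes host entries in hosts.cfg take the usual form of an IPv4 address, a hostname, and optional tags after a “#”. The names parse_hosts_cfg and is_ghost, the sample hosts, and the exact-match comparison are all my own for illustration.

```python
import re

IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")

def parse_hosts_cfg(text):
    """Return {hostname: ip} for the host entries found in hosts.cfg-style text."""
    hosts = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        # Everything after the first '#' is the tag list; ignore it here.
        fields = line.split("#", 1)[0].split()
        # Host entries start with an IP address followed by the hostname;
        # layout lines such as 'page' or 'group' do not, so they are skipped.
        if len(fields) >= 2 and IPV4.fullmatch(fields[0]):
            hosts[fields[1]] = fields[0]
    return hosts

def is_ghost(reporting_name, hosts):
    """A ghost is a name that reports in but has no entry in hosts.cfg.
    Assumption: a simple exact-name comparison is good enough for a sketch."""
    return reporting_name not in hosts

# Hypothetical hosts.cfg content, for illustration only.
SAMPLE = """\
page servers Servers
group Web
192.168.1.10  web01.example.com  # http://web01.example.com/
192.168.1.11  db01.example.com   # ssh
"""

if __name__ == "__main__":
    known = parse_hosts_cfg(SAMPLE)
    for name in ("web01.example.com", "ubuntu"):
        print(name, "-> ghost" if is_ghost(name, known) else "-> configured")
```

In this sample, a client reporting in as “ubuntu” has no matching entry, which is exactly the situation that lands a machine on the ghost client page.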
After some Googling I was unable to find a match for my specific issue, so I spent about 45 minutes reading through posts, papers, and forums that were somewhat related. After gathering that information I went through the Xymon configuration files, the Xymon server logs, and finally any host-based logs that might have been related to the name of the ghost client. In my case, that name was “ubuntu”.
Now the more studious and/or experienced among you will know that “ubuntu” is a hostname applied as a default during new installations of Ubuntu Linux. Could this have something to do with it? It turns out it does. But my Xymon server has been up for 5+ years, with many servers having come and gone, and I realized there was a distinct possibility that I had compounded the problem over the years as new servers were added and old or unused servers were removed. On top of that, I have a considerable amount of data collected over those years.
So, armed with all the troubleshooting information I had just gone through on Google, as well as my years of experience with Xymon, I started formulating a plan for how to approach this issue. Since this article doesn’t offer a one-shot, fix-all ghost client solution, I will lay out my plan and how I finally resolved the issue.
- Search /etc/hosts and /etc/xymon/hosts.cfg and remove any reference to the ghost server.
- Go through the historical data and make sure any host log files referencing the ghost server are backed up and/or removed. The main thing is to remove them from the historical logs (generally located at /var/lib/xymon/histlogs/), as this was a recommendation in some of the troubleshooting material I read.
- Check the network firewall and the host-based firewall and review or remove any references to the ghost client in question.
- Finally, restart/reboot the Xymon server and all clients and allow Xymon’s standard 5 minute ramp-up time to elapse.
- Check to see if the issue is resolved.
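The search parts of the plan above can be sketched in Python as follows. This is a read-only helper that only reports what it finds, so you can decide what to back up or remove. The function names are my own, and the way dots in a hostname are rewritten in history directory names (commas or underscores) is an assumption you should verify against your own installation.

```python
import re
from pathlib import Path

def find_references(hostname, config_paths):
    """Return (path, line_number, line) for each line mentioning hostname as a whole name."""
    # Whole-name match: 'ubuntu' should not match 'kubuntu' or 'ubuntu-box'.
    pattern = re.compile(
        r"(?<![\w.-])" + re.escape(hostname) + r"(?![\w-])", re.IGNORECASE
    )
    hits = []
    for path in map(Path, config_paths):
        if not path.is_file():
            continue  # silently skip files that do not exist on this machine
        lines = path.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.strip()))
    return hits

def find_history_entries(hostname, histlogs_dir):
    """Return entries directly under histlogs_dir that are named after hostname."""
    root = Path(histlogs_dir)
    if not root.is_dir():
        return []
    # Assumption: dots in the hostname may be stored as commas or underscores.
    wanted = {hostname, hostname.replace(".", ","), hostname.replace(".", "_")}
    return sorted(str(entry) for entry in root.iterdir() if entry.name in wanted)

if __name__ == "__main__":
    ghost = "ubuntu"
    for hit in find_references(ghost, ["/etc/hosts", "/etc/xymon/hosts.cfg"]):
        print("reference:", hit)
    for entry in find_history_entries(ghost, "/var/lib/xymon/histlogs"):
        print("history:", entry)
```

Nothing here deletes anything. I would still back up whatever it reports before removing it by hand.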
After going through the above steps, the issue mostly cleared. I did have to do some extra work on one of my client hosts, because when the server came back up a bunch of services that should have been monitored for that host were missing. The fix was simple: comment out the host in question in hosts.cfg, restart the server and clients, then uncomment the host and restart the clients and server once more. After that I was finally back to a fully functioning Xymon front end.
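For anyone who prefers to script that comment-out and uncomment cycle, here is a small Python sketch of the text edit only. It assumes the host entry is a single line starting with an IPv4 address followed by the hostname. The function name set_host_commented and the sample entries are hypothetical, and the restarts in between still have to be done separately.

```python
import re

IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")

def set_host_commented(cfg_text, hostname, commented):
    """Return cfg_text with the entry for hostname commented out or restored."""
    result = []
    for line in cfg_text.splitlines():
        stripped = line.lstrip()
        is_comment = stripped.startswith("#")
        # Look at the entry as it would read without a leading comment marker.
        body = stripped.lstrip("#").lstrip() if is_comment else stripped
        fields = body.split()
        # Only touch lines that look like 'IP hostname ...' for this exact host,
        # so ordinary comments that merely mention the host are left alone.
        is_entry = (
            len(fields) >= 2
            and IPV4.fullmatch(fields[0]) is not None
            and fields[1] == hostname
        )
        if is_entry and commented and not is_comment:
            line = "#" + line
        elif is_entry and not commented and is_comment:
            line = body
        result.append(line)
    return "\n".join(result) + "\n"

if __name__ == "__main__":
    # Hypothetical entries, for illustration only.
    cfg = "10.0.0.5 web01 # http://web01/\n10.0.0.6 db01 # ssh\n"
    disabled = set_host_commented(cfg, "web01", True)
    print(disabled)
    print(set_host_commented(disabled, "web01", False))
```

I would run something like this against a copy of hosts.cfg first and compare the result with the original before putting it in place.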
If you have encountered this issue and resolved it with a different set of steps, or have other solutions, I would like to know about them. If you have the time, I would appreciate a short comment below, both for my own benefit and for others who may read this while dealing with the same issue.
I wrote this article because after 10 minutes of initial troubleshooting I was unable to clear the issue and didn’t find an exact match for the problem through Googling. I do have a good grasp of how Xymon operates, and the steps I outlined above might have been things I could have done without further research. Still, the research allowed me to formulate a plan and take on this seemingly non-trivial issue, and in the end there was success.
Thank you for your time. If you have any related issues, or even unrelated issues you might need help with, don’t hesitate to leave a comment below.
This article was also published to the troubleshooting site, linuxconnector.com.