Auditing systemd: solving failed units with systemctl

systemd is a system and service manager, and an alternative to the more traditional init system. To keep the system healthy, failed units should be investigated on a regular basis. Sooner or later a unit will fail and show up in the systemctl listing. In this article we look at how to resolve that.

Why do services fail?

During system startup, enabled services are queued and then started. Most of them start correctly, and systemd logs the related status in the journal. In some cases, however, a service enters a “failed” state because a command did not finish properly.

# systemctl
UNIT                                     LOAD   ACTIVE SUB       DESCRIPTION
-.mount                                  loaded active mounted   /
boot.mount                               loaded active mounted   /boot
dev-hugepages.mount                      loaded active mounted   Huge Pages File System
dev-mqueue.mount                         loaded failed failed    POSIX Message Queue File System
run-user-0.mount                         loaded active mounted   /run/user/0
sys-kernel-config.mount                  loaded active mounted   Configuration File System
sys-kernel-debug.mount                   loaded active mounted   Debug File System
tmp.mount                                loaded active mounted   Temporary Directory

Services usually fail because of a missing dependency (e.g. a file or mount point), missing configuration, or incorrect permissions. In this example the dev-mqueue unit, which is of type mount, has failed. Because it is a mount unit, the most likely cause is that mounting the file system failed.
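On a busy system the full listing is long, so it helps to narrow it down to failed units only. The following is a minimal sketch, assuming a reasonably recent systemd that supports the --state and --plain options; the function name list_failed_units is made up for this example.

```shell
# list_failed_units: print the name of every unit in the failed state.
# (Hypothetical helper name; the systemctl options are standard.)
list_failed_units() {
    # --plain drops the bullet marker, so the unit name is the first column.
    systemctl list-units --state=failed --no-legend --plain --no-pager |
        awk '{ print $1 }'
}
```

Calling list_failed_units prints one unit name per line, which is convenient for a periodic check or as input to other commands. For interactive use, systemctl --failed gives a similar overview.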

By using the systemctl status command we can see the details of the dev-mqueue.mount unit:

# systemctl status dev-mqueue.mount
 dev-mqueue.mount - POSIX Message Queue File System
   Loaded: loaded (/usr/lib/systemd/system/dev-mqueue.mount; static)
   Active: failed (Result: exit-code) since Sun 2014-11-23 17:53:10 CET; 4min 12s ago
    Where: /dev/mqueue
     What: mqueue
     Docs: man:mq_overview(7)
           http://www.freedesktop.org/wiki/Software/systemd/APIFileSystems
  Process: 446 ExecMount=/bin/mount -n mqueue /dev/mqueue -t mqueue (code=exited, status=32)

Nov 23 17:53:10 localhost.localdomain systemd[1]: dev-mqueue.mount mount process exited, code=exited status=32
Nov 23 17:53:10 localhost.localdomain systemd[1]: Failed to mount POSIX Message Queue File System.
Nov 23 17:53:10 localhost.localdomain systemd[1]: Unit dev-mqueue.mount entered failed state.

This output shows the command that was executed. The unit failed with result exit-code because the command did not return the expected value of 0; it returned 32, which mount(8) documents as a mount failure. Manually running the command shows that the mount point /dev/mqueue is missing.
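The status output only includes the last few log lines. To dig a little deeper, the unit's properties and its journal entries can be queried directly. This is a sketch rather than a fixed recipe; the function name explain_failure and the choice of 20 lines are assumptions for illustration.

```shell
# explain_failure UNIT: summarize why a unit failed and show its recent logs.
# (Hypothetical helper name; adjust the number of lines to taste.)
explain_failure() {
    # Result is "success" for healthy units, or e.g. "exit-code" after a failure.
    systemctl show --property=ActiveState --property=Result "$1"
    # Journal entries for this unit from the current boot, last 20 lines.
    journalctl --unit="$1" --boot --no-pager --lines=20
}
```

For the example in this article it would be called as explain_failure dev-mqueue.mount.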

Similarly, the IPMI event daemon (ipmievd) fails on our virtual machine. As there is no /dev/ipmi* device, the service can’t start:

# systemctl status ipmievd.service
● ipmievd.service - Ipmievd Daemon
Loaded: loaded (/usr/lib/systemd/system/ipmievd.service; enabled)
Active: failed (Result: exit-code) since Sun 2014-11-23 16:08:48 CET; 1h 36min ago
Process: 550 ExecStart=/usr/sbin/ipmievd $IPMIEVD_OPTIONS (code=exited, status=1/FAILURE)
Nov 23 16:08:47 localhost.localdomain ipmievd[550]: Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
Nov 23 16:08:47 localhost.localdomain ipmievd[550]: Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
Nov 23 16:08:47 localhost.localdomain ipmievd[550]: ipmievd: using pidfile /var/run/ipmievd.pid0
Nov 23 16:08:47 localhost.localdomain ipmievd[550]: Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory
Nov 23 16:08:47 localhost.localdomain ipmievd[550]: Unable to open interface
Nov 23 16:08:48 localhost.localdomain systemd[1]: ipmievd.service: control process exited, code=exited status=1
Nov 23 16:08:48 localhost.localdomain systemd[1]: Failed to start Ipmievd Daemon.
Nov 23 16:08:48 localhost.localdomain systemd[1]: Unit ipmievd.service entered failed state.
Nov 23 16:08:48 localhost.localdomain systemd[1]: ipmievd.service failed.
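Before deciding what to do with such a service, it can be useful to confirm that the device really is absent. The sketch below checks the three paths named in the log output above. The function name has_ipmi_device and its optional ROOT argument are hypothetical; the prefix exists only so the check can be tried against a test directory.

```shell
# has_ipmi_device [ROOT]: succeed if any device path that ipmievd tries exists.
# ROOT is an optional, hypothetical path prefix (default: empty) for testing.
has_ipmi_device() {
    root=${1:-}
    for dev in /dev/ipmi0 /dev/ipmi/0 /dev/ipmidev/0; do
        [ -e "$root$dev" ] && return 0
    done
    return 1
}
```

A possible use is has_ipmi_device || echo "no IPMI device found", which on a virtual machine like the one in this article would print the message.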

Clearing failed units

You can manually clear out failed units with the systemctl reset-failed command. This can be done for all units, or for a single one. Note that this only resets the recorded failed state; it does not fix the underlying cause.
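As a small sketch of both variants, the helper below passes its arguments straight to systemctl reset-failed and then lists whatever is still marked as failed. The name clear_failed is made up for this example, and the listing step assumes a systemd version with the --state and --plain options.

```shell
# clear_failed [UNIT...]: reset the failed state of the given units,
# or of all units when called without arguments. (Hypothetical helper name.)
clear_failed() {
    systemctl reset-failed "$@" || return 1
    # Show what is still marked as failed afterwards (ideally nothing).
    systemctl list-units --state=failed --no-legend --plain --no-pager
}
```

Running clear_failed dev-mqueue.mount resets a single unit, while clear_failed without arguments resets all of them.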

Services that are no longer needed are better stopped and disabled.

# systemctl stop rngd.service
# systemctl disable rngd.service
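To confirm the result, the unit's state can be queried afterwards. The following sketch wraps both steps and the check in one function; the name retire_unit is an assumption for illustration.

```shell
# retire_unit UNIT: stop and disable a unit, then report its resulting state.
# (Hypothetical helper name; the systemctl subcommands are standard.)
retire_unit() {
    systemctl stop "$1" && systemctl disable "$1" || return 1
    # is-active and is-enabled print the state; both exit non-zero for
    # inactive or disabled units, so only their output is used here.
    echo "active:  $(systemctl is-active "$1")"
    echo "enabled: $(systemctl is-enabled "$1")"
}
```

After retire_unit rngd.service, the two lines of output should report the unit as inactive and disabled if both steps succeeded.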

That’s all!



This article has been written by our Linux security expert Michael Boelen. With focus on creating high-quality articles and relevant examples, he wants to improve the field of Linux security. No more web full of copy-pasted blog posts.
