===========================
 Using Systemd in DevStack
===========================

.. note::

   This is an in-progress document; we are still working out the way
   forward with DevStack and systemd.

DevStack can run all of its services as systemd units. Systemd is now
the default init system for nearly every Linux distro, and it encodes
and solves many of the problems related to poorly running processes.

Why this instead of screen?
===========================

The screen model for DevStack was invented when a DevStack user
typically ran fewer than 10 services, which made it easy to jump
around with screen hot keys. However, the landscape has changed: not
all services can be stopped in screen, because some run under Apache,
and there are typically at least 20 items.

There is also a common developer workflow of changing code in more
than one service, and needing to restart a bunch of services for that
to take effect.

To enable this, add the following to your ``local.conf``::

    USE_SYSTEMD=True

Unit Structure
==============

.. note::

   Originally we wanted to do this as user units; however, there are
   issues with running them under non-interactive shells. For now,
   we'll be running as system units. Some user unit code is left in
   place in case we can switch back later.

All DevStack units are created as a part of the DevStack slice and
given the name ``devstack@$servicename.service``. This lets us do
certain operations at the slice level.

Manipulating Units
==================

The examples below use the unit ``n-cpu``.

Enable a unit (so that it is started at boot)::

    sudo systemctl enable devstack@n-cpu.service

Disable a unit::

    sudo systemctl disable devstack@n-cpu.service

Start a unit::

    sudo systemctl start devstack@n-cpu.service

Stop a unit::

    sudo systemctl stop devstack@n-cpu.service

Restart a unit::

    sudo systemctl restart devstack@n-cpu.service

See status of a unit::

    sudo systemctl status devstack@n-cpu.service

Operating on more than one unit at a time
-----------------------------------------

Systemd supports wildcards for unit operations. Quote the pattern so
that the shell passes it to systemctl unchanged. To restart every
service in DevStack::

    sudo systemctl restart "devstack@*"

Or to see the status of all Nova processes::

    sudo systemctl status "devstack@n-*"

We'll eventually make the unit names a bit more meaningful so that
it's easier to understand what you are restarting.
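
When a wildcard is too broad, systemctl also accepts several unit
names in a single call. The following is a minimal sketch of a helper
for that workflow, not something DevStack ships: the function name
``restart_services`` and the ``SYSTEMCTL`` override variable are
hypothetical, and the unit names are built from the
``devstack@$servicename.service`` pattern described above::

    # Hypothetical helper, not part of DevStack.
    # Usage: restart_services n-cpu n-cond
    # Set SYSTEMCTL=echo to print the command instead of running it.
    restart_services() {
        for name in "$@"; do
            # Append the unit name for this service, then drop the
            # original service name from the front of the argument list.
            set -- "$@" "devstack@${name}.service"
            shift
        done
        # Nothing to do if no service names were given.
        [ "$#" -gt 0 ] || return 0
        ${SYSTEMCTL:-sudo systemctl} restart "$@"
    }

Calling ``restart_services n-cpu n-cond`` would then restart
``devstack@n-cpu.service`` and ``devstack@n-cond.service`` together.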

Querying Logs
=============

One of the other major things that comes with systemd is journald, a
consolidated way to access logs (including querying through structured
metadata). Logs are accessed through the ``journalctl`` command, which
has powerful query facilities. We'll start with some common options.

Follow logs for a specific service::

    journalctl -f --unit devstack@n-cpu.service

Follow logs for multiple services simultaneously::

    journalctl -f --unit devstack@n-cpu.service \
        --unit devstack@n-cond.service

Or use a wildcard to follow all the Nova services::

    journalctl -f --unit "devstack@n-*"

Use higher precision time stamps::

    journalctl -f -o short-precise --unit devstack@n-cpu.service

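Besides following logs live, journalctl can limit output to a time
window with its ``--since`` option. The following is a minimal sketch
that combines it with the options shown above; the function name
``recent_logs``, its default window of one hour, and the
``JOURNALCTL`` override variable are assumptions made for this
example, not part of DevStack::

    # Hypothetical helper, not part of DevStack.
    # Usage: recent_logs n-cpu "10 minutes ago"
    # Set JOURNALCTL=echo to print the arguments instead of running it.
    recent_logs() {
        service=$1
        # Fall back to the last hour when no time window is given.
        since=${2:-"1 hour ago"}
        ${JOURNALCTL:-journalctl} --no-pager -o short-precise \
            --since "$since" --unit "devstack@${service}.service"
    }

For example, ``recent_logs n-cpu "10 minutes ago"`` would print the
last ten minutes of ``devstack@n-cpu.service`` output without opening
a pager.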

Known Issues
============

Be careful about systemd Python libraries. There are three of them on
PyPI, and they are all very different. Unfortunately, they all install
into the ``systemd`` namespace, which can cause some issues.

- ``systemd-python`` - this is the upstream maintained library, it has
  a version number like systemd itself (currently ``233``). This is
  the one you want.
- ``systemd`` - a Python 3 only library, not what you want.
- ``python-systemd`` - another library you don't want. Installing it
  on a system will break Ansible's ability to run.

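One rough way to spot the two unwanted libraries is to scan the
output of ``pip freeze``. This sketch assumes the packages appear as
plain ``name==version`` lines under exactly the names listed above;
the function name ``unwanted_systemd_libs`` is hypothetical::

    # Hypothetical check, not part of DevStack.
    # Reads "pip freeze" style lines (name==version) on stdin and prints
    # the name of each unwanted systemd library that is installed.
    # systemd-python, the library you want, is not reported.
    unwanted_systemd_libs() {
        grep -i -E '^(systemd|python-systemd)==' | cut -d= -f1
    }

It could be run as ``pip freeze | unwanted_systemd_libs``; empty
output suggests that neither package is present in that environment.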

The ``Group=`` parameter in the ``[Service]`` section doesn't seem to
work with user units, even though the documentation says that it
should. If we were using user units, we would need to do an explicit
``/usr/bin/sg``. This has the downside of making the
``SYSLOG_IDENTIFIER`` be ``sg``. We could explicitly set that with
``SyslogIdentifier=``, but it's really unfortunate that we would need
this workaround. This is currently not a problem because we're only
using system units.

Future Work
===========

oslo.log journald
-----------------

Journald has an extremely rich mechanism for direct logging, including
structured metadata. We should enhance oslo.log to take advantage of
that. It would let us do things like::

    journalctl REQUEST_ID=......

    journalctl INSTANCE_ID=......

and get all lines related to the request ID or instance ID. (Note:
this work has been started at https://review.openstack.org/#/c/451525/)

log colorizing
--------------

We lose log colorization through this process. We might want to build
a custom colorizer that people could optionally run journalctl output
through.

user units
----------

It would be great if we could run services as user units, so that
there is a clear separation of code being run as non-root, to ensure
that running as root never accidentally gets baked in as an assumption
in services. However, user units interact poorly with devstack-gate
and the way that commands are run as users with ansible and su.

Maybe someday we can figure that out.

References
==========

- Arch Linux Wiki - https://wiki.archlinux.org/index.php/Systemd/User
- Python interface to journald -
  https://www.freedesktop.org/software/systemd/python-systemd/journal.html
- Systemd documentation on service files -
  https://www.freedesktop.org/software/systemd/man/systemd.service.html
- Systemd documentation on exec (can be used to impact service runs) -
  https://www.freedesktop.org/software/systemd/man/systemd.exec.html