===========================
Using Systemd in DevStack
===========================

By default DevStack runs all of its services as systemd units.
Systemd is now the default init system for nearly every Linux
distro, and it encodes and solves many of the problems related to
poorly running processes.

Why this instead of screen?
===========================

The screen model for DevStack was invented when the number of services
that a DevStack user was going to run was typically < 10. This made
using screen hot keys to jump around very easy. However, the landscape
has changed: not all services are stoppable in screen, as some run
under Apache, and there are typically at least 20 items.

There is also a common developer workflow of changing code in more
than one service, and needing to restart a bunch of services for that
to take effect.

Unit Structure
==============

.. note::

   Originally we actually wanted to do this as user units, however
   there are issues with running this under non-interactive
   shells. For now, we'll be running as system units. Some user unit
   code is left in place in case we can switch back later.

All DevStack units are created as a part of the DevStack slice and
given the name ``devstack@$servicename.service``. This makes it easy
to understand which services are part of the devstack run, and lets us
disable / stop them in a single command.
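
The naming scheme above can be sketched in a few lines of Python. This
is only an illustration of the ``devstack@$servicename.service``
convention; the helper names ``unit_name`` and ``service_name`` are
hypothetical and are not part of DevStack.

```python
def unit_name(service: str) -> str:
    """Return the unit name for a short service name such as 'n-cpu'."""
    return f"devstack@{service}.service"


def service_name(unit: str) -> str:
    """Recover the short service name from a DevStack unit name."""
    prefix, suffix = "devstack@", ".service"
    if not (unit.startswith(prefix) and unit.endswith(suffix)):
        raise ValueError(f"not a DevStack unit: {unit!r}")
    # Strip the fixed prefix and suffix, leaving only the service part.
    return unit[len(prefix):-len(suffix)]
```

For example, ``unit_name("n-cpu")`` gives ``devstack@n-cpu.service``,
the unit used in the examples that follow.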

Manipulating Units
==================

The examples below use the unit ``n-cpu``.

Enable a unit (allows it to be started)::

    sudo systemctl enable devstack@n-cpu.service

Disable a unit::

    sudo systemctl disable devstack@n-cpu.service

Start a unit::

    sudo systemctl start devstack@n-cpu.service

Stop a unit::

    sudo systemctl stop devstack@n-cpu.service

Restart a unit::

    sudo systemctl restart devstack@n-cpu.service

See status of a unit::

    sudo systemctl status devstack@n-cpu.service

Operating on more than one unit at a time
-----------------------------------------

Systemd supports wildcarding for unit operations. To restart every
service in devstack you can do the following::

    sudo systemctl restart devstack@*

Or to see the status of all Nova processes you can do::

    sudo systemctl status devstack@n-*

We'll eventually make the unit names a bit more meaningful so that
it's easier to understand what you are restarting.
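
The wildcards are shell-style globs that ``systemctl`` matches against
the names of units currently loaded in memory. The sketch below
imitates that matching with Python's ``fnmatch`` module so you can see
which units a pattern would select. It is an illustration of glob
semantics only, not systemd's own code, and the unit list is made up
for the example.

```python
from fnmatch import fnmatchcase


def match_units(pattern, loaded_units):
    """Return the loaded unit names selected by a shell-style glob."""
    return [unit for unit in loaded_units if fnmatchcase(unit, pattern)]


# Hypothetical set of loaded units, for illustration only.
loaded = [
    "devstack@n-api.service",
    "devstack@n-cpu.service",
    "devstack@g-api.service",
    "apache2.service",
]

nova_units = match_units("devstack@n-*", loaded)
all_devstack_units = match_units("devstack@*", loaded)
```

If the current directory could contain files whose names match the
pattern, quote it (``'devstack@n-*'``) so that the shell passes it to
``systemctl`` unexpanded.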

.. _journalctl-examples:

Querying Logs
=============

One of the other major things that comes with systemd is journald, a
consolidated way to access logs (including querying through structured
metadata). It is accessed via the ``journalctl`` command, which has
powerful query facilities. We'll start with some common options.

Follow logs for a specific service::

    sudo journalctl -f --unit devstack@n-cpu.service

Follow logs for multiple services simultaneously::

    sudo journalctl -f --unit devstack@n-cpu.service --unit devstack@n-cond.service

Or you can even use wildcards to follow all the nova services::

    sudo journalctl -f --unit devstack@n-*

Use higher precision time stamps::

    sudo journalctl -f -o short-precise --unit devstack@n-cpu.service

By default, journalctl strips out "unprintable" characters, including
ANSI color codes. To keep the color codes (which can be interpreted by
an appropriate terminal/pager, e.g. ``less``, the default)::

    sudo journalctl -a --unit devstack@n-cpu.service

When outputting to the terminal using the default pager, long lines
will be truncated, but horizontal scrolling is supported via the
left/right arrow keys. You can override this by setting the
``SYSTEMD_LESS`` environment variable to e.g. ``FRXM``, which omits
the ``S`` (chop long lines) option of ``less``.

You can pipe the output to another tool, such as ``grep``. For
example, to find a server instance UUID in the nova logs::

    sudo journalctl -a --unit devstack@n-* | grep 58391b5c-036f-44d5-bd68-21d3c26349e6

See ``man 1 journalctl`` for more.
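
If you build these invocations from a script, the options shown above
combine as in the following sketch. The function name
``journalctl_command`` and its parameters are hypothetical; only the
``-f``, ``-o short-precise``, ``-a`` and ``--unit`` flags come from the
examples in this section.

```python
def journalctl_command(units, follow=False, precise=False, show_all=False):
    """Build a journalctl argument list for one or more units."""
    cmd = ["sudo", "journalctl"]
    if follow:
        cmd.append("-f")                 # keep following new log entries
    if precise:
        cmd += ["-o", "short-precise"]   # higher precision time stamps
    if show_all:
        cmd.append("-a")                 # keep "unprintable" characters
    for unit in units:
        cmd += ["--unit", unit]          # repeat --unit once per service
    return cmd


cmd = journalctl_command(
    ["devstack@n-cpu.service", "devstack@n-cond.service"], follow=True
)
```

The resulting list can be handed to ``subprocess.run(cmd, check=True)``.
Because no shell is involved, a wildcard such as ``devstack@n-*``
reaches ``journalctl`` unmodified.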

Debugging
=========

Using pdb
---------

In order to break into a regular pdb session on a systemd-controlled
service, you need to invoke the process manually, that is, take it out
of systemd's control.

Discover the command systemd is using to run the service::

    systemctl show devstack@n-sch.service -p ExecStart --no-pager

Stop the systemd service::

    sudo systemctl stop devstack@n-sch.service

Inject your breakpoint in the source, e.g.::

    import pdb; pdb.set_trace()

Invoke the command manually::

    /usr/local/bin/nova-scheduler --config-file /etc/nova/nova.conf

Some executables, such as :program:`nova-compute`, will need to be executed
with a particular group. This will be shown in the systemd unit file::

    sudo systemctl cat devstack@n-cpu.service | grep Group

::

    Group = libvirt

Use the :program:`sg` tool to execute the command as this group::

    sg libvirt -c '/usr/local/bin/nova-compute --config-file /etc/nova/nova-cpu.conf'
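
The ``systemctl show ... -p ExecStart`` output wraps the command line in
a structured record, so it can be awkward to copy by hand. The sketch
below pulls the argument list out of such a record. The ``SAMPLE``
string is an assumed illustration of the output format, not captured
from a real host, and the exact fields may differ between systemd
versions; ``exec_start_argv`` is a hypothetical helper name.

```python
import re
import shlex

# Assumed example of `systemctl show -p ExecStart` output (illustrative).
SAMPLE = (
    "ExecStart={ path=/usr/local/bin/nova-scheduler ; "
    "argv[]=/usr/local/bin/nova-scheduler --config-file /etc/nova/nova.conf ; "
    "ignore_errors=no ; start_time=[n/a] ; stop_time=[n/a] ; "
    "pid=0 ; code=(null) ; status=0/0 }"
)


def exec_start_argv(show_output):
    """Extract the command and its arguments from the argv[] field."""
    match = re.search(r"argv\[\]=(.*?) ; ", show_output)
    if match is None:
        raise ValueError("no argv[] field found")
    # Split like a shell would, so quoted arguments stay together.
    return shlex.split(match.group(1))


argv = exec_start_argv(SAMPLE)
```

Joining ``argv`` with spaces gives the command to run manually after
stopping the unit.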

Using remote-pdb
----------------

`remote-pdb`_ works while the process is under systemd control.

Make sure you have remote-pdb installed::

    sudo pip install remote-pdb

Inject your breakpoint in the source, e.g.::

    import remote_pdb; remote_pdb.set_trace()

Restart the relevant service::

    sudo systemctl restart devstack@n-api.service

The remote-pdb code configures the telnet port when ``set_trace()`` is
invoked. Do whatever it takes to hit the instrumented code path, and
inspect the logs for a message displaying the listening port::

    Sep 07 16:36:12 p8-100-neo devstack@n-api.service[772]: RemotePdb session open at 127.0.0.1:46771, waiting for connection ...

Telnet to that port to enter the pdb session::

    telnet 127.0.0.1 46771
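
Since the port is chosen when ``set_trace()`` runs, you may want to pick
it out of the journal automatically. The following sketch extracts the
host and port from a log line shaped like the one shown above. The
helper name ``remote_pdb_endpoint`` is hypothetical, and the pattern
assumes the message wording stays as in that example.

```python
import re


def remote_pdb_endpoint(log_line):
    """Return (host, port) from a RemotePdb log line, or None if absent."""
    match = re.search(r"RemotePdb session open at (\S+):(\d+)", log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


line = (
    "Sep 07 16:36:12 p8-100-neo devstack@n-api.service[772]: "
    "RemotePdb session open at 127.0.0.1:46771, waiting for connection ..."
)
endpoint = remote_pdb_endpoint(line)
```

The two values in ``endpoint`` are the arguments to pass to ``telnet``.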

See the `remote-pdb`_ home page for more options.

.. _`remote-pdb`: https://pypi.org/project/remote-pdb/

Future Work
===========

User Units
----------

It would be great if we could run services as user units, so that there
is a clear separation of code being run as non-root, to ensure running
as root never accidentally gets baked in as an assumption to
services. However, user units interact poorly with devstack-gate and
the way that commands are run as users with ansible and su.

Maybe someday we can figure that out.

References
==========

- Arch Linux Wiki - https://wiki.archlinux.org/index.php/Systemd/User
- Python interface to journald -
  https://www.freedesktop.org/software/systemd/python-systemd/journal.html
- Systemd documentation on service files -
  https://www.freedesktop.org/software/systemd/man/systemd.service.html
- Systemd documentation on exec (can be used to impact service runs) -
  https://www.freedesktop.org/software/systemd/man/systemd.exec.html