===========================
 Using Systemd in DevStack
===========================

By default, DevStack runs all of its services as systemd units.
Systemd is now the default init system for nearly every Linux distro,
and it encodes and solves many of the problems related to poorly
running processes.

Why this instead of screen?
===========================

The screen model for DevStack was invented when a DevStack user
typically ran fewer than 10 services. That made it very easy to jump
around with screen hot keys. However, the landscape has changed: not
all services can be stopped from screen because some run under Apache,
and there are typically at least 20 items.

There is also a common developer workflow of changing code in more
than one service and needing to restart a bunch of services for the
changes to take effect.

Unit Structure
==============

.. note::

   Originally we wanted to do this as user units; however, there are
   issues with running them under non-interactive shells. For now, we
   run them as system units. Some user unit code is left in place in
   case we can switch back later.

All DevStack units are created as a part of the DevStack slice and
given the name ``devstack@$servicename.service``. This makes it easy
to understand which services are part of the devstack run, and lets us
disable / stop them in a single command.
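
As a rough illustration of the naming scheme, the sketch below builds a
unit name from a short service name and lists the DevStack units that
systemd currently has loaded. Both function names are made up for this
example and are not part of DevStack.

.. code-block:: shell

   # Hypothetical helper: expand a short service name such as "n-cpu"
   # into the unit name described above.
   devstack_unit() {
       printf 'devstack@%s.service\n' "$1"
   }

   # Hypothetical helper: list the DevStack units systemd has loaded.
   # The pattern is quoted so that the shell hands it to systemctl
   # unchanged instead of trying to expand it against local file names.
   list_devstack_units() {
       systemctl list-units --no-pager 'devstack@*'
   }

For example, ``systemctl status "$(devstack_unit n-cpu)"`` asks for the
status of ``devstack@n-cpu.service``.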

Manipulating Units
==================

The examples use the ``n-cpu`` unit.

Enable a unit (so that it starts at boot)::

  sudo systemctl enable devstack@n-cpu.service

Disable a unit::

  sudo systemctl disable devstack@n-cpu.service

Start a unit::

  sudo systemctl start devstack@n-cpu.service

Stop a unit::

  sudo systemctl stop devstack@n-cpu.service

Restart a unit::

  sudo systemctl restart devstack@n-cpu.service

See status of a unit::

  sudo systemctl status devstack@n-cpu.service

Operating on more than one unit at a time
-----------------------------------------

Systemd supports wildcarding for unit operations. To restart every
service in devstack you can do the following::

  sudo systemctl restart devstack@*

Or to see the status of all Nova processes you can do::

  sudo systemctl status devstack@n-*

We'll eventually make the unit names a bit more meaningful so that
it's easier to understand what you are restarting.
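
A wildcard acts on everything that matches. When only a few services
have changed, ``systemctl`` also accepts several unit names in a single
invocation. A minimal sketch, assuming a hypothetical helper named
``restart_devstack_units`` that takes short service names:

.. code-block:: shell

   # Hypothetical helper: restart the named DevStack services with one
   # systemctl call, e.g. "restart_devstack_units n-cpu n-sch".
   restart_devstack_units() {
       if [ "$#" -eq 0 ]; then
           echo "usage: restart_devstack_units SERVICE..." >&2
           return 1
       fi
       # Replace each short name with its full unit name, in order.
       for name in "$@"; do
           shift
           set -- "$@" "devstack@${name}.service"
       done
       sudo systemctl restart "$@"
   }

If a wildcard command ever behaves unexpectedly, quoting the pattern
(for example ``'devstack@n-*'``) keeps the shell from expanding it
before ``systemctl`` sees it.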

.. _journalctl-examples:

Querying Logs
=============

One of the other major things that comes with systemd is journald, a
consolidated way to access logs (including querying through structured
metadata). Users access it through the ``journalctl`` command, which
has powerful query facilities. We'll start with some common options.

Follow logs for a specific service::

  sudo journalctl -f --unit devstack@n-cpu.service

Follow logs for multiple services simultaneously::

  sudo journalctl -f --unit devstack@n-cpu.service --unit devstack@n-cond.service

Or use wildcards to follow all the Nova services::

  sudo journalctl -f --unit devstack@n-*

Use higher precision time stamps::

  sudo journalctl -f -o short-precise --unit devstack@n-cpu.service

By default, journalctl strips out "unprintable" characters, including
ANSI color codes. To keep the color codes (which can be interpreted by
an appropriate terminal/pager - e.g. ``less``, the default)::

  sudo journalctl -a --unit devstack@n-cpu.service

When outputting to the terminal using the default pager, long lines
will be truncated, but horizontal scrolling is supported via the
left/right arrow keys. You can override this by setting the
``SYSTEMD_LESS`` environment variable to e.g. ``FRXM``, which omits
the ``S`` (chop long lines) option so that long lines wrap instead.
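
Because the examples here run ``journalctl`` through ``sudo``, and
``sudo`` may not pass the caller's environment through, one way to set
the variable is on the far side of ``sudo`` with ``env``. The wrapper
name below is an assumption made for this sketch:

.. code-block:: shell

   # Hypothetical wrapper: run journalctl with SYSTEMD_LESS=FRXM so that
   # the pager wraps long lines. "sudo env VAR=value cmd" sets the
   # variable for the command that sudo runs.
   journalctl_wrapped() {
       sudo env SYSTEMD_LESS=FRXM journalctl "$@"
   }

It takes the same arguments as ``journalctl``, for example
``journalctl_wrapped -a --unit devstack@n-cpu.service``.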

You can pipe the output to another tool, such as ``grep``. For
example, to find a server instance UUID in the nova logs::

  sudo journalctl -a --unit devstack@n-* | grep 58391b5c-036f-44d5-bd68-21d3c26349e6

See ``man 1 journalctl`` for more.
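
Two further queries that often help are sketched here as hypothetical
helper functions (the names are not part of DevStack): limiting output
by time with ``--since``, and matching on the ``_SYSTEMD_UNIT`` journal
field, which is one example of querying through structured metadata.

.. code-block:: shell

   # Hypothetical helper: show recent entries for one DevStack service.
   #   $1 - short service name, e.g. n-cpu
   #   $2 - a time that journalctl --since accepts, e.g. "1 hour ago"
   devstack_logs_since() {
       sudo journalctl --no-pager --since "$2" --unit "devstack@${1}.service"
   }

   # Hypothetical helper: select entries by journal field rather than
   # with --unit. This matches only entries logged by the unit's own
   # processes.
   devstack_logs_by_field() {
       sudo journalctl --no-pager "_SYSTEMD_UNIT=devstack@${1}.service"
   }

For example, ``devstack_logs_since n-cpu "1 hour ago"`` prints the last
hour of ``n-cpu`` logs without starting a pager.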

Debugging
=========

Using pdb
---------

In order to break into a regular pdb session on a systemd-controlled
service, you need to invoke the process manually - that is, take it out
of systemd's control.

Discover the command systemd is using to run the service::

  systemctl show devstack@n-sch.service -p ExecStart --no-pager

Stop the systemd service::

  sudo systemctl stop devstack@n-sch.service

Inject your breakpoint in the source, e.g.::

  import pdb; pdb.set_trace()

Invoke the command manually::

  /usr/local/bin/nova-scheduler --config-file /etc/nova/nova.conf
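
The ``ExecStart`` property is printed as a structured value rather than
a bare command line. The sketch below pulls the command line out of it.
It assumes the output looks like ``ExecStart={ path=... ; argv[]=<command
line> ; ignore_errors=no ; ... }`` and that no argument contains
``" ; "``; the function name is hypothetical.

.. code-block:: shell

   # Hypothetical helper: print the command line systemd uses for a
   # DevStack service, e.g. "devstack_exec_start n-sch".
   devstack_exec_start() {
       systemctl show "devstack@${1}.service" -p ExecStart --no-pager |
           sed -n 's/^ExecStart={.*argv\[\]=\(.*\) ; ignore_errors=.*$/\1/p'
   }

The unit file may also set things like the user or environment for the
service; ``systemctl cat devstack@n-sch.service`` shows the whole unit
if the manually started process behaves differently.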

Using remote-pdb
----------------

`remote-pdb`_ works while the process is under systemd control.

Make sure you have remote-pdb installed::

  sudo pip install remote-pdb

Inject your breakpoint in the source, e.g.::

  import remote_pdb; remote_pdb.set_trace()

Restart the relevant service::

  sudo systemctl restart devstack@n-api.service

The remote-pdb code configures the telnet port when ``set_trace()`` is
invoked.  Do whatever it takes to hit the instrumented code path, and
inspect the logs for a message displaying the listening port::

  Sep 07 16:36:12 p8-100-neo devstack@n-api.service[772]: RemotePdb session open at 127.0.0.1:46771, waiting for connection ...

Telnet to that port to enter the pdb session::

  telnet 127.0.0.1 46771
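
Finding the port by eye gets tedious when the service logs a lot. A
small sketch, assuming the log message has the format shown above; the
function name is made up for this example:

.. code-block:: shell

   # Hypothetical helper: read journal lines on stdin and print the
   # "host port" pair from the most recent RemotePdb announcement.
   remote_pdb_endpoint() {
       sed -n 's/.*RemotePdb session open at \([^:]*\):\([0-9][0-9]*\),.*/\1 \2/p' |
           tail -n 1
   }

For example, ``sudo journalctl --unit devstack@n-api.service |
remote_pdb_endpoint`` prints the host and port to hand to ``telnet``.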

See the `remote-pdb`_ home page for more options.

.. _`remote-pdb`: https://pypi.python.org/pypi/remote-pdb

Known Issues
============

Be careful about systemd Python libraries. There are 3 of them on
PyPI, and they are all very different. They unfortunately all install
into the ``systemd`` namespace, which can cause some issues.

- ``systemd-python`` - this is the upstream maintained library, it has
  a version number like systemd itself (currently ``234``). This is
  the one you want.
- ``systemd`` - a Python 3 only library, not what you want.
- ``python-systemd`` - another library you don't want. Installing it
  on a system will break Ansible's ability to run.
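
To see which of these packages is present on a machine, something like
the following sketch can help. It assumes input in the default ``pip
list`` column layout (package name, then version); the function name is
hypothetical.

.. code-block:: shell

   # Hypothetical helper: read "pip list"-style lines on stdin and
   # report any of the three packages named above.
   check_systemd_packages() {
       awk '
           tolower($1) == "systemd-python" {
               print $1 " " $2 ": OK (upstream library)"
           }
           tolower($1) == "systemd" || tolower($1) == "python-systemd" {
               print $1 " " $2 ": not the upstream library"
           }
       '
   }

A typical invocation would be ``pip list 2>/dev/null |
check_systemd_packages``.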

The ``Group=`` parameter in the ``[Service]`` section doesn't seem to
work with user units, even though the documentation says that it
should. If we were using user units, we would need an explicit
``/usr/bin/sg``, which has the downside of making the
SYSLOG_IDENTIFIER ``sg``. We could set that explicitly with
``SyslogIdentifier=``, but it's unfortunate that the workaround would
be needed. This is currently not a problem because we're only using
system units.

Future Work
===========

User units
----------

It would be great if we could run services as user units, so that
there is a clear separation of code being run as non-root, to ensure
that running as root never accidentally gets baked in as an assumption
to services. However, user units interact poorly with devstack-gate
and the way that commands are run as users with ansible and su.

Maybe someday we can figure that out.

References
==========

- Arch Linux Wiki - https://wiki.archlinux.org/index.php/Systemd/User
- Python interface to journald -
  https://www.freedesktop.org/software/systemd/python-systemd/journal.html
- Systemd documentation on service files -
  https://www.freedesktop.org/software/systemd/man/systemd.service.html
- Systemd documentation on exec (can be used to impact service runs) -
  https://www.freedesktop.org/software/systemd/man/systemd.exec.html