===========================
 Using Systemd in DevStack
===========================

By default, DevStack runs all of its services as systemd units.
Systemd is now the default init system for nearly every Linux
distro, and it encodes and solves many of the problems related to
poorly running processes.

Why this instead of screen?
===========================

The screen model for DevStack was invented when the number of services
that a DevStack user was going to run was typically fewer than 10. This
made it easy to jump around with screen hot keys. However, the landscape
has changed: not all services are stoppable in screen, as some run under
Apache, and there are now typically at least 20 items.

There is also a common developer workflow of changing code in more
than one service, and needing to restart a bunch of services for that
to take effect.

Unit Structure
==============

.. note::

   Originally we wanted to do this as user units; however,
   there are issues with running them under non-interactive
   shells. For now, we'll be running as system units. Some user unit
   code is left in place in case we can switch back later.

All DevStack units are created as a part of the DevStack slice and
given the name ``devstack@$servicename.service``. This makes it easy
to understand which services are part of the devstack run, and lets us
disable / stop them in a single command.

Manipulating Units
==================

The examples below use the unit ``n-cpu``.

Enable a unit (allows it to be started)::

    sudo systemctl enable devstack@n-cpu.service

Disable a unit::

    sudo systemctl disable devstack@n-cpu.service

Start a unit::

    sudo systemctl start devstack@n-cpu.service

Stop a unit::

    sudo systemctl stop devstack@n-cpu.service

Restart a unit::

    sudo systemctl restart devstack@n-cpu.service

See status of a unit::

    sudo systemctl status devstack@n-cpu.service
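
In scripts it is often handier to test a unit's state than to read the
``status`` output. A minimal sketch, assuming a hypothetical shell helper
named ``devstack_is_active`` (it is not part of DevStack) built on
``systemctl is-active``::

    # Hypothetical helper: succeed only if the given DevStack service is
    # currently active. "--quiet" suppresses the textual state so that
    # only the exit status is used.
    devstack_is_active() {
        systemctl is-active --quiet "devstack@$1.service"
    }

For example, ``devstack_is_active n-cpu && echo running`` prints
``running`` only when the ``n-cpu`` unit is active.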

Operating on more than one unit at a time
-----------------------------------------

Systemd supports wildcarding for unit operations. To restart every
service in devstack you can do the following::

    sudo systemctl restart devstack@*

Or to see the status of all Nova processes you can do::

    sudo systemctl status devstack@n-*

We'll eventually make the unit names a bit more meaningful so that
it's easier to understand what you are restarting.
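
When only a handful of services need restarting after a code change,
``systemctl`` also accepts several unit names in one call. A minimal
sketch, assuming a hypothetical shell helper named ``devstack_restart``
(it is not part of DevStack) that expands short service names into unit
names::

    # Hypothetical helper: restart several DevStack services at once.
    # Usage: devstack_restart n-cpu n-api
    devstack_restart() {
        units=""
        for svc in "$@"; do
            units="$units devstack@$svc.service"
        done
        # $units is left unquoted on purpose so that each unit name is
        # passed to systemctl as a separate argument.
        sudo systemctl restart $units
    }

If you pass a wildcard pattern to ``systemctl`` directly, quoting it (for
example ``'devstack@n-*'``) keeps the shell from expanding it against
file names in the current directory.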

.. _journalctl-examples:

Querying Logs
=============

One of the other major things that comes with systemd is journald, a
consolidated way to access logs (including querying through structured
metadata). It is accessed through the ``journalctl`` command, which has
powerful query facilities. We'll start with some common options.

Follow logs for a specific service::

    sudo journalctl -f --unit devstack@n-cpu.service

Follow logs for multiple services simultaneously::

    sudo journalctl -f --unit devstack@n-cpu.service --unit devstack@n-cond.service

Or use wildcards to follow all the nova services::

    sudo journalctl -f --unit devstack@n-*

Use higher precision time stamps::

    sudo journalctl -f -o short-precise --unit devstack@n-cpu.service

By default, journalctl strips out "unprintable" characters, including
ANSI color codes. To keep the color codes (which can be interpreted by
an appropriate terminal/pager, e.g. ``less``, the default)::

    sudo journalctl -a --unit devstack@n-cpu.service

When outputting to the terminal using the default pager, long lines
will be truncated, but horizontal scrolling is supported via the
left/right arrow keys. You can override this by setting the
``SYSTEMD_LESS`` environment variable to e.g. ``FRXM``.

You can pipe the output to another tool, such as ``grep``. For
example, to find a server instance UUID in the nova logs::

    sudo journalctl -a --unit devstack@n-* | grep 58391b5c-036f-44d5-bd68-21d3c26349e6
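
Grepping the whole journal can be slow on a long-running system, so it
may help to limit the time window first. A minimal sketch, assuming a
hypothetical shell helper named ``devstack_logs_since`` (it is not part
of DevStack) that combines the options above with journalctl's
``--since`` and ``--no-pager`` options::

    # Hypothetical helper: print recent logs for every unit that matches
    # a pattern, without starting the pager.
    # Usage: devstack_logs_since PATTERN SINCE
    devstack_logs_since() {
        sudo journalctl --no-pager -o short-precise \
            --unit "devstack@$1" --since "$2"
    }

For example, ``devstack_logs_since 'n-*' '10 minutes ago'`` shows the
last ten minutes of output from the nova services, and its output can be
piped to ``grep`` in the same way as above.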

See ``man 1 journalctl`` for more.

Debugging
=========

Using pdb
---------

In order to break into a regular pdb session on a systemd-controlled
service, you need to invoke the process manually - that is, take it out
of systemd's control.

Discover the command systemd is using to run the service::

    systemctl show devstack@n-sch.service -p ExecStart --no-pager

Stop the systemd service::

    sudo systemctl stop devstack@n-sch.service

Inject your breakpoint in the source, e.g.::

    import pdb; pdb.set_trace()

Invoke the command manually::

    /usr/local/bin/nova-scheduler --config-file /etc/nova/nova.conf
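
The ``ExecStart`` property is printed as a structure rather than a bare
command line, so the command has to be picked out of it. A minimal
sketch, assuming a hypothetical shell helper named
``devstack_exec_start`` (it is not part of DevStack) and assuming the
``argv[]=... ; ignore_errors=...`` layout that ``systemctl show``
commonly prints, which may differ between systemd versions::

    # Hypothetical helper: print only the command line systemd uses to
    # run a DevStack service.
    # Assumes output of the form:
    #   ExecStart={ path=... ; argv[]=CMD ARGS ; ignore_errors=no ; ... }
    devstack_exec_start() {
        systemctl show "devstack@$1.service" -p ExecStart --no-pager |
            sed -n 's/.*argv\[]=\(.*\) ; ignore_errors=.*/\1/p'
    }

Running ``devstack_exec_start n-sch`` should then print the command to
invoke by hand. If nothing is printed, the layout assumption does not
hold on your system and the raw ``systemctl show`` output should be read
instead.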

Using remote-pdb
----------------

`remote-pdb`_ works while the process is under systemd control.

Make sure you have remote-pdb installed::

    sudo pip install remote-pdb

Inject your breakpoint in the source, e.g.::

    import remote_pdb; remote_pdb.set_trace()

Restart the relevant service::

    sudo systemctl restart devstack@n-api.service

The remote-pdb code configures the telnet port when ``set_trace()`` is
invoked. Do whatever it takes to hit the instrumented code path, and
inspect the logs for a message displaying the listening port::

    Sep 07 16:36:12 p8-100-neo devstack@n-api.service[772]: RemotePdb session open at 127.0.0.1:46771, waiting for connection ...

Telnet to that port to enter the pdb session::

    telnet 127.0.0.1 46771
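
Because the port changes on every run, it can be convenient to extract
it from the log rather than read it by eye. A minimal sketch, assuming a
hypothetical shell helper named ``remote_pdb_port`` (it is not part of
DevStack or remote-pdb) and assuming the log message keeps the format
shown above::

    # Hypothetical helper: read log lines on stdin and print the port
    # from the most recent "RemotePdb session open" message.
    remote_pdb_port() {
        sed -n 's/.*RemotePdb session open at [0-9.]*:\([0-9][0-9]*\),.*/\1/p' |
            tail -n 1
    }

It could be fed from the journal, for example with
``sudo journalctl --no-pager --unit devstack@n-api.service | remote_pdb_port``.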

See the `remote-pdb`_ home page for more options.

.. _`remote-pdb`: https://pypi.python.org/pypi/remote-pdb

Known Issues
============

Be careful about systemd python libraries. There are 3 of them on
PyPI, and they are all very different. They unfortunately all install
into the ``systemd`` namespace, which can cause some issues.

- ``systemd-python`` - this is the upstream maintained library, it has
  a version number like systemd itself (currently ``234``). This is
  the one you want.
- ``systemd`` - a python 3 only library, not what you want.
- ``python-systemd`` - another library you don't want. Installing it
  on a system will break ansible's ability to run.


If we were using user units, we would hit another problem: the
``Group=`` parameter in the ``[Service]`` section doesn't seem to work
with user units, even though the documentation says that it should.
This means that we would need to do an explicit ``/usr/bin/sg``. This
has the downside of making the SYSLOG_IDENTIFIER be ``sg``. We can
explicitly set that with ``SyslogIdentifier=``, but it's really
unfortunate that we would need this workaround. This is currently not
a problem because we're only using system units.

Future Work
===========

user units
----------

It would be great if we could run services as user units, so that there
is a clear separation of code being run as a non-root user, to ensure
that running as root never accidentally gets baked in as an assumption
to services. However, user units interact poorly with devstack-gate and
the way that commands are run as users with ansible and su.

Maybe someday we can figure that out.

References
==========

- Arch Linux Wiki - https://wiki.archlinux.org/index.php/Systemd/User
- Python interface to journald -
  https://www.freedesktop.org/software/systemd/python-systemd/journal.html
- Systemd documentation on service files -
  https://www.freedesktop.org/software/systemd/man/systemd.service.html
- Systemd documentation on exec (can be used to impact service runs) -
  https://www.freedesktop.org/software/systemd/man/systemd.exec.html