===========================
 Using Systemd in DevStack
===========================

By default, DevStack runs all of its services as systemd units.
Systemd is now the default init system for nearly every Linux
distro, and it encodes and solves many of the problems related to
poorly running processes.

Why this instead of screen?
===========================

The screen model for DevStack was invented when the number of services
that a DevStack user was going to run was typically fewer than 10. That
made it easy to jump around with screen hot keys. However, the
landscape has changed: a typical run now has at least 20 items, and
not all services can be stopped from screen because some run under
Apache.

There is also a common developer workflow of changing code in more
than one service and then needing to restart a bunch of services for
the changes to take effect.

Unit Structure
==============

.. note::

   Originally we wanted to do this as user units, but there are
   issues with running them under non-interactive shells. For now,
   we'll be running as system units. Some user unit code is left in
   place in case we can switch back later.

All DevStack units are created as part of the DevStack slice and are
given names of the form ``devstack@$servicename.service``. This makes
it easy to understand which services are part of the DevStack run, and
lets us disable or stop them in a single command.
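
The naming scheme above can be sketched in a few lines of Python. This
is only an illustration: ``unit_name`` and ``UNIT_TEMPLATE`` are
hypothetical names, not part of DevStack, and the validation is a
guess at what a sensible short service name looks like.

```python
# Hypothetical helper (not part of DevStack): map a short service name
# such as "n-cpu" to the unit name pattern described above.
UNIT_TEMPLATE = "devstack@{service}.service"


def unit_name(service: str) -> str:
    """Return the systemd unit name for a DevStack service."""
    # Assumption: short service names are non-empty and contain no "/".
    if not service or "/" in service:
        raise ValueError(f"unexpected service name: {service!r}")
    return UNIT_TEMPLATE.format(service=service)


print(unit_name("n-cpu"))
```

A helper like this keeps the ``devstack@`` prefix and ``.service``
suffix in one place when scripting around ``systemctl``.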

Manipulating Units
==================

The examples in this section use the unit ``n-cpu``.

Enable a unit (allows it to be started)::

    sudo systemctl enable devstack@n-cpu.service

Disable a unit::

    sudo systemctl disable devstack@n-cpu.service

Start a unit::

    sudo systemctl start devstack@n-cpu.service

Stop a unit::

    sudo systemctl stop devstack@n-cpu.service

Restart a unit::

    sudo systemctl restart devstack@n-cpu.service

See status of a unit::

    sudo systemctl status devstack@n-cpu.service

Operating on more than one unit at a time
-----------------------------------------

Systemd supports wildcarding for unit operations. To restart every
service in DevStack you can do the following::

    sudo systemctl restart devstack@*

Or to see the status of all Nova processes you can do::

    sudo systemctl status devstack@n-*

We'll eventually make the unit names a bit more meaningful so that
it's easier to understand what you are restarting.
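
To get a feel for which units a wildcard would select, the matching
can be approximated in Python with ``fnmatch``. This is a rough sketch,
not systemd's own matcher: the unit list is made up for illustration,
and ``matching_units`` is a hypothetical helper. As far as the
``systemctl`` man page describes it, patterns are matched against the
names of units that are currently loaded.

```python
from fnmatch import fnmatchcase

# Illustrative unit names only; a real system will have a different set.
LOADED_UNITS = (
    "devstack@n-api.service",
    "devstack@n-cpu.service",
    "devstack@g-api.service",
    "ssh.service",
)


def matching_units(pattern, units=LOADED_UNITS):
    """Return the unit names that match a shell-style glob pattern."""
    return [unit for unit in units if fnmatchcase(unit, pattern)]


for pattern in ("devstack@*", "devstack@n-*"):
    print(pattern, "->", matching_units(pattern))
```

When typing such patterns in an interactive shell, quoting them (for
example ``'devstack@n-*'``) keeps the shell from expanding the ``*``
against file names in the current directory.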
Sean Dague5edae542017-03-21 20:50:24 -040081
Sean Dague8b8441f2017-05-02 06:14:11 -040082.. _journalctl-examples:
83
Sean Dague5edae542017-03-21 20:50:24 -040084Querying Logs
85=============
86
87One of the other major things that comes with systemd is journald, a
88consolidated way to access logs (including querying through structured
89metadata). This is accessed by the user via ``journalctl`` command.
90
91
92Logs can be accessed through ``journalctl``. journalctl has powerful
93query facilities. We'll start with some common options.
94
95Follow logs for a specific service::
96
Matt Riedemann66a14df2017-09-22 20:51:38 -040097 sudo journalctl -f --unit devstack@n-cpu.service
Sean Dague5edae542017-03-21 20:50:24 -040098
99Following logs for multiple services simultaneously::
100
Matt Riedemann66a14df2017-09-22 20:51:38 -0400101 sudo journalctl -f --unit devstack@n-cpu.service --unit devstack@n-cond.service
Sean Dague5edae542017-03-21 20:50:24 -0400102
Sean Daguedef07b22017-03-30 07:18:49 -0400103or you can even do wild cards to follow all the nova services::
104
Matt Riedemann66a14df2017-09-22 20:51:38 -0400105 sudo journalctl -f --unit devstack@n-*
Sean Daguedef07b22017-03-30 07:18:49 -0400106
Sean Dague5edae542017-03-21 20:50:24 -0400107Use higher precision time stamps::
108
Matt Riedemann66a14df2017-09-22 20:51:38 -0400109 sudo journalctl -f -o short-precise --unit devstack@n-cpu.service

By default, journalctl strips out "unprintable" characters, including
ANSI color codes. To keep the color codes (which can be interpreted by
an appropriate terminal/pager, e.g. ``less``, the default)::

    sudo journalctl -a --unit devstack@n-cpu.service

When outputting to the terminal using the default pager, long lines
appear to be truncated, but horizontal scrolling is supported via the
left/right arrow keys.

You can pipe the output to another tool, such as ``grep``. For
example, to find a server instance UUID in the Nova logs::

    sudo journalctl -a --unit devstack@n-* | grep 58391b5c-036f-44d5-bd68-21d3c26349e6

See ``man 1 journalctl`` for more.
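
The options shown in this section combine freely, which a small Python
sketch can make explicit. ``journalctl_command`` is a hypothetical
helper written for this page; it only assembles an argument list from
the flags used above (``-f``, ``-a``, ``-o`` and ``--unit``) and does
not run anything.

```python
import shlex


def journalctl_command(units, follow=False, output=None, show_all=False):
    """Build a journalctl argument list for one or more units."""
    cmd = ["sudo", "journalctl"]
    if follow:
        cmd.append("-f")           # keep following new log entries
    if show_all:
        cmd.append("-a")           # keep unprintable characters, e.g. color codes
    if output:
        cmd += ["-o", output]      # e.g. "short-precise"
    for unit in units:
        cmd += ["--unit", unit]    # --unit may be repeated
    return cmd


cmd = journalctl_command(
    ["devstack@n-cpu.service", "devstack@n-cond.service"],
    follow=True,
    output="short-precise",
)
print(shlex.join(cmd))
```

The resulting list can be handed to ``subprocess.run`` as is. Because
no shell is involved in that case, a pattern such as ``devstack@n-*``
reaches journalctl unchanged.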

Debugging
=========

Using pdb
---------

In order to break into a regular pdb session on a systemd-controlled
service, you need to invoke the process manually, that is, take it out
of systemd's control.

Discover the command systemd is using to run the service::

    systemctl show devstack@n-sch.service -p ExecStart --no-pager

Stop the systemd service::

    sudo systemctl stop devstack@n-sch.service

Inject your breakpoint in the source, e.g.::

    import pdb; pdb.set_trace()

Invoke the command manually::

    /usr/local/bin/nova-scheduler --config-file /etc/nova/nova.conf
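
The ``ExecStart`` property is printed as a braced record rather than a
bare command line, so the command has to be picked out of it. The
sketch below shows one way to do that in Python. The sample text is an
assumption about what the output typically looks like (fields can vary
between systemd versions), and ``exec_start_argv`` is a hypothetical
helper.

```python
import re
import shlex

# Assumed shape of `systemctl show -p ExecStart` output; illustrative only.
SAMPLE = (
    "ExecStart={ path=/usr/local/bin/nova-scheduler ; "
    "argv[]=/usr/local/bin/nova-scheduler --config-file /etc/nova/nova.conf ; "
    "ignore_errors=no ; start_time=[n/a] ; stop_time=[n/a] ; "
    "pid=0 ; code=(null) ; status=0/0 }"
)

# Capture everything after "argv[]=" up to the next " ; " separator.
ARGV_RE = re.compile(r"argv\[\]=(?P<argv>.*?) ; ")


def exec_start_argv(show_output):
    """Extract the command and its arguments from an ExecStart record."""
    match = ARGV_RE.search(show_output)
    if match is None:
        raise ValueError("no argv[] field found")
    # Splitting on whitespace rules is a simplification: arguments that
    # themselves contain spaces may not round-trip correctly.
    return shlex.split(match.group("argv"))


print(exec_start_argv(SAMPLE))
```

The extracted list is what you would run by hand after stopping the
unit.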

Using remote-pdb
----------------

`remote-pdb`_ works while the process is under systemd control.

Make sure you have remote-pdb installed::

    sudo pip install remote-pdb

Inject your breakpoint in the source, e.g.::

    import remote_pdb; remote_pdb.set_trace()

Restart the relevant service::

    sudo systemctl restart devstack@n-api.service

The remote-pdb code configures the telnet port when ``set_trace()`` is
invoked. Do whatever it takes to hit the instrumented code path, and
inspect the logs for a message displaying the listening port::

    Sep 07 16:36:12 p8-100-neo devstack@n-api.service[772]: RemotePdb session open at 127.0.0.1:46771, waiting for connection ...

Telnet to that port to enter the pdb session::

    telnet 127.0.0.1 46771

See the `remote-pdb`_ home page for more options.
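
Because the port changes on each run, it can be handy to pull it out of
the log line programmatically. The following is a minimal sketch that
assumes the message format shown in the log excerpt above;
``remote_pdb_endpoint`` is a hypothetical helper, not part of
remote-pdb.

```python
import re

# Log line copied from the example above.
LOG_LINE = (
    "Sep 07 16:36:12 p8-100-neo devstack@n-api.service[772]: "
    "RemotePdb session open at 127.0.0.1:46771, waiting for connection ..."
)

SESSION_RE = re.compile(
    r"RemotePdb session open at (?P<host>[\d.]+):(?P<port>\d+)"
)


def remote_pdb_endpoint(line):
    """Return (host, port) from a RemotePdb log line, or None if absent."""
    match = SESSION_RE.search(line)
    if match is None:
        return None
    return match.group("host"), int(match.group("port"))


print(remote_pdb_endpoint(LOG_LINE))
```

The returned pair is what you would pass to ``telnet``.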

.. _`remote-pdb`: https://pypi.python.org/pypi/remote-pdb

Known Issues
============

Be careful about systemd Python libraries. There are three of them on
PyPI, and they are all very different. Unfortunately, they all install
into the ``systemd`` namespace, which can cause some issues.

- ``systemd-python`` - this is the upstream maintained library. It has
  a version number like systemd itself (``234`` at the time of
  writing). This is the one you want.
- ``systemd`` - a Python 3 only library, not what you want.
- ``python-systemd`` - another library you don't want. Installing it
  on a system will break Ansible's ability to run.
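
A quick way to see which of these distributions is present is to look
at the installed package metadata. The sketch below is an assumption
about how one might check, not something DevStack does:
``check_systemd_bindings`` and ``installed_distribution_names`` are
hypothetical helpers built on the standard ``importlib.metadata``
module.

```python
from importlib import metadata

WANTED = "systemd-python"
UNWANTED = {"systemd", "python-systemd"}


def check_systemd_bindings(installed_names):
    """Report which of the three systemd distributions are in a name list."""
    # Normalize case and underscores so "Systemd_Python" also matches.
    names = {name.lower().replace("_", "-") for name in installed_names}
    return {
        "wanted_present": WANTED in names,
        "unwanted": sorted(names & UNWANTED),
    }


def installed_distribution_names():
    """List the names of distributions visible to this interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken or missing metadata
            names.append(name)
    return names


print(check_systemd_bindings(["pip", "systemd-python"]))
```

Calling ``check_systemd_bindings(installed_distribution_names())``
applies the same check to the current environment.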

The ``Group=`` parameter in the ``[Service]`` section doesn't seem to
work with user units, even though the documentation says that it
should. If we were using user units, we would need to do an explicit
``/usr/bin/sg``. This has the downside of making the
SYSLOG_IDENTIFIER be ``sg``. We can explicitly set that with
``SyslogIdentifier=``, but it's unfortunate that this workaround would
be needed. This is currently not a problem because we're only using
system units.

Future Work
===========

user units
----------

It would be great if we could run services as user units, so that code
is clearly running as a non-root user and running as root never
accidentally gets baked in as an assumption of the services. However,
user units interact poorly with devstack-gate and the way that
commands are run as users with Ansible and su.

Maybe someday we can figure that out.

References
==========

- Arch Linux Wiki - https://wiki.archlinux.org/index.php/Systemd/User
- Python interface to journald -
  https://www.freedesktop.org/software/systemd/python-systemd/journal.html
- Systemd documentation on service files -
  https://www.freedesktop.org/software/systemd/man/systemd.service.html
- Systemd documentation on exec (can be used to impact service runs) -
  https://www.freedesktop.org/software/systemd/man/systemd.exec.html