===========================
 Using Systemd in DevStack
===========================

By default, DevStack runs all of its services as systemd units.
Systemd is now the default init system for nearly every Linux
distro, and it encodes and solves many of the problems related to
poorly running processes.

Why this instead of screen?
===========================

The screen model for DevStack was invented when the number of services
that a DevStack user was going to run was typically fewer than 10. This
made it very easy to jump around with screen hot keys. However, the
landscape has changed: not all services can be stopped in screen,
because some run under Apache, and there are now typically at least 20
items.

There is also a common developer workflow of changing code in more
than one service and then needing to restart a bunch of services for
that to take effect.

Unit Structure
==============

.. note::

   Originally we wanted to do this as user units; however, there are
   issues with running user units under non-interactive shells. For
   now, we'll be running as system units. Some user unit code is left
   in place in case we can switch back later.

All DevStack units are created as a part of the DevStack slice and
given the name ``devstack@$servicename.service``. This makes it easy
to understand which services are part of the DevStack run, and lets us
disable / stop them in a single command.

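The naming scheme above can be captured in a pair of small shell
helpers. This is only a sketch: the function names ``devstack_unit`` and
``devstack_service`` are hypothetical and are not part of DevStack.

```shell
# Hypothetical helpers (not part of DevStack) for the naming scheme
# devstack@$servicename.service described above.

# Print the unit name for a service: n-cpu -> devstack@n-cpu.service
devstack_unit() {
    printf 'devstack@%s.service\n' "$1"
}

# Print the service name for a unit: devstack@n-cpu.service -> n-cpu
devstack_service() {
    name=${1#devstack@}               # strip the leading "devstack@"
    printf '%s\n' "${name%.service}"  # strip the trailing ".service"
}
```

With these defined, ``sudo systemctl status "$(devstack_unit n-cpu)"``
is equivalent to spelling out the unit name by hand.
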
Manipulating Units
==================

The examples below use the unit ``n-cpu``.

Enable a unit (allows it to be started)::

  sudo systemctl enable devstack@n-cpu.service

Disable a unit::

  sudo systemctl disable devstack@n-cpu.service

Start a unit::

  sudo systemctl start devstack@n-cpu.service

Stop a unit::

  sudo systemctl stop devstack@n-cpu.service

Restart a unit::

  sudo systemctl restart devstack@n-cpu.service

See status of a unit::

  sudo systemctl status devstack@n-cpu.service

Operating on more than one unit at a time
-----------------------------------------

Systemd supports wildcarding for unit operations. To restart every
service in DevStack, you can do the following::

  sudo systemctl restart devstack@*

Or to see the status of all Nova processes, you can do::

  sudo systemctl status devstack@n-*

We'll eventually make the unit names a bit more meaningful so that
it's easier to understand what you are restarting.

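One caveat with the wildcard forms: systemctl matches the pattern itself
against the units it currently has loaded, so the pattern has to reach
systemctl unexpanded. An unquoted pattern is first offered to the shell,
which replaces it with any matching file names in the current directory.
The sketch below shows the difference using ``echo`` in a throwaway
directory; the function name ``show_glob_quoting`` and the file name
``devstack@n-stray`` are made up for illustration.

```shell
# Hypothetical demonstration (not part of DevStack): show how the shell
# treats an unquoted versus a quoted unit pattern.
show_glob_quoting() (
    # Work in a temporary directory containing one file whose name
    # happens to match the pattern.
    demo_dir=$(mktemp -d) || exit 1
    cd "$demo_dir" || exit 1
    touch 'devstack@n-stray'

    # Unquoted: the shell substitutes the matching file name.
    echo "unquoted: $(echo devstack@n-*)"
    # Quoted: the pattern is passed through unchanged.
    echo "quoted: $(echo 'devstack@n-*')"

    # Clean up the temporary directory.
    cd / && rm -rf "$demo_dir"
)
```

Calling ``show_glob_quoting`` prints ``unquoted: devstack@n-stray``
followed by ``quoted: devstack@n-*``. In most shells an unquoted pattern
that matches no files is passed through unchanged, which is why the
commands above usually work; quoting the pattern, as in
``sudo systemctl restart 'devstack@n-*'``, removes the dependence on the
current directory.
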
.. _journalctl-examples:

Querying Logs
=============

One of the other major things that comes with systemd is journald, a
consolidated way to access logs (including querying through structured
metadata). Logs are accessed with the ``journalctl`` command, which has
powerful query facilities. We'll start with some common options.

Follow logs for a specific service::

  sudo journalctl -f --unit devstack@n-cpu.service

Follow logs for multiple services simultaneously::

  sudo journalctl -f --unit devstack@n-cpu.service --unit devstack@n-cond.service

Or use wildcards to follow all the Nova services::

  sudo journalctl -f --unit devstack@n-*

Use higher precision time stamps::

  sudo journalctl -f -o short-precise --unit devstack@n-cpu.service

By default, journalctl strips out "unprintable" characters, including
ANSI color codes. To keep the color codes (which can be interpreted by
an appropriate terminal/pager, e.g. ``less``, the default)::

  sudo journalctl -a --unit devstack@n-cpu.service

When outputting to the terminal using the default pager, long lines
appear to be truncated, but horizontal scrolling is supported via the
left/right arrow keys.

See ``man 1 journalctl`` for more.

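The multi-service form shown earlier gets long quickly, so it can be
wrapped in a shell function that takes short service names. This is a
sketch under assumptions: the function name ``devstack_follow`` and the
``JOURNALCTL`` override variable are hypothetical, added only so the
command can be swapped for ``echo`` as a dry run.

```shell
# Hypothetical helper (not part of DevStack): follow the journal for one
# or more DevStack services given by short name, e.g. n-cpu n-cond.
# JOURNALCTL is an assumed override hook; set it to "echo" for a dry run.
devstack_follow() {
    # Replace each service name in "$@" with "--unit devstack@NAME.service".
    # The loop's word list is expanded once, before "$@" is modified.
    for svc in "$@"; do
        set -- "$@" --unit "devstack@${svc}.service"
        shift
    done
    # Left unquoted on purpose so "sudo journalctl" splits into two words.
    ${JOURNALCTL:-sudo journalctl} -f -o short-precise "$@"
}
```

For example, ``devstack_follow n-cpu n-cond`` follows both services with
high precision time stamps, and ``JOURNALCTL=echo devstack_follow n-cpu``
prints the arguments that would be passed instead of running anything.
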
Debugging
=========

Using pdb
---------

In order to break into a regular pdb session on a systemd-controlled
service, you need to invoke the process manually - that is, take it out
of systemd's control.

Discover the command systemd is using to run the service::

  systemctl show devstack@n-sch.service -p ExecStart --no-pager

Stop the systemd service::

  sudo systemctl stop devstack@n-sch.service

Inject your breakpoint in the source, e.g.::

  import pdb; pdb.set_trace()

Invoke the command manually::

  /usr/local/bin/nova-scheduler --config-file /etc/nova/nova.conf

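The ``ExecStart`` property printed by ``systemctl show`` wraps the
command line in extra bookkeeping fields. A sketch of pulling out just
the command follows. The function name ``execstart_argv`` is
hypothetical, and the ``sed`` expression assumes the
``argv[]=... ; ignore_errors=`` layout that systemd prints for this
property, which may differ between systemd versions.

```shell
# Hypothetical helper (not part of DevStack): read the output of
# "systemctl show -p ExecStart" on stdin and print only the command line.
# Assumed layout: ExecStart={ path=... ; argv[]=CMD ; ignore_errors=... }
execstart_argv() {
    sed -n 's/.*argv\[\]=\(.*\) ; ignore_errors=.*/\1/p'
}
```

Used as
``systemctl show devstack@n-sch.service -p ExecStart --no-pager | execstart_argv``,
this prints the command to run by hand once the unit has been stopped.
If the output is empty, the layout assumption does not hold and the
property should be read by eye instead.
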
Using remote-pdb
----------------

`remote-pdb`_ works while the process is under systemd control.

Make sure you have remote-pdb installed::

  sudo pip install remote-pdb

Inject your breakpoint in the source, e.g.::

  import remote_pdb; remote_pdb.set_trace()

Restart the relevant service::

  sudo systemctl restart devstack@n-api.service

The remote-pdb code configures the telnet port when ``set_trace()`` is
invoked. Do whatever it takes to hit the instrumented code path, and
inspect the logs for a message displaying the listening port::

  Sep 07 16:36:12 p8-100-neo devstack@n-api.service[772]: RemotePdb session open at 127.0.0.1:46771, waiting for connection ...

Telnet to that port to enter the pdb session::

  telnet 127.0.0.1 46771

See the `remote-pdb`_ home page for more options.

.. _`remote-pdb`: https://pypi.python.org/pypi/remote-pdb

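Because the port changes each time ``set_trace()`` is hit, it can be
convenient to extract it from the journal rather than read it by eye.
The following is a sketch that assumes the log message has the same
shape as the example line above; the function name
``remote_pdb_endpoint`` is hypothetical.

```shell
# Hypothetical helper (not part of DevStack): read journal lines on stdin
# and print "HOST PORT" for the most recent RemotePdb announcement.
# Assumes the message format "RemotePdb session open at HOST:PORT, ...".
remote_pdb_endpoint() {
    sed -n 's/.*RemotePdb session open at \([0-9.]*\):\([0-9][0-9]*\),.*/\1 \2/p' |
        tail -n 1
}
```

For example,
``sudo journalctl --unit devstack@n-api.service --no-pager | remote_pdb_endpoint``
prints the host and port to pass to ``telnet``.
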
Known Issues
============

Be careful about systemd Python libraries. There are 3 of them on
PyPI, and they are all very different. They unfortunately all install
into the ``systemd`` namespace, which can cause some issues.

- ``systemd-python`` - this is the upstream maintained library, it has
  a version number like systemd itself (currently ``234``). This is
  the one you want.
- ``systemd`` - a Python 3 only library, not what you want.
- ``python-systemd`` - another library you don't want. Installing it
  on a system will break Ansible's ability to run.

With user units, the ``Group=`` parameter in the ``[Service]`` section
doesn't seem to work, even though the documentation says that it
should. This means that we would need to do an explicit
``/usr/bin/sg``. This has the downside of making the SYSLOG_IDENTIFIER
be ``sg``. We can explicitly set that with ``SyslogIdentifier=``, but
it's really unfortunate that we would need this workaround. This is
currently not a problem because we're only using system units.

Future Work
===========

user units
----------

It would be great if we could do services as user units, so that there
is a clear separation of code being run as non-root, to ensure running
as root never accidentally gets baked in as an assumption to
services. However, user units interact poorly with devstack-gate and
the way that commands are run as users with ansible and su.

Maybe someday we can figure that out.

References
==========

- Arch Linux Wiki - https://wiki.archlinux.org/index.php/Systemd/User
- Python interface to journald -
  https://www.freedesktop.org/software/systemd/python-systemd/journal.html
- Systemd documentation on service files -
  https://www.freedesktop.org/software/systemd/man/systemd.service.html
- Systemd documentation on exec (can be used to impact service runs) -
  https://www.freedesktop.org/software/systemd/man/systemd.exec.html