mirror of /repos/baseimage-docker.git synced 2025-12-30 08:01:31 +01:00

Improve the init system: support skipping startup files and running a custom main command

Hongli Lai (Phusion) 2014-02-15 10:17:25 +01:00
parent d884118827
commit b6dac86e04
3 changed files with 247 additions and 80 deletions


@@ -1,6 +1,9 @@
## 0.9.6
* Fixed a bug in `my_init`: child processes that have been adopted during execution of init scripts are now properly reaped.
* Much improved `my_init`:
  * It is now possible to run and watch a custom command, possibly in addition to running runit. See "Running a one-shot command in the container" in the README.
  * It is now possible to skip running startup files such as /etc/rc.local.
## 0.9.5 (release date: 2014-02-06)


@@ -40,6 +40,7 @@ You can configure the stock `ubuntu` image yourself from your Dockerfile, so why
* [Getting started](#getting_started)
* [Adding additional daemons](#adding_additional_daemons)
* [Running scripts during container startup](#running_startup_scripts)
* [Running a one-shot command in the container](#oneshot)
* [Login to the container via SSH](#login)
* [Building the image yourself](#building)
* [Conclusion](#conclusion)
@@ -154,6 +155,50 @@ The following example shows how you can add a startup script. This script simply
RUN mkdir -p /etc/my_init.d
ADD logtime.sh /etc/my_init.d/logtime.sh
<a name="oneshot"></a>
### Running a one-shot command in the container
Normally, when you want to run a single command in a container, and exit immediately after the command, you invoke Docker like this:
docker run YOUR_IMAGE COMMAND ARGUMENTS...
However, the downside of this approach is that the init system is not started. That is, while `COMMAND` runs, important daemons such as cron and syslog are not running. Also, orphaned child processes are not properly reaped, because `COMMAND` is PID 1.
Baseimage-docker provides a facility to run a single one-shot command, while solving all of the aforementioned problems. Run a single command in the following manner:
docker run YOUR_IMAGE /sbin/my_init -- COMMAND ARGUMENTS ...
This will perform the following:
* Runs all system startup files, such as /etc/my_init.d/* and /etc/rc.local.
* Starts all runit services.
* Runs the specified command.
* When the specified command exits, stops all runit services.
For example:
$ docker run phusion/baseimage:<VERSION> /sbin/my_init -- ls
*** Running /etc/my_init.d/00_regen_ssh_host_keys.sh...
No SSH host key available. Generating one...
Creating SSH2 RSA key; this may take some time ...
Creating SSH2 DSA key; this may take some time ...
Creating SSH2 ECDSA key; this may take some time ...
*** Running /etc/rc.local...
*** Booting runit daemon...
*** Runit started as PID 80
*** Running ls...
bin boot dev etc home image lib lib64 media mnt opt proc root run sbin selinux srv sys tmp usr var
*** ls exited with exit code 0.
*** Shutting down runit daemon (PID 80)...
*** Killing all processes...
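The "run the command, then wait for it while reaping other children" step can be illustrated with a minimal Python 3 sketch. It is modelled on `waitpid_reap_other_children` from `my_init`, but the names `wait_and_reap` and `run_one_shot` are hypothetical, and unlike `my_init` it decodes the raw wait status into a shell-style exit code.

```python
import os


def wait_and_reap(target_pid):
    """Wait for target_pid, reaping any other children that exit first."""
    while True:
        # -1 waits for any child; when running as PID 1 this also
        # collects orphans that the kernel has re-parented to us.
        pid, status = os.waitpid(-1, 0)
        if pid == target_pid:
            return status


def run_one_shot(argv):
    """Spawn argv, wait for it, and return a shell-style exit code."""
    pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)
    status = wait_and_reap(pid)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)  # the command was killed by a signal
```

Calling `run_one_shot(["ls"])` would block until `ls` exits while quietly collecting any unrelated children that terminate in the meantime, which is the property a plain `docker run YOUR_IMAGE COMMAND` lacks.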
You may find that the default invocation is too noisy. Or perhaps you don't want to run the startup files. You can customize all this by passing arguments to `my_init`. Invoke `docker run YOUR_IMAGE /sbin/my_init --help` for more information.
The following example runs `ls` without running the startup files and with fewer messages, while still running all runit services:
$ docker run phusion/baseimage:<VERSION> /sbin/my_init --skip-startup-files --quiet -- ls
bin boot dev etc home image lib lib64 media mnt opt proc root run sbin selinux srv sys tmp usr var
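To show how the `--` separator keeps `my_init` options apart from the command's own arguments, here is a simplified Python 3 sketch of the option parsing. It assumes only the `--skip-startup-files` and `--quiet` flags, uses plain booleans instead of the log-level constant the real script stores, and the `build_parser` helper is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a reduced version of the my_init command-line parser."""
    parser = argparse.ArgumentParser(description="Initialize the system.")
    parser.add_argument("main_command", metavar="MAIN_COMMAND", nargs="*",
                        help="The main command to run. (default: runit)")
    parser.add_argument("--skip-startup-files", action="store_true",
                        help="Skip running /etc/my_init.d/* and /etc/rc.local")
    parser.add_argument("--quiet", action="store_true",
                        help="Only print warnings and errors")
    return parser


# Everything after "--" is treated as the main command, even "-l".
args = build_parser().parse_args(
    ["--skip-startup-files", "--quiet", "--", "ls", "-l"])
print(args.skip_startup_files, args.quiet, args.main_command)
```

Without the `--`, argparse would try to interpret `-l` as an option of `my_init` itself and reject it, which is why the README examples always place `--` before the command.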
<a name="login"></a>
### Login to the container via SSH


@@ -1,8 +1,42 @@
#!/usr/bin/python2
import os, sys, stat, signal, errno
import os, sys, stat, signal, errno, argparse, time
pid = None
status = None
KILL_PROCESS_TIMEOUT = 5
KILL_ALL_PROCESSES_TIMEOUT = 5
LOG_LEVEL_ERROR = 1
LOG_LEVEL_WARN = 1
LOG_LEVEL_INFO = 2
LOG_LEVEL_DEBUG = 3
log_level = None
class AlarmException(Exception):
pass
def error(message):
if log_level >= LOG_LEVEL_ERROR:
sys.stderr.write("*** %s\n" % message)
def warn(message):
if log_level >= LOG_LEVEL_WARN:
print("*** %s" % message)
def info(message):
if log_level >= LOG_LEVEL_INFO:
print("*** %s" % message)
def debug(message):
if log_level >= LOG_LEVEL_DEBUG:
print("*** %s" % message)
def ignore_signals_and_raise_keyboard_interrupt(signame):
signal.signal(signal.SIGTERM, signal.SIG_IGN)
signal.signal(signal.SIGINT, signal.SIG_IGN)
raise KeyboardInterrupt(signame)
def raise_alarm_exception():
raise AlarmException('Alarm')
def listdir(path):
try:
@@ -20,100 +54,185 @@ def is_exe(path):
except OSError:
return False
def reap_child(signum, frame):
global pid, status, waiting_for_runit
try:
result = os.wait3(os.WNOHANG)
if result is not None and pid == result[0]:
status = result[1]
except OSError:
pass
def waitpid_reap_other_children(pid):
done = False
status = None
while not done:
this_pid, status = os.waitpid(-1, 0)
done = this_pid == pid
return status
def stop_child_process(name, pid):
print("*** Shutting down %s (PID %d)..." % (name, pid))
def stop_child_process(name, pid, signo = signal.SIGTERM, time_limit = KILL_PROCESS_TIMEOUT):
info("Shutting down %s (PID %d)..." % (name, pid))
try:
os.kill(pid, signal.SIGHUP)
os.kill(pid, signo)
except OSError:
pass
signal.alarm(time_limit)
try:
try:
waitpid_reap_other_children(pid)
except OSError:
pass
except AlarmException:
warn("%s (PID %d) did not shut down in time. Forcing it to exit." % (name, pid))
try:
os.kill(pid, signal.SIGKILL)
except OSError:
pass
try:
waitpid_reap_other_children(pid)
except OSError:
pass
finally:
signal.alarm(0)
def run_command_killable(*argv):
global pid
filename = argv[0]
status = None
pid = os.spawnvp(os.P_NOWAIT, filename, argv)
signal.signal(signal.SIGINT, lambda signum, frame: stop_child_process(filename, pid))
signal.signal(signal.SIGTERM, lambda signum, frame: stop_child_process(filename, pid))
try:
status = waitpid_reap_other_children(pid)
except BaseException as s:
warn("An error occurred. Aborting.")
stop_child_process(filename, pid)
raise
if status != 0:
error("%s failed with exit code %d" % (filename, status))
sys.exit(1)
def kill_all_processes(time_limit):
info("Killing all processes...")
try:
os.kill(-1, signal.SIGTERM)
except OSError:
pass
signal.alarm(time_limit)
try:
# Wait until no more child processes exist.
done = False
while not done:
try:
this_pid, status = os.waitpid(-1, 0)
done = this_pid == pid
os.waitpid(-1, 0)
except OSError as e:
if e.errno == errno.EINTR:
sys.exit(2)
if e.errno == errno.ECHILD:
done = True
else:
raise
finally:
signal.signal(signal.SIGINT, signal.SIG_DFL)
signal.signal(signal.SIGTERM, signal.SIG_DFL)
if status != 0:
sys.stderr.write("*** %s failed with exit code %d\n" % (filename, status))
sys.exit(1)
# Run /etc/my_init.d/*
for name in listdir("/etc/my_init.d"):
filename = "/etc/my_init.d/" + name
if is_exe(filename):
print("*** Running %s..." % filename)
run_command_killable(filename)
# Run /etc/rc.local.
if is_exe("/etc/rc.local"):
print("*** Running /etc/rc.local...")
run_command_killable("/etc/rc.local")
# Start runit.
signal.signal(signal.SIGCHLD, reap_child)
print("*** Booting runit...")
pid = os.spawnl(os.P_NOWAIT, "/usr/bin/runsvdir", "/usr/bin/runsvdir", "-P", "/etc/service", "log: %s" % ('.' * 395))
print("*** Runit started as PID %d" % pid)
signal.signal(signal.SIGTERM, lambda signum, frame: stop_child_process("runit", pid))
# Wait for runit, and while waiting, reap any adopted orphans.
done = False
while not done:
try:
this_pid, status = os.waitpid(pid, 0)
done = True
except OSError as e:
if e.errno == errno.EINTR:
# Try again
except AlarmException:
warn("Not all processes have exited in time. Forcing them to exit.")
try:
os.kill(-1, signal.SIGKILL)
except OSError:
pass
finally:
signal.alarm(0)
def run_startup_files():
# Run /etc/my_init.d/*
for name in listdir("/etc/my_init.d"):
filename = "/etc/my_init.d/" + name
if is_exe(filename):
info("Running %s..." % filename)
run_command_killable(filename)
# Run /etc/rc.local.
if is_exe("/etc/rc.local"):
info("Running /etc/rc.local...")
run_command_killable("/etc/rc.local")
def start_runit():
info("Booting runit daemon...")
pid = os.spawnl(os.P_NOWAIT, "/usr/bin/runsvdir", "/usr/bin/runsvdir",
"-P", "/etc/service", "log: %s" % ('.' * 395))
info("Runit started as PID %d" % pid)
return pid
def wait_for_runit_or_interrupt(pid):
try:
status = waitpid_reap_other_children(pid)
return (True, status)
except KeyboardInterrupt:
return (False, None)
def shutdown_runit_services():
debug("Begin shutting down runit services...")
os.system("/usr/bin/sv down /etc/service/*")
def wait_for_runit_services():
debug("Waiting for runit services to exit...")
done = False
while not done:
done = os.system("/usr/bin/sv status /etc/service/* | grep -q '^run:'") != 0
if not done:
time.sleep(0.1)
def main(args):
if not args.skip_startup_files:
run_startup_files()
runit_exited = False
exit_code = None
if not args.skip_runit:
runit_pid = start_runit()
try:
if len(args.main_command) == 0:
runit_exited, exit_code = wait_for_runit_or_interrupt(runit_pid)
if runit_exited:
info("Runit exited with code %d" % exit_code)
else:
# The SIGCHLD handler probably caught it.
done = True
info("Running %s..." % " ".join(args.main_command))
pid = os.spawnvp(os.P_NOWAIT, args.main_command[0], args.main_command)
try:
exit_code = waitpid_reap_other_children(pid)
info("%s exited with exit code %d." % (args.main_command[0], exit_code))
except KeyboardInterrupt:
stop_child_process(args.main_command[0], pid)
except BaseException as s:
warn("An error occurred. Aborting.")
stop_child_process(args.main_command[0], pid)
raise
sys.exit(exit_code)
finally:
if not args.skip_runit:
shutdown_runit_services()
if not runit_exited:
stop_child_process("runit daemon", runit_pid)
wait_for_runit_services()
# Runit has exited. Reset signal handlers.
print("*** Runit exited with code %s. Waiting for all services to shut down..." % status)
signal.signal(signal.SIGCHLD, signal.SIG_DFL)
signal.signal(signal.SIGTERM, signal.SIG_DFL)
signal.siginterrupt(signal.SIGCHLD, False)
signal.siginterrupt(signal.SIGTERM, False)
# Parse options.
parser = argparse.ArgumentParser(description = 'Initialize the system.')
parser.add_argument('main_command', metavar = 'MAIN_COMMAND', type = str, nargs = '*',
help = 'The main command to run. (default: runit)')
parser.add_argument('--skip-startup-files', dest = 'skip_startup_files',
action = 'store_const', const = True, default = False,
help = 'Skip running /etc/my_init.d/* and /etc/rc.local')
parser.add_argument('--skip-runit', dest = 'skip_runit',
action = 'store_const', const = True, default = False,
help = 'Do not run runit services')
parser.add_argument('--no-kill-all-on-exit', dest = 'kill_all_on_exit',
action = 'store_const', const = False, default = True,
help = 'Don\'t kill all processes on the system upon exiting')
parser.add_argument('--quiet', dest = 'log_level',
action = 'store_const', const = LOG_LEVEL_WARN, default = LOG_LEVEL_INFO,
help = 'Only print warnings and errors')
args = parser.parse_args()
log_level = args.log_level
# Wait at most 5 seconds for services to shut down.
import time
def shutdown(signum = None, frame = None):
global status
if status is not None:
sys.exit(status)
signal.signal(signal.SIGALRM, shutdown)
signal.alarm(5)
done = False
while not done:
done = os.system("/usr/bin/sv status /etc/service/* | grep -q '^run:'") != 0
if not done:
time.sleep(0.5)
shutdown()
if args.skip_runit and len(args.main_command) == 0:
error("When --skip-runit is given, you must also pass a main command.")
sys.exit(1)
# Run main function.
signal.signal(signal.SIGTERM, lambda signum, frame: ignore_signals_and_raise_keyboard_interrupt('SIGTERM'))
signal.signal(signal.SIGINT, lambda signum, frame: ignore_signals_and_raise_keyboard_interrupt('SIGINT'))
signal.signal(signal.SIGALRM, lambda signum, frame: raise_alarm_exception())
try:
main(args)
except KeyboardInterrupt:
warn("Init system aborted.")
exit(2)
finally:
if args.kill_all_on_exit:
kill_all_processes(KILL_ALL_PROCESSES_TIMEOUT)
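The escalation used by `stop_child_process` above (send a polite signal, wait up to a time limit, then force the process to exit) can be sketched in Python 3 as follows. This is not the script's own mechanism: the script relies on `signal.alarm` and an `AlarmException`, whereas this sketch swaps in `subprocess.Popen.wait(timeout=...)`, and `stop_process` is a hypothetical helper name.

```python
import subprocess


def stop_process(proc, time_limit=5):
    """Ask proc to exit with SIGTERM; SIGKILL it if it outlives time_limit.

    Returns True if the process had to be killed forcibly.
    """
    proc.terminate()  # sends SIGTERM, which the process may handle or ignore
    try:
        proc.wait(timeout=time_limit)
        return False
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL, which cannot be caught or ignored
        proc.wait()  # reap the process so it does not linger as a zombie
        return True
```

A caller might start a service with `subprocess.Popen(...)` and later pass the resulting object to `stop_process`, using the return value to decide whether to log a warning similar to the "did not shut down in time" message in the script.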