* Add systemd-journal-remote to the image
This allows to access journald's log from within Supervisor and expose
more system logs to users.
* Allow to access systemd-journal-gatewayd from Supervisor
Create a systemd-journal-gatewayd.socket service using a Unix socket and
bind mount it into the Supervisor container. This allows to query
systemd-journald from Supervisor directly.
* Avoid duplicate log entries
So far the hassos-supervisor.service starts the hassos-supervisor script
which in turn attaches to the Supervisor container. This causes stdout
and stderr to be forwarded to the service unit, which in turn logs it in
the journal.
However, Docker too logs all stdout/stderr to the journal through the
systemd-journald log driver.
Do not attach to the Supervisor container to avoid logging the
Supervisor twice.
Note that this no longer forwards signals to the container. However, the
hassos-supervisor.service uses the ExecStop= setting to make sure the
container gets gracefully stopped.
* Use image and container name as syslog identifier
By default Docker users the container id as syslog identifier. This
leads to log messages which cannot easily be attributed to a particular
container (since the container id is a random hex string).
Use the image and container name as syslog identifier.
Note that the Docker journald log driver still stores the container id
as a separate field (CONTAINER_ID), in case the particular instance need
to be tracked.
* Recreate Supervisor container on OS upgrade/downgrade
When the operating system gets upgraded or downgraded the Supervisor
start script might start the Supervisor slightly differently (e.g. with
RT scheduling support). If the container has already been created, a OS
upgrade or downgrade won't recreate the Supervisor container.
* Move startup script version file to /mnt/data
* Add --cpu-rt-runtime to allow Docker allocate real-time CPU time (#1235)
* Enable Supervisor's CPU bandwith allocation feature (#1235)
Since we have CONFIG_RT_GROUP_SCHED enabled in the Home Assistant OS
kernel the Supervisor needs to enable CPU bandwith allocation for
Add-Ons which need real-time scheduling. Set the appropriate environment
variable.
In case a container image is corrupted `docker inspect` might fail:
# docker inspect --format='{{.Id}}' "${SUPERVISOR_IMAGE}"
Error response from daemon: readlink /mnt/data/docker/overlay2: invalid argument
In that same state the `docker images` command still shows the images.
Since `docker inspect` returns an error SUPERVISOR_IMAGE_ID will be empty
and a simple `docker pull` will be attempted. That does not suffice to
recover from a corrupted container image.
Use `docker images` to get the image ids and make sure to delete all
image ids found by that command.
Also don't use RuntimeDirectory since it deletes the runtime directory
between the service start attempts which defeats the purpose.
* Simplify self healing capabilities of Supervisor service
Instead of relying on time based information on how long the container
has been running use a startup marker file to infer if the last startup
has been successful.
* Update buildroot-external/rootfs-overlay/usr/sbin/hassos-supervisor
Co-authored-by: Pascal Vizeli <pascal.vizeli@syshack.ch>
Co-authored-by: Pascal Vizeli <pascal.vizeli@syshack.ch>