If a container image is corrupted, `docker inspect` might fail:
# docker inspect --format='{{.Id}}' "${SUPERVISOR_IMAGE}"
Error response from daemon: readlink /mnt/data/docker/overlay2: invalid argument
In that same state, the `docker images` command still lists the images.
Since `docker inspect` returns an error, SUPERVISOR_IMAGE_ID will be
empty and only a simple `docker pull` will be attempted. That does not
suffice to recover from a corrupted container image.
Use `docker images` to get the image IDs and make sure to delete every
image ID found by that command.
Also, don't use RuntimeDirectory, since it deletes the runtime directory
between service start attempts, which defeats the purpose.
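
A minimal sketch of the "delete all image IDs" step, not the actual
script. The function name `remove_all_images` and the overridable
`DOCKER` variable are assumptions introduced for illustration; only
`docker images -q` and `docker rmi -f` are real Docker CLI calls.

```shell
#!/bin/sh
# DOCKER is overridable (assumption) so the logic can be exercised
# without a running daemon.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: remove every image ID listed for a repository.
remove_all_images() {
    repo="$1"
    # `docker images -q` prints one image ID per line; duplicates can
    # appear when several tags point at the same image.
    ids="$("$DOCKER" images -q "$repo" | sort -u)"
    if [ -z "$ids" ]; then
        return 0
    fi
    # Word splitting of the ID list is intended here.
    # shellcheck disable=SC2086
    "$DOCKER" rmi -f $ids
}
```

Calling `remove_all_images "${SUPERVISOR_IMAGE}"` before the next
`docker pull` would then start from a clean state.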
* Simplify self-healing capabilities of Supervisor service
Instead of relying on time-based information about how long the
container has been running, use a startup marker file to infer whether
the last startup was successful.
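
One possible shape of the marker-file approach, as a sketch. The marker
path and all three function names are hypothetical, and so is the
convention that a marker left behind means the previous start never
completed.

```shell
#!/bin/sh
# Hypothetical marker location; the real script may use another path.
STARTUP_MARKER="${STARTUP_MARKER:-/run/hassos-supervisor/startup-marker}"

# Succeeds if a marker from an earlier attempt is still present,
# i.e. the previous start never reached mark_startup_done.
last_startup_failed() {
    [ -f "$STARTUP_MARKER" ]
}

# Call right before launching the container.
mark_startup_begin() {
    mkdir -p "$(dirname "$STARTUP_MARKER")" && : > "$STARTUP_MARKER"
}

# Call once the container is known to have started successfully.
mark_startup_done() {
    rm -f "$STARTUP_MARKER"
}
```

This only works if the marker survives between start attempts, which is
why the directory holding it must not be removed on service restart.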
* Update buildroot-external/rootfs-overlay/usr/sbin/hassos-supervisor
Co-authored-by: Pascal Vizeli <pascal.vizeli@syshack.ch>
* automatically fsck to repair partitions
* add fsck.fat so rpi boot partition can be repaired
* Use Wants= instead of Requires=
Co-authored-by: Pascal Vizeli <pascal.vizeli@syshack.ch>
* add dosfstools to all images
* run hassos-data and hassos-expand after fsck
Co-authored-by: Pascal Vizeli <pascal.vizeli@syshack.ch>
The Docker socket path is /run/docker.sock. Also, only one path can be
used per property. This fixes the supervisor service, which currently
refuses to start due to a missing Docker socket.
The new readline utility used by the CLI add-on requires the size of the
terminal to be set. Use the resize command to initialize the terminal
size on login if we are running on a serial terminal.
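
A sketch of such a login-time check. The helper name `is_serial_tty` and
the list of device name patterns are assumptions; the actual profile
script may detect serial consoles differently.

```shell
#!/bin/sh
# Hypothetical helper: treat common serial device names as serial
# terminals. The pattern list is an assumption, not exhaustive.
is_serial_tty() {
    case "$1" in
        /dev/ttyS*|/dev/ttyAMA*|/dev/ttymxc*|/dev/ttyUSB*) return 0 ;;
        *) return 1 ;;
    esac
}

# On login: initialize the terminal size only on a serial terminal and
# only if the resize command is available.
if is_serial_tty "$(tty)" && command -v resize >/dev/null 2>&1; then
    resize >/dev/null
fi
```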
The hassos-expand script calls sfdisk to find free disk space. It seems
that today it considers the space before the first partition as free:
$ sudo sfdisk -Fq /dev/sdi
Start End Sectors Size
2048 16383 14336 7M
This causes the script to always attempt a resize. It does not seem to
harm the partition table (nothing is actually resized). However, the
call to partx seems to confuse systemd and kill the mnt-data.mount
process (presumably because udev causes remove/add events for the
by-label device units).
Treat free space below 8MiB as not worth a size change. This avoids
misdetection and resize attempts where there is no need.
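
The threshold check can be sketched as follows. The function name
`needs_resize`, the choice of looking at the last free region, and the
512-byte sector size (consistent with 14336 sectors = 7M above) are
assumptions, not necessarily what hassos-expand does.

```shell
#!/bin/sh
# 8 MiB expressed in 512-byte sectors: 8 * 1024 * 1024 / 512 = 16384.
MIN_FREE_SECTORS=16384

# Reads `sfdisk -Fq` style output on stdin (a header line followed by
# "Start End Sectors Size" rows). Succeeds only if the last free region
# listed spans at least MIN_FREE_SECTORS sectors.
needs_resize() {
    awk -v min="$MIN_FREE_SECTORS" '
        NR > 1 && $3 ~ /^[0-9]+$/ { sectors = $3 }
        END { exit !(sectors + 0 >= min) }
    '
}
```

With the output shown above, `sfdisk -Fq /dev/sdi | needs_resize` fails,
so the 7M gap before the first partition no longer triggers a resize.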
dhclient and systemd-journald will be running during shutdown and are
only killed in the final shutdown phase. Unmounting the directories
they use will fail. Use lazy unmounting to fix this.
On systems where ACPI support is present, as indicated by the presence
of /proc/acpi (e.g. on OVA compatible hypervisors), we want to properly
shut down the system when the power button is pressed (or the hypervisor
simulates this kind of event to the guest machine that executes hassos).
This changeset provides the following basic infrastructure for this
feature to work as expected:
* a systemd service to start acpid, if ACPI support can be assumed
* an acpid configuration directory
* a trivial shutdown script to invoke when a PWR event is registered
Working:
* Ethernet
* Resize of Data
* RAUC boot marking/fetching
* CMD Line into HASSOS and Linux
Partially working:
* USB (requires at least one device plugged in at boot. Seems to be a kernel/dt issue.)
Untested:
* RAUC Update
* HDMI
Not working:
* Homeassistant
** We see:
hassio > ha info
The HTTP request failed with the error: Get http://hassio/homeassistant/info: dial tcp 172.30.32.2:80: getsockopt: connection refused