Upgrading unipi-kernel-modules-dkms to 1.54 causes infinite reboot loop



  • I traced the issue to the file /etc/initramfs-tools/scripts/init-bottom/setbootconfig.sh. Trace of a run that ends in a reboot:

    + grep -q 'Hardware[[:blank:]]*:[[:blank:]]*BCM' /proc/cpuinfo
    + unset IS_REAL_SYSTEM
    + unset DO_MOUNT
    + '['  '=' --noreboot -o -n  ]
    + IS_REAL_SYSTEM=1
    + DO_MOUNT=1
    + MNTDIR=/tmp/boot
    + '[' 1 '=' 1 ]
    + '[' -d /sys/firmware/devicetree/base/soc/i2c@7e804000/mcp7941x@6f ]
    + RTC=1
    + '[' -d /sys/firmware/devicetree/base/soc/spi@7e204000/neuronspi@0 ]
    + NEURONDRV=1
    + '[' -d /sys/firmware/devicetree/base/soc/i2c@7e804000/24c01@57 ]
    + EEPROM_1=1
    + '[' -d /sys/firmware/devicetree/base/soc/i2c@7e804000/24c02@50 ]
    + EEPROM_2=1
    + '[' -n 1 -a -n 1 ]
    + EE_CONFLICT=1
    + grep -q okay /sys/firmware/devicetree/base/soc/spi@7e204000/status
    + SPI=1
    + grep -q okay /sys/firmware/devicetree/base/soc/i2c@7e804000/status
    + I2C=1
    + '[' -n 1 -a -n 1 -a -z 1 ]
    + '[' 1 '=' 1 ]
    + mkdir -p /tmp/boot
    + mount /dev/mmcblk0p1 /tmp/boot
    + echo 'dtparam=i2c_arm=on'
    + echo 'dtoverlay=i2c-rtc,mcp7941x'
    + '['  '=' 1 ]
    + echo 'dtoverlay=neuron-spi-new'
    + rm -f /tmp/boot/config-unipi.inc
    + sed -i 's/include config-unipi\.inc/include config_unipi.inc/' /tmp/boot/config.txt
    + INCLUDE='include config_unipi.inc'
    + grep -q -e '^[[:blank:]]*include config_unipi.inc' /tmp/boot/config.txt
    + '[' 1 '=' 1 ]
    + umount /tmp/boot
    + rmdir /tmp/boot
    + sync
    + echo REBOOT
    REBOOT
    + exit 0
    done.
    

    Both /sys/firmware/devicetree/base/soc/i2c@7e804000/24c02@50 and /sys/firmware/devicetree/base/soc/i2c@7e804000/24c01@57 exist on my system, which sets the EE_CONFLICT flag. If I comment out the line that sets the flag and reconfigure the package, everything seems to work:

    + grep -q 'Hardware[[:blank:]]*:[[:blank:]]*BCM' /proc/cpuinfo
    + unset IS_REAL_SYSTEM
    + unset DO_MOUNT
    + '['  '=' --noreboot -o -n  ]
    + IS_REAL_SYSTEM=1
    + DO_MOUNT=1
    + MNTDIR=/tmp/boot
    + '[' 1 '=' 1 ]
    + '[' -d /sys/firmware/devicetree/base/soc/i2c@7e804000/mcp7941x@6f ]
    + RTC=1
    + '[' -d /sys/firmware/devicetree/base/soc/spi@7e204000/neuronspi@0 ]
    + NEURONDRV=1
    + '[' -d /sys/firmware/devicetree/base/soc/i2c@7e804000/24c01@57 ]
    + EEPROM_1=1
    + '[' -d /sys/firmware/devicetree/base/soc/i2c@7e804000/24c02@50 ]
    + EEPROM_2=1
    + grep -q okay /sys/firmware/devicetree/base/soc/spi@7e204000/status
    + SPI=1
    + grep -q okay /sys/firmware/devicetree/base/soc/i2c@7e804000/status
    + I2C=1
    + '[' -n 1 -a -n 1 -a -z  ]
    + echo 24c02 0x50
    sh: write error: Device or resource busy
    + '[' -f /sys/bus/i2c/devices/1-0050/eeprom ]
    + echo 0x50
    sh: write error: No such file or directory
    + echo 24c01 0x57
    sh: write error: Device or resource busy
    + '[' -f /sys/bus/i2c/devices/1-0057/eeprom ]
    + IS_NEURON=1
    + '[' '(' -n 1 -o -n  ')' -a -n 1 -a -n 1 ]
    + '[' -n  ]
    + '[' -n 1 ]
    + echo OK
    OK
    + exit 0
    done.
    

    Is there a particular reason for this "conflict" test? Also, rebooting after applying the configs seems a bit extreme: if anything goes wrong and the system enters an infinite reboot loop, it is a pain to debug and fix.
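
For reference, the conflict test in the first trace appears to reduce to the check below. This is a sketch reconstructed from the `set -x` output, not the script's actual source. The function name `detect_ee_conflict` and its optional device-tree base argument are hypothetical, added so the logic can be tried against a test directory.

```shell
#!/bin/sh
# Sketch of the EEPROM conflict test as it appears in the trace.
# detect_ee_conflict [DT_BASE]
#   DT_BASE is a hypothetical parameter; it defaults to the real device tree.
# Sets EEPROM_1, EEPROM_2 and EE_CONFLICT like the traced script seems to,
# and returns 0 when both EEPROM nodes are present (the "conflict" case).
detect_ee_conflict() {
    dt_base="${1:-/sys/firmware/devicetree/base}"
    i2c_node="$dt_base/soc/i2c@7e804000"
    unset EEPROM_1 EEPROM_2 EE_CONFLICT

    if [ -d "$i2c_node/24c01@57" ]; then
        EEPROM_1=1
    fi
    if [ -d "$i2c_node/24c02@50" ]; then
        EEPROM_2=1
    fi
    # Both nodes present at once is what sets the flag in the first trace.
    if [ -n "$EEPROM_1" ] && [ -n "$EEPROM_2" ]; then
        EE_CONFLICT=1
        return 0
    fi
    return 1
}
```

Calling `detect_ee_conflict` with no argument on the affected system should succeed, which matches the `EE_CONFLICT=1` line in the first trace.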


  • administrators

    Hello @Achilleas-Kotsis,

    thank you for the detailed description of the issue.

    It seems that there is a conflict between the old approach (enabling both EEPROMs in the device tree through overlays) and the new one (enabling the EEPROM in the initrd).

    Can you please post the contents of the files /boot/config_unipi.inc and /boot/config.txt (both located on the FAT32 boot partition)?

    The main goal of the dynamic EEPROM management is to keep only one EEPROM enabled, depending on the platform, instead of hard-coding this into the device tree.
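
The probing steps visible in the second trace (`echo 24c02 0x50`, the test for an `eeprom` file, then `echo 0x50`) look like the standard Linux I2C sysfs `new_device` / `delete_device` interface. A minimal sketch of that pattern follows. The redirect targets are not shown by `set -x`, so they are an assumption here, and the function name `probe_eeprom` and its arguments are hypothetical.

```shell
#!/bin/sh
# Sketch: instantiate an EEPROM on an I2C adapter and keep it only if it
# really shows up. Assumes the kernel's new_device/delete_device sysfs files.
# probe_eeprom ADAPTER_DIR CHIP ADDR DEVICE_DIR
#   e.g. probe_eeprom /sys/bus/i2c/devices/i2c-1 24c02 0x50 \
#                     /sys/bus/i2c/devices/1-0050
probe_eeprom() {
    adapter_dir="$1"
    chip="$2"
    addr="$3"
    device_dir="$4"

    # Ask the kernel to bind CHIP at ADDR. This may fail with
    # "Device or resource busy" if an overlay already created the device.
    echo "$chip $addr" 2>/dev/null > "$adapter_dir/new_device" || true

    # A readable eeprom attribute means the chip is actually there.
    if [ -f "$device_dir/eeprom" ]; then
        return 0
    fi

    # Nothing answered at that address: remove the device again.
    echo "$addr" 2>/dev/null > "$adapter_dir/delete_device" || true
    return 1
}
```

Probing one address and falling back to the other only on failure would leave a single EEPROM enabled, which is the goal described above.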



  • root@officeplc:~# cat /boot/config.txt | grep -v ^# | grep -v ^$
    include config_initrd.inc
    include config_unipi.inc
    disable_overscan=1
    hdmi_force_hotplug=1
    dtparam=audio=on
    hdmi_ignore_cec_init=1
    gpu_mem=16
    hdmi_audio_config=0x200 
    root@officeplc:~# cat /boot/config_initrd.inc 
    [pi1]
    initramfs initrd.img-5.4.72+ followkernel
    [pi2]
    initramfs initrd.img-5.4.72-v7+ followkernel
    [pi3]
    initramfs initrd.img-5.4.72-v7+ followkernel
    [pi4]
    initramfs initrd.img-5.4.72-v7l+ followkernel
    [all]
    root@officeplc:~# cat /boot/config_unipi.inc 
    dtparam=i2c_arm=on
    dtoverlay=neuronee
    dtoverlay=i2c-rtc,mcp7941x
    dtoverlay=unipiee
    dtoverlay=neuron-spi-new
    root@officeplc:~#
    

    I guess some of those overlays are leftovers and should be disabled...
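
A quick way to spot such leftovers is to grep the include file for the two EEPROM overlays. This is only a sketch: the function name `find_ee_overlays` is made up, and the default path is the one shown in the listing above.

```shell
#!/bin/sh
# Sketch: print any dtoverlay lines that enable the neuronee or unipiee
# overlay. Returns non-zero when neither is present.
# find_ee_overlays [INC_FILE]
find_ee_overlays() {
    inc_file="${1:-/boot/config_unipi.inc}"
    # Match the overlay name exactly, optionally followed by parameters.
    grep -E '^[[:blank:]]*dtoverlay=(neuronee|unipiee)([[:blank:],]|$)' "$inc_file"
}
```

Run against the file above, `find_ee_overlays` would print the `dtoverlay=neuronee` and `dtoverlay=unipiee` lines and leave `neuron-spi-new` alone.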


  • administrators

    Hello @Achilleas-Kotsis,

    you are right: neither the neuronee nor the unipiee overlay should be enabled. Can you please tell us which image you are upgrading from? We will try to reproduce the issue.



  • OK, it seems that the issue was some weird corruption of the filesystem on the boot partition that neither a Windows disk scan nor Linux fsck could detect.
    Your initramfs script was replacing the file config_unipi.inc with the new version, which does not contain unipiee and neuronee, but the change did not stick and the file stayed as it was.

    Even more weirdly, I changed the file by hand using vi and rebooted. The file appeared to be changed, but the system STILL loaded the unipiee and neuronee overlays on boot, which again caused an infinite reboot loop.

    I removed the file completely and recreated it, and now the system seems to be working properly. Thanks for your help in pinpointing the issue.


  • administrators

    Hello @Achilleas-Kotsis,

    I have tried upgrading the system from a few older images (also with DKMS drivers), and this kind of behaviour did not occur. The issue seems to be tied to the filesystem write operation and the following sync (which should be triggered by the umount command in the script).

    Moreover, the .inc file should already have been updated earlier in the upgrade process (as a step of the unipi-common package postinst script).

    Can you retry the whole process with a new uSD card? This kind of problem sometimes occurs on corrupted cards, where the write operation is in fact not fully synced.
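
One way to check whether an edit to a file on the boot partition really survives a reboot is to record a checksum before rebooting and compare it afterwards. The sketch below assumes `sha256sum` is available and that the checksum file is kept somewhere other than the suspect partition. The helper names `save_sum` and `check_sum` are hypothetical.

```shell
#!/bin/sh
# Sketch: detect writes that silently fail to persist.
# save_sum FILE SUMFILE
#   Record the SHA-256 of FILE in SUMFILE. Use an absolute path for FILE,
#   because the path is stored as given and reused by check_sum.
save_sum() {
    sha256sum "$1" > "$2"
}

# check_sum SUMFILE
#   Succeed only if the recorded file still has the recorded checksum.
check_sum() {
    sha256sum -c "$1" >/dev/null 2>&1
}
```

For example, `save_sum /boot/config_unipi.inc /root/config_unipi.sha256` followed by `sync` before the reboot, then `check_sum /root/config_unipi.sha256` afterwards. A failing check after an otherwise clean reboot would point at the card or filesystem rather than at the script.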

