How I stumbled on an unbootable OS thanks to a symlink
Today a colleague asked me to help him with an unbootable SLES11 system that had just been upgraded to SP3 … At first I thought the problem would be simple to solve, as common initrd problems are usually quite easy to fix, but I learned a lesson …
First I quickly checked /etc/sysconfig/kernel and ran mkinitrd in rescue mode, but nope, the system still refused to boot. How nice. Reading the console output a bit more carefully, I noticed something strange: a couple of ls: not found [No such file or directory] messages. REALLY?
So my next step led back to rescue mode, and the decision was clear: take a look inside the initrd. To my surprise, the ls binary really was (kind of) missing. When I listed the contents of /bin inside the initrd, I found ls -> /bin/ls, a symlink, instead of the regular file lister. Hmmmm, awesome.
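For the curious, here is a rough Python sketch of the check I did by eye: walk one directory of an unpacked initrd and report symlinks that lead nowhere inside the image. It assumes the initrd has already been extracted into some directory (the `root` argument), and the function names `resolve_in_root` and `broken_symlinks` are my own invention, not part of any tool. It only follows symlinks on the last path component, so treat it as an illustration rather than a full path resolver.

```python
import os
import posixpath


def resolve_in_root(root, image_path, max_hops=16):
    """Follow symlinks for image_path as if `root` were the image's "/".

    Returns the final path inside the image, or None if the chain is
    dangling or loops. Only the last path component is followed.
    """
    current = image_path
    for _ in range(max_hops):
        host_path = os.path.join(root, current.lstrip("/"))
        if os.path.islink(host_path):
            target = os.readlink(host_path)
            # An absolute target replaces the path, a relative one is
            # joined to the directory holding the symlink.
            current = posixpath.normpath(
                posixpath.join(posixpath.dirname(current), target)
            )
            continue
        return current if os.path.exists(host_path) else None
    return None  # too many hops, most likely a symlink loop


def broken_symlinks(root, image_dir="/bin"):
    """List (name, target) for symlinks in image_dir that do not resolve."""
    host_dir = os.path.join(root, image_dir.lstrip("/"))
    broken = []
    for name in sorted(os.listdir(host_dir)):
        host_path = os.path.join(host_dir, name)
        image_path = posixpath.join(image_dir, name)
        if os.path.islink(host_path) and resolve_in_root(root, image_path) is None:
            broken.append((name, os.readlink(host_path)))
    return broken
```

Pointed at an unpacked image like mine, `broken_symlinks(root)` would list ls with the target /bin/ls, because inside the initrd that link points straight back at itself.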
But how and why?
After mounting the OS file systems, I found /bin/ls present, but to my surprise there was a shiny little symlink sitting in /usr/bin that looked remarkably like the one from the initrd. Ah, so here you are! I replaced the symlink in /usr/bin with the “real” /bin/ls, recreated the initrd and voilà, the system was up and running again!
Once the system was back, I had time to dig a bit deeper into the mkinitrd logic. The thing is that mkinitrd on SLES11 consists of smaller scripts (modules) that are executed in a certain order. Some of these modules/scripts have special header labels, and one of these labels, called %programs, can hold information about the utilities needed to run that particular script/module (e.g. ls, mount, sed, insmod …). Reading further through one of the modules, called boot-start.sh, I noticed that some utilities in the %programs label are listed with absolute paths and others with relative paths. BINGO!
To cut a long story short, if a utility was listed without its full path and it existed in both /bin and /usr/bin, /usr/bin took precedence, regardless of the fact that a symlink to a non-existent file (ls in my case) is rather useless in an initrd image. :-/
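To make the pitfall concrete, here is a small Python sketch of that kind of lookup. This is not mkinitrd's actual code: the search order in `SEARCH_DIRS` is only my assumption based on the behaviour I observed, and both function names are made up for illustration. The first function takes whatever it finds first, symlink or not; the second shows one possible safeguard, which is to accept only regular files.

```python
import os
import posixpath

# Assumed order, mirroring what I observed: /usr/bin wins over /bin.
SEARCH_DIRS = ("/usr/bin", "/bin")


def first_match(name, root="/", search_dirs=SEARCH_DIRS):
    """Naive lookup: return the first entry found, even if it is a symlink."""
    if name.startswith("/"):
        return name  # absolute paths are taken as given
    for directory in search_dirs:
        host_path = os.path.join(root, directory.lstrip("/"), name)
        if os.path.lexists(host_path):  # true for dangling symlinks too
            return posixpath.join(directory, name)
    return None


def first_regular_file(name, root="/", search_dirs=SEARCH_DIRS):
    """One possible safeguard: skip symlinks and accept only regular files."""
    if name.startswith("/"):
        return name
    for directory in search_dirs:
        host_path = os.path.join(root, directory.lstrip("/"), name)
        if os.path.isfile(host_path) and not os.path.islink(host_path):
            return posixpath.join(directory, name)
    return None
```

On a system like the broken one, with /usr/bin/ls being a symlink and /bin/ls the real binary, `first_match("ls")` would pick /usr/bin/ls while `first_regular_file("ls")` would pick /bin/ls. Skipping symlinks entirely is a blunt policy, of course; it is here only to show where the two lookups diverge.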
At first I thought this was certainly a bug, but after thinking about it for a while, I'm not so sure anymore. Normally lusers (ordinary users) are not allowed to create symlinks inside /usr/bin, as this is a restricted area. As root you should really know your system and its pitfalls (with great power comes great responsibility). So take this short article as a warning that even an operation as trivial as creating a symlink can make your system unbootable and you unhappy.
But after all, every day you learn something new is a good day!!!