Docker, kernel namespaces & volumes
Hey guys!
So it's been a while, but I use this as a notepad for ideas worth keeping and for things I struggle with in general, and time is always a factor. Anyhow, I've been playing a lot with Docker lately to set things up on a new server, and Docker 1.10 was recently released with awesome new security features, most notably support for user namespaces, which are one kind of kernel namespace.
For those who aren't aware of kernel namespaces, they aren't a new feature. They've been present in the Linux kernel for a while now and basically (put very simply) they let you run child programs in their own namespace, isolated from the rest of the system. With a user namespace in particular, the UIDs/GIDs a process sees inside the namespace are mapped to different, unprivileged UIDs/GIDs on the host. The advantage is that your child program runs in a kind of sandbox and can't modify your system, because it has no rights to do so on the host.
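If you want to look at such a mapping yourself, the kernel exposes it in /proc/<pid>/uid_map. Each line there holds three numbers: the first ID inside the namespace, the first ID outside it, and how many IDs the range covers. Here's a small sketch that spells a map out in plain words. The function name describe_uid_map is my own, and the sample line is made up for illustration.

```shell
#!/bin/sh
# Sketch: describe a uid_map (read from stdin) in plain words.
# Each input line is: <first ID inside> <first ID outside> <count>
describe_uid_map() {
    while read -r inside outside count; do
        echo "IDs $inside..$(( inside + count - 1 )) inside map to $outside..$(( outside + count - 1 )) outside"
    done
}

# Made-up sample line, similar to what a remapped container could show:
echo "0 100000 65536" | describe_uid_map
# prints: IDs 0..65535 inside map to 100000..165535 outside

# For the current process on a Linux host:
#   describe_uid_map < /proc/self/uid_map
```

The same three-column format is used by /proc/<pid>/gid_map for groups.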
In Docker before 1.10 this was an experimental feature, but it is now officially built in, provided you start the Docker daemon with the --userns-remap flag (for example --userns-remap=default). Bear in mind that when you enable this flag you'll have to re-download your images and recreate your containers. When enabled with the defaults, it remaps every UID/GID used inside containers into a subordinate range (subuid/subgid) starting at 100000. For example, say you have something running as root in the container: from the container's point of view the program runs with UID/GID 0, but from the host's perspective it runs under UID/GID 100000. The same goes for regular users: if you create a user inside your container and it automatically gets UID/GID 1000, it will be mapped to 101000 on the host, and so on and so forth.
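The arithmetic is simply "start of the subordinate range plus the ID inside the container". Here's a rough sketch of it, assuming a single contiguous range. The helper names subid_start and to_host_id are mine, and dockremap is the user I'd expect Docker to set up when the flag is given as --userns-remap=default, so check /etc/subuid and /etc/subgid on your own host before relying on it.

```shell
#!/bin/sh
# Sketch: find where a user's subordinate ID range starts and translate
# a container-side ID into the host-side ID.
# Lines in /etc/subuid and /etc/subgid look like: name:start:count

subid_start() {
    # $1 = user name, $2 = path to a subuid/subgid file
    awk -F: -v user="$1" '$1 == user { print $2; exit }' "$2"
}

to_host_id() {
    # $1 = ID as seen inside the container, $2 = start of the range
    echo $(( $2 + $1 ))
}

# With the 100000 base used in this post:
to_host_id 0 100000      # prints 100000 (root inside the container)
to_host_id 1000 100000   # prints 101000 (first regular user)

# On a real host (assumption: the remap user is called dockremap):
#   base=$(subid_start dockremap /etc/subuid)
#   to_host_id 1008 "$base"
```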
Now things get tricky when you have to mount volumes from the host into the container and need write access to them. I was a bit puzzled by that when I started experimenting and figured I wasn't the only one with this problem. Fortunately, last week I found this post, which was some help: http://stackoverflow.com/questions/35291520/docker-and-userns-remap-how-to-manage-volume-permissions-to-share-data-betwee
In that post, amartynov manually remaps the namespaces used by Docker into the 500000 range, which is fine but not mandatory: you can use Docker's default 100000 range. If you had already created users on your host and mount some of their directories as volumes in your containers, you'll have to change their UID/GID on the host and manually create a matching user in your container to trick the system. For example, say you originally had a host user with UID/GID 1008 and a matching user in your container. With remapping enabled, the container user is mapped to UID/GID 101008 on the host (while the container still sees it as 1008), which means you'll have to change your host user to UID/GID 101008 with the following commands:
usermod -u 101008 foo
groupmod -g 101008 foo
Then, on the host, you'll have to chown the directory you're mounting as a volume to the new UID/GID. After that, your container should have write access to the mounted volume while retaining the security advantage of user namespaces.
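To double-check the ownership before starting the container, something like the sketch below could help. It relies on GNU stat (so Linux), the function name volume_owner is mine, and /srv/foo-data and some-image are purely hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: print the numeric owner of a directory as uid:gid (GNU stat).
volume_owner() {
    # $1 = directory on the host that will be mounted as a volume
    stat -c '%u:%g' "$1"
}

# Hypothetical usage on the host, as root, with the IDs from this post:
#   chown -R 101008:101008 /srv/foo-data
#   [ "$(volume_owner /srv/foo-data)" = "101008:101008" ] && echo "ok to mount"
#   docker run -v /srv/foo-data:/data some-image
```

If the printed owner is still the old 1008:1008, the container user (seen as 101008 on the host) won't be able to write to the directory.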
That's it for now. It was a quick one, and if you have notes or anything you'd like to add or ask, you're welcome to do so in the comments.
Written by Félix Bellanger
1 Response
Thanks for the post! Once you manually change the host user's uid/gid to the mapped uid/gid (in this case 101008), how will this work if I want to run multiple containers on the host?