Using bitmasks to store settings
We often need to store several booleans that make up a user's settings. Bitmasks are an economical way to do so. The idea is to use a single integer where each bit represents one such boolean. This is more compact than saving each individual setting in a different database column.
Choose a bit position for each setting. Let's say we want to store notification settings.
NEWSLETTER = 0
COMMENTS = 1
FOLLOWER = 2
Now let's turn off all settings:
user.settings = 0
Now let's turn on comment notifications:
user.settings |= (1 << COMMENTS)
Let's see what it looks like in binary:
user.settings.to_s(2)
>>> "10"
The bit at position 1 (COMMENTS) was set to 1 while the rest hasn't changed.
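As a minimal sketch of the steps so far, the snippet below uses a plain local variable in place of user.settings and a hypothetical helper, to_bits, that pads the binary string so all three positions are visible. Neither is part of the original post.

```ruby
# Bit positions from the article.
NEWSLETTER = 0
COMMENTS   = 1
FOLLOWER   = 2

# Hypothetical helper: binary string padded with zeros to a fixed width.
def to_bits(value, width = 3)
  value.to_s(2).rjust(width, "0")
end

settings = 0                   # stands in for user.settings; everything off
settings |= (1 << COMMENTS)    # turn comment notifications on

# Leftmost digit is FOLLOWER, rightmost is NEWSLETTER.
puts to_bits(settings)         # prints "010"
```

Padding is only for display; the stored integer is still 2.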
The operation 1 << N places (shifts) the value 1 to the position N so that e.g.:
(1 << 4).to_s(2)
>>> "10000"
Now | and & are the usual OR and AND operators but applied at a bit level so that:
(1 << 4 | 1 << 2).to_s(2)
>>> "10100"
We see that a |= 1 << N turns ON the N-positioned bit in the integer a.
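Since | combines masks, several settings can be turned on in one step. Here is a small sketch of that idea; mask_for is a hypothetical helper name, not something from the original post.

```ruby
NEWSLETTER = 0
COMMENTS   = 1
FOLLOWER   = 2

# Hypothetical helper: OR together the masks for any number of bit positions.
def mask_for(*positions)
  positions.reduce(0) { |mask, position| mask | (1 << position) }
end

settings = 0
settings |= mask_for(NEWSLETTER, FOLLOWER)   # turn both on at once
puts settings.to_s(2)                        # prints "101"

settings |= mask_for(FOLLOWER)               # already on, so nothing changes
puts settings.to_s(2)                        # prints "101"
```

Turning ON a bit that is already ON is harmless, which is why there is no need to check the current value first.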
The operator ~ inverts all the bits in a variable so that:
(~0b1010).to_s(2)
>>> "-1011"
Ruby prints a negative integer as a minus sign followed by its magnitude, so this is -11. In two's complement, -11 has all bits set to 1 except bit 1 and bit 3.
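To actually see the inverted bit pattern, one option is to keep only the low bits of the result before printing. A quick sketch; the width of 8 bits is an arbitrary choice for display, not something the original post specifies.

```ruby
value    = 0b1010
inverted = ~value                 # -11

puts inverted.to_s(2)             # prints "-1011" (minus sign plus magnitude)

# Keep only the low 8 bits to see the two's-complement pattern.
low_byte = inverted & 0xFF
puts low_byte.to_s(2)             # prints "11110101"
```

Reading from the right, bits 1 and 3 are 0 and every other bit is 1.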
To turn a bit OFF, simply do
user.settings &= ~(1 << COMMENTS)
since ~(1 << COMMENTS) will have all bits ON except the one at position COMMENTS.
To check if a specific setting is ON, simply check
user.settings & (1 << COMMENTS) > 0
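The three operations (turn ON, turn OFF, check) could be wrapped up as follows. This is only a sketch: the module name SettingsBits and its method names are made up for illustration, and it works on plain integers rather than a real user record.

```ruby
# Hypothetical wrapper around the three bit operations from the article.
module SettingsBits
  NEWSLETTER = 0
  COMMENTS   = 1
  FOLLOWER   = 2

  module_function

  # Returns settings with the bit at position turned ON.
  def enable(settings, position)
    settings | (1 << position)
  end

  # Returns settings with the bit at position turned OFF.
  def disable(settings, position)
    settings & ~(1 << position)
  end

  # True if the bit at position is ON.
  def enabled?(settings, position)
    (settings & (1 << position)) > 0
  end
end

settings = 0
settings = SettingsBits.enable(settings, SettingsBits::COMMENTS)
puts SettingsBits.enabled?(settings, SettingsBits::COMMENTS)    # prints "true"
puts SettingsBits.enabled?(settings, SettingsBits::FOLLOWER)    # prints "false"

settings = SettingsBits.disable(settings, SettingsBits::COMMENTS)
puts SettingsBits.enabled?(settings, SettingsBits::COMMENTS)    # prints "false"
```

For a non-negative integer, Ruby's built-in Integer#[] reads a single bit as well, so settings[SettingsBits::COMMENTS] == 1 is an equivalent check.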
Written by Emmanuel Turlay
14 Responses
This is great. A potential downside is that you kind of lose the ability to query and report on those settings. "Show me users that have disabled followers" becomes a bit (ha) more difficult.
That's a fair point.
If I'm not mistaken, all flavours of SQL support bitwise operations, don't they? Hence one could query with SELECT * FROM users WHERE settings & (1 << 3) > 0.
Isn't that right?
@neutralino1, that is indeed THE solution for this. I love bitmasks too, they certainly make some tasks a lot easier (and save a lot of extra columns in your database).
BUT, you always have to be careful when using them, and not just throw everything in there :P
Indeed, using bitmasks requires some serious testing.
And you sure can't use it for everything. In my case, I use it for a stupid collection of checkboxes.
This would have been the only way devs stored things years back, when memory was an issue. Wouldn't it be nice to see how fast modern code could be if time was spent on speeding it up?
@encodes Indeed.
I actually learned this technique while coding for the ATLAS experiment at CERN which contains thousands of electronic chips where both physical and logical space are limited resources.
@neutralino1 Yeah, it's definitely an underused technique. I used a similar technique on a micro-controller for a burglar alarm. With 8 zones, we need just 3 bytes to control whether they are enabled, alarmed and triggered. Then it's just a matter of setting up alarms/switches and LEDs to trigger these points.
Bitwise operators are just not taught in the mainstream anymore.
@mikeymike I think you will appreciate this one.
@encodes that's pretty neat!
bitmaps in ruby? you gotta be kidding
@sheerun Why?
- If you're using bitmaps in memory, you need to allocate more memory for the objects that work with them than they save you in space.
- If you're using bitmaps in the database, you cannot work with this data in any sensible way (no indexes, no clean queries), so you use the bitfield just as a crippled storage format. Moreover, each database has its own performant and indexable structures to store such data.
Not to mention I'd never sacrifice system and code readability to save a few bytes of space per record. Sounds like premature optimisation, eh?
In Ruby you get none of performance, code readability, lower storage/memory space, data normalisation, or a good query interface. There is literally no advantage.
@sheerun I have a table dedicated to user settings with dozens of boolean columns and as many rows as users. That freaks me out. I'd much rather have a single integer field on the users table.
I never need to query on them, I only need to check a given record's settings.
Now to be clear, I never said people should use this technique blindly. Everyone should be sensible about how it fits their requirements. Just like everything else.
Also, I used Ruby here to illustrate the point, but the post was mostly intended to showcase the technique rather than the language used to apply it.
Does endianness become a problem with this kind of approach?
When you have 65+ roles, this approach breaks down, because the largest integer type MySQL offers is 8 bytes (the BIGINT data type), and 64 bits / 8 = 8 bytes. How can this be fixed?