Odi's astoundingly incomplete notes
New entries | Codeipset's hashsize and maxelem parameters
When defining a Linux hash ipset the parameters hashsize and maxelem must be chosen.
maxelem is easy: this limits how many entries the ipset can have.
hashsize however is a tuning parameter. It defines how many hash buckets are allocated for the hashtable. This is the amount of memory that you are willing to sacrifice. It has a very coarse granularity and accepts only values that are equal to 2^n where n is 1..32.
Hashtables are most efficient (buckets mostly contain only a single key, eliminating the search within a bucket) when only 3/4 of their buckets are actually used (1/4 is free). But for large ipsets this is not practical as it would waste a lot of memory. For example for an ipset with 100'000 entries the hashsize should be at least 133'333. The next larger legal value of hashsize is 262'144 which is very wasteful (but fast).
So for such large hashtables we can't really afford to avoid the bucket search. Instead we try to find a balance between the size of a bucket and the number of buckets. If we put 8 entries inside a bucket on average then we get 12'500 buckets. The next legal value for hashsize is 16'384, which gets us 6 entries in average in reality. This should yield acceptable performance vs. small enough space.
maxelem is easy: this limits how many entries the ipset can have.
hashsize however is a tuning parameter. It defines how many hash buckets are allocated for the hashtable. This is the amount of memory that you are willing to sacrifice. It has a very coarse granularity and accepts only values that are equal to 2^n where n is 1..32.
Hashtables are most efficient (buckets mostly contain only a single key, eliminating the search within a bucket) when only 3/4 of their buckets are actually used (1/4 is free). But for large ipsets this is not practical as it would waste a lot of memory. For example for an ipset with 100'000 entries the hashsize should be at least 133'333. The next larger legal value of hashsize is 262'144 which is very wasteful (but fast).
So for such large hashtables we can't really afford to avoid the bucket search. Instead we try to find a balance between the size of a bucket and the number of buckets. If we put 8 entries inside a bucket on average then we get 12'500 buckets. The next legal value for hashsize is 16'384, which gets us 6 entries in average in reality. This should yield acceptable performance vs. small enough space.
Add comment