Odi's astoundingly incomplete notes
ipset's hashsize and maxelem parameters
When defining a Linux hash ipset the parameters hashsize and maxelem must be chosen.
maxelem is easy: this limits how many entries the ipset can have.
hashsize however is a tuning parameter. It defines how many hash buckets are allocated for the hashtable. This is the amount of memory that you are willing to sacrifice. It has a very coarse granularity and accepts only values that are equal to 2^n where n is 1..32.
Hashtables are most efficient (buckets mostly contain only a single key, eliminating the search within a bucket) when only 3/4 of their buckets are actually used (1/4 is free). But for large ipsets this is not practical as it would waste a lot of memory. For example for an ipset with 100'000 entries the hashsize should be at least 133'333. The next larger legal value of hashsize is 262'144 which is very wasteful (but fast).
So for such large hashtables we can't really afford to avoid the bucket search. Instead we try to find a balance between the size of a bucket and the number of buckets. If we put 8 entries inside a bucket on average then we get 12'500 buckets. The next legal value for hashsize is 16'384, which in practice gets us about 6 entries per bucket on average. This should yield a reasonable balance between performance and memory use.
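The sizing arithmetic above can be sketched in shell. This is only an illustration: the set name and type in the final (commented-out) command are made up, and creating the set requires root and the ipset tool.

```shell
#!/bin/sh
# Sizing sketch: aim for ~8 entries per bucket for a set of 100000 entries.
maxelem=100000
target=$(( maxelem / 8 ))        # 12500 buckets wanted
hashsize=2
while [ "$hashsize" -lt "$target" ]; do
  hashsize=$(( hashsize * 2 ))   # hashsize must be a power of two
done
echo "$hashsize"                 # prints 16384
# Creating the set then looks like (hypothetical set name, needs root):
# ipset create myset hash:ip hashsize "$hashsize" maxelem "$maxelem"
```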
Java and its use of mmap
These are the syscalls caused by Java's mapped byte buffers:
FileChannel fc = FileChannel.open(f.toPath(), StandardOpenOption.READ, StandardOpenOption.WRITE);
// mmap(NULL, 2147483647, PROT_READ|PROT_WRITE, MAP_SHARED, 4, 0)
MappedByteBuffer buf = fc.map(MapMode.READ_WRITE, 0, Integer.MAX_VALUE);
// madvise(0x7f4294000000, 2147483647, MADV_WILLNEED) = 0
buf.load();

When the buffer is garbage collected the munmap call happens.

How to migrate SonarQube to Postgresql
Perform the migration on an existing SonarQube installation. You cannot do it at the same time as upgrading to a newer SonarQube version!
1. Create an empty Postgresql DB (no password is used here, depending on settings in pg_hba.conf):
psql -U postgres
create user sonar;
create database sonarqube owner sonar;
2. Change the DB connection of the existing SonarQube installation in sonar.properties:
sonar.jdbc.username=sonar
#sonar.jdbc.password=
sonar.jdbc.url=jdbc:postgresql://localhost/sonarqube?currentSchema=public
3. Start up the SonarQube instance so that it creates the DB schema in Postgresql.
4. Shut down the SonarQube instance again for migration.
5. Delete the sonar/data/es6/nodes folder.
6. Run the mysql-migrator utility.
7. Start up the SonarQube instance again.
8. If you want to update to a newer SonarQube version then do that now.
Java and its use of epoll
In case you wonder how Java NIO uses epoll under Linux:
- The Selector allocates an epoll file descriptor and two FDs (a pipe) for timeout/wakeup. Failing to close the Selector will leak those.
- It uses epoll as a level-triggered interface (no EPOLLET)
- It is important to remove a selected key from the Selector's selectedKeys set. Only then will the next select() call reset its readyOps.
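A minimal sketch of such a select loop (not from the original post; the Pipe is only there to provide something readable to select on):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

public class SelectLoop {
    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open();                       // something selectable to demo with
        pipe.source().configureBlocking(false);
        try (Selector sel = Selector.open()) {         // allocates the epoll FD plus the wakeup pipe
            pipe.source().register(sel, SelectionKey.OP_READ);
            pipe.sink().write(ByteBuffer.wrap(new byte[] { 42 }));
            while (sel.select(1000) > 0) {
                Iterator<SelectionKey> it = sel.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();                       // crucial: otherwise readyOps is never reset
                    if (key.isReadable()) {
                        ByteBuffer buf = ByteBuffer.allocate(16);
                        ((ReadableByteChannel) key.channel()).read(buf);
                        System.out.println("read " + buf.position() + " byte(s)");
                        return;                        // closing the Selector releases its FDs
                    }
                }
            }
        }
    }
}
```

Failing to close the Selector (here done by try-with-resources) would leak the three file descriptors mentioned above.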
JDK-1.8 simplifies atomic maximizer
With Java 8 the atomic primitives have gained a very useful function: getAndUpdate(). It takes a lambda function to atomically update the value. This simplifies previously complicated code that used compareAndSet in a loop into a one-liner.
As an example look at a piece of code that is used to keep track of a maximum value.
private AtomicInteger max = new AtomicInteger();

public void oldSample(int v) {
    int old;
    do {
        old = max.get();
    } while (!max.compareAndSet(old, Math.max(old, v)));
}

public void newSample(int v) {
    max.getAndUpdate(old -> Math.max(old, v));
}
Set your HTTP cache headers correctly
I see sites often disable caching of resources completely with really bad headers like:
Cache-Control: no-store, no-cache, must-revalidate
Expires: Wed, 4 Jun 1980 06:02:09 GMT
Pragma: nocache
It makes a lot more sense to let the client cache and tell it to check if the resource has been modified in the meantime. The easiest way to do that is to pass the Last-Modified header together with:

Cache-Control: max-age=0, must-revalidate

This will enable caching in the browser and the browser will request the resource with the If-Modified-Since header. The server will respond with 304 Not Modified if the resource's last-modified date is still the same, saving the transfer. If you need more control over the content of the resource and a last-modified date is not enough or can not easily be given, you can set the ETag header. ETag is a hash or version number of the content and changes as the resource's content changes. But careful: ETag may change with the Content-Encoding (compression). Carefully test that it behaves correctly with your gateway (reverse proxy).

Maybe this practice comes from bad defaults in Apache. I have not seen any default Apache config that sets a sensible Cache-Control. Therefore no header is sent and browsers cache such responses forever; not even clicking the Reload button will fetch them again. This of course makes developers take the simple but radical option of disabling caching.
A much better default for Apache is:
Header set Cache-Control "max-age=0, must-revalidate"
On Gentoo sshd is killed after udev is triggered
After running some updates I noticed that sshd (including active sessions) was sometimes killed. After much debugging I found the reason: udev and cgroups. It looks like udev can send kill signals to all members of its cgroup if it thinks it is running on a systemd system. But on OpenRC systems that just does a lot of harm.
That udev triggering happens for example during:
- grub-install
- startup of qemu with kvm
- /etc/init.d/udev -D restart

The culprit is the -D flag. The flag causes cgroups to not be set, so udev ends up in the main cgroup! Note the absent udev directory under /sys/fs/cgroup/openrc.
This also explains why the problem is fixed by a reboot.
I have filed a bug against OpenRC.
Relaying UDP broadcasts
iptables -t mangle -A INPUT -i eth0 -d 255.255.255.255 -j TEE --gateway 10.1.1.255

The above iptables rule copies broadcast traffic received on the eth0 network interface to another network interface (the one whose broadcast address is 10.1.1.255). Note that this is one-way only. We can't add a second rule for the other direction without creating an infinite packet loop. We need to play tricks with the TTL for that!

Incoming broadcast packets typically have a TTL of 64 or 128. TEE uses the kernel function nf_dup_ipv4() to copy the packet, which already decrements the TTL if the rule is in INPUT or PREROUTING. Note that a packet with TTL=0 will still be accepted by the destination, but will no longer be routed. But TEE itself does not check for TTL=0 and happily copies such packets. So we need to prevent that too, since what we do is effectively routing. The improved rule adds a TTL sanity check:

iptables -t mangle -A INPUT -i eth0 -d 255.255.255.255 -m ttl --ttl-gt 0 -j TEE --gateway 10.1.1.255

If we want to add a rule for the other direction as well...

iptables -t mangle -A INPUT -i eth1 -d 255.255.255.255 -m ttl --ttl-gt 0 -j TEE --gateway 10.1.0.255

...then we easily create a packet loop, since the copy of a packet on eth0 will now also match the rule on eth1. To prevent that we need to ensure that the copied packet has TTL=0. We can do that by simply setting TTL=1 on all incoming broadcasts before passing them to TEE. Then no more loops should occur. The complete rule set for merging a broadcast domain across networks is then:

iptables -t mangle -A INPUT -i eth0 -d 255.255.255.255 -m ttl --ttl-gt 0 -j TTL --ttl-set 1
iptables -t mangle -A INPUT -i eth1 -d 255.255.255.255 -m ttl --ttl-gt 0 -j TTL --ttl-set 1
iptables -t mangle -A INPUT -i eth0 -d 255.255.255.255 -m ttl --ttl-gt 0 -j TEE --gateway 10.1.1.255
iptables -t mangle -A INPUT -i eth1 -d 255.255.255.255 -m ttl --ttl-gt 0 -j TEE --gateway 10.1.0.255

Make sure to monitor your broadcast traffic to detect any misconfiguration after that change:
tcpdump -vnpi eth0 ip broadcast
Slight problem with this: It also rebroadcasts DHCP requests / replies.
I only have one DHCP instance, and the correct exchange almost always wins... but sometimes it doesn't.
Thanks, this was almost exactly what I needed! One thing I didn't quite like was the ttl modifications - it feels a little kludgy. But what I came up with works well, and feels a little better to me.
While we'll see packets we relay onto another subnet, the source address won't match the subnet on that interface. We can then restrict our relaying to packets whose source address matches the interface where we received it:
iptables -t mangle -A INPUT -i eth0 -s 10.1.0.0/24 -d 255.255.255.255 -j TEE --gateway 10.1.1.255
iptables -t mangle -A INPUT -i eth1 -s 10.1.1.0/24 -d 255.255.255.255 -j TEE --gateway 10.1.0.255
preventing CUPS password prompt
Add a Unix group to CUPS in /etc/cups/cups-files.conf:

SystemGroup lpadmin root

Set permissions to access the local secret:

chgrp -R lpadmin /run/cups/certs

This directory contains a secret that is read by CUPS utilities like cupsenable. They pass that secret in HTTP Authorization headers to the local HTTP socket when sending commands.

Add your user to that group:

usermod -a -G lpadmin myuser
Be bloody careful with CNAME records
Be careful when doing something stupid!
CNAME records are useful, especially together with external hosting services.
You run your domain and control DNS yourself, but you host a website on an external service. They manage the IP of that site and give you a name that may look like examplesite383.hostingprovider.biz and already points to the correct IP. You want to map it into your DNS zone with a nice name like www.fancyproduct.com, so creating a CNAME that points the name www to your hosting provider's name is a practical way to go. You also want to do the same without the www prefix, but mind you! Creating a CNAME for fancyproduct.com would redirect the complete zone to a different one!

Never create a CNAME record for the zone name!
Also, when caches start picking up the SOA information of the wrong zone, they apply the TTL values of that zone. So getting control of your zone back may not be easy and is not under your control anymore.

Also, any secondary DNS server will pick up the zone redirect and may completely stop updating the zone from your authoritative server. You need to manually fix that on the secondary DNS!
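In zone-file terms the difference looks like this (names taken from the hypothetical example above; the IP address is a placeholder):

```
; Illustrative zone fragment for fancyproduct.com
www                 IN CNAME examplesite383.hostingprovider.biz.  ; fine: CNAME on a subdomain
; fancyproduct.com. IN CNAME examplesite383.hostingprovider.biz.  ; NEVER: CNAME at the zone name
fancyproduct.com.   IN A     203.0.113.10                         ; use an A record at the apex instead
```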