
Unplanned outage on June 03

On June 03, 2020, at around 16:15 local time (14:15 UTC), disk accesses on ftp.fau.de became extremely slow. Less than an hour later, all attempts to access the disks holding the FTP data failed. Investigation revealed that the big RAID controller that manages all the external disk enclosures for the data had stopped responding completely.
While the failed controller could temporarily be brought back by a power cycle at around 18:00 local time (16:00 UTC), it failed again within 10 minutes of booting the machine.
Unfortunately, there was no compatible replacement for the failed controller on site. A replacement has been ordered and shipped, but has not arrived yet. As all parcel services are severely overloaded due to the Corona crisis, it is currently unclear when it will arrive.

We were able to bring ftp.fau.de partially back after noon on June 04: it seems the broken controller does not crash as long as it is not put under too much load. We have therefore had to disable automatic updates of all mirrors for now; they will mostly remain at the state they had at around 2020-06-03 14:30 UTC. We have, however, updated a few selected mirrors manually.

The controller is also significantly slower than normal, even though it now has a significantly lower workload than usual. This is mostly because we have disabled all write caching on it, which slows the throughput it can achieve to a crawl. Many accesses can still be handled by the big SSD that serves as a cache (and is working perfectly fine), but accesses that miss the cache and have to fetch data from the spinning hard disks will be significantly slower than usual.

We are sorry for the inconvenience and are trying our best to return to full service ASAP.

We will update this article as needed.

Update 1 @2020-06-06 08:30: Parcel tracking now says that our delivery has arrived in our city and will be delivered to us on Monday, so we expect to be back in business by Monday evening.

Update 2 @2020-06-07 08:00: While we are not back to our usual sync schedules yet, all mirrors should be updated at least once a day.

Update 3 @2020-06-08 14:00: The replacement controller has arrived.

Update 4 @2020-06-08 22:00: The replacement controller is working fine. All mirrors are current again, and normal update intervals have been resumed.
While we were working on the machine anyway, we also upgraded the main memory from 64 to 128 GB.

Mirrors generating most traffic in 2019

The total traffic of ftp.fau.de increased from 3.22 PB in 2018 to 3.52 PB in 2019, an unusually small increase of only 9%.

Rank | Mirror | Traffic 2019 in TB | Rank / Traffic 2018 (for comparison)
1 | qtproject (Qt Toolkit) | 511 | 2 / 445
2 | kiwix (offline Wikipedia) | 458 | 1 / 476
3 | mint/iso | 260 | 5 / 255
4 | lineageos (Free and Open-Source Android distribution) | 246 | 3 / 327
5 | eclipse | 162 | 4 / 266
6 | opensuse | 138 | 7 / 119
7 | fedora | 136 | 6 / 137
8 | cdn.media.ccc.de (Talk recordings from CCC and related conferences) | 120 | – / 94
9 | centos | 117 | 9 / 109
10 | osmc (Open Source Media Center) | 112 | 8 / 113

The percentage of IPv6 traffic increased again, to 25% (0.87 PB), after 22% in 2018, 22% in 2017, 14% in 2016 and 11% in 2015.

There is little change in the Top 10 overall: traffic volumes stayed pretty much the same, and some mirrors merely swapped places in the table.

Our mirror of cdn.media.ccc.de managed to reach the Traffic Top 10 again, but it was a close call. This mirror makes most of its yearly traffic at the end of December / beginning of January, when the recordings of the annual Chaos Communication Congress are put online. On the other hand, the CTAN mirror that was at rank 10 last year did not make the Top 10 this year, missing it by a few TB with 106 TB.

A notable new entry that would be at rank 11 is F-Droid, a community-maintained Android software repository hosting only free/libre software (sort of an alternative to Google’s “Play Store”). We started mirroring it at the end of 2018, but they only recently added functionality that makes the clients use the mirrors more often. As a result, this mirror has seen a lot more usage in recent months and is likely to reach the Top 10 in 2020.

Mirrors generating most traffic in 2018

The total traffic of ftp.fau.de increased from 2.51 PB in 2017 to 3.22 PB in 2018, a 28% increase.

Rank | Mirror | Traffic 2018 in TB | Rank / Traffic 2017 (for comparison)
1 | kiwix (offline Wikipedia) | 476 | 1 / 371
2 | qtproject (Qt Toolkit) | 445 | 2 / 274
3 | lineageos (Free and Open-Source Android distribution) | 327 | 3 / 203
4 | eclipse | 266 | 7 / 147
5 | mint/iso | 255 | 4 / 190
6 | fedora | 137 | 8 / 103
7 | opensuse | 119 | 5 / 152
8 | osmc (Open Source Media Center) | 113 | 6 / 149
9 | centos | 109 | 11 / 69
10 | ctan (Comprehensive TeX Archive Network) | 95 | 10 / 93

Even though the absolute amount of IPv6 traffic increased a bit, its percentage of all traffic stagnated, with 22% (0.72 PB) in 2018, after 22% in 2017, 14% in 2016 and 11% in 2015.

Our mirror of cdn.media.ccc.de is no longer in the Traffic Top 10; it only ranked in 11th place with a measly 94 TB, a large part of that generated at the end of December / beginning of January, when the recordings of the annual Chaos Communication Congress are put online.

Downtime on January 12

We had a little downtime today between around 12:10 and 13:20. About 5 minutes of this downtime were planned – we simply wanted to reboot the machine. Unfortunately, the machine did not come back up after the reboot, and it took a while to figure out what the problem was.

As it turns out, the machine was unable to mount the huge volume with all our mirrors on it during boot. Manually mounting the disk failed as well, because the device just wasn’t there. In lvdisplay the volume was listed as unavailable, and trying to set it available with lvchange failed with the message:
    /usr/sbin/cache_check: execvp failed: No such file or directory
    Check of pool bigdata/ftpcachedata failed (status:2). Manual repair required!

As written in a previous post, we nowadays have a cache SSD in ftp.fau.de. As it turns out, you can create a cached volume and use it without any issues at all – until you try to reboot, because then LVM suddenly decides it needs a cache_check binary that, naturally, isn’t shipped in the LVM package. Of course, this is not a new problem: there has been a bug report about it in Debian since 2014. And of course, slightly more than 3 years later, the problem still isn’t fixed (e.g. by checking whether the cache_check binary is available when a cached volume is created). The missing binary is in the thin-provisioning-tools package, which, to maximise confusion, belongs to LVM but doesn’t have LVM anywhere in its name. I also wouldn’t exactly associate caching with thin provisioning of volumes, but maybe that’s just me. The LVM2 package does not depend on thin-provisioning-tools; it only “suggests” it, so it doesn’t get installed automatically in any sane APT config for servers.

So once the problem was clear, it was at least easy to fix: We installed the missing package, rebooted, and ftp.fau.de was back in action.
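
For reference, the fix amounts to something like the following (a minimal sketch: the volume group name is taken from the error message above, the mount point is a hypothetical example):

    # Install the package that ships /usr/sbin/cache_check
    apt-get install thin-provisioning-tools

    # Activate all logical volumes in the affected volume group again ...
    vgchange -ay bigdata

    # ... and remount the mirror volume (mount point is an example)
    mount /srv/ftp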

Experiments with LVM-cache

Recently, we’ve frequently reached the I/O capacity of our RAID array during peak hours, meaning that downloads were increasingly often limited not by network speed, but by how fast our disks could deliver the data.

While we use a hardware RAID6 that delivers pretty decent read speeds, we’re still using traditional magnetic hard drives, not SSDs. Such arrays easily deliver more than a gigabyte per second when reading sequentially, but this performance drops rapidly the more random the I/O gets, i.e. the more different files are requested at the same time. Of course, due to the nature of our FTP, which provides mirrors for a whole bunch of different projects, we do get a large amount of random I/O.

One solution to improve performance in this situation is to use an additional SSD cache that holds the most frequently requested files or disk blocks. Most storage vendors implement something like that in their storage arrays nowadays, although it’s usually an optional and not exactly cheap feature – you usually pay a hefty license fee for it, and then you also have to buy prohibitively expensive SSDs from that vendor to make any use of it. The better (and cheaper) alternative is to use a software solution that does the same thing. There are several implementations of SSD caching on Linux.

The SSD-caching implementation we chose is lvmcache. As the name suggests, lvmcache is integrated with the Linux Logical Volume Manager (LVM), which we use for managing the space on the big RAID arrays anyway. The SSD is simply attached to a normal logical volume; lvmcache then keeps track of which sectors of that volume are used most often and serves those from the cache SSD.

Two basic modes are available: write-back and write-through. Write-back writes blocks to the SSD cache first and only syncs them to the disks some time later. While this increases write speed, it has the drawback that the caching SSD becomes vital – if it dies, the data on the disks is left in an inconsistent and possibly irreparable state. To avoid data loss, some RAID level for the SSD would be required in this mode. However, we don’t really care about write speed on the FTP – the only writes it sees are the mirror updates, and those are few; more than 99% of all I/O is read requests from clients fetching mirrored files. We therefore use write-through mode instead: all writes go to the underlying disks immediately, and the SSD is only used for read caching. If the SSD dies, you lose the caching, but your data is still safe.
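
Setting this up is done entirely with standard LVM commands. The following is a minimal sketch rather than our exact commands – the device, volume group, and volume names (/dev/nvme0n1, bigdata, ftpdata) and the pool size are assumptions for illustration:

    # Add the cache SSD to the volume group that holds the mirror volume
    pvcreate /dev/nvme0n1
    vgextend bigdata /dev/nvme0n1

    # Create a cache pool on the SSD and attach it to the data volume.
    # --cachemode writethrough means the SSD only ever serves reads;
    # writes always go to the underlying disks immediately.
    lvcreate --type cache-pool -L 900G -n ftpcache bigdata /dev/nvme0n1
    lvconvert --type cache --cachepool bigdata/ftpcache --cachemode writethrough bigdata/ftpdata

Should the SSD ever have to be removed again, reasonably recent LVM versions can detach the cache with “lvconvert --uncache bigdata/ftpdata” without touching the data.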

For testing, we borrowed a 1 TB Intel SSD. After one week of testing, we are impressed by the results. The following is a graph from our Munin monitoring showing the utilization of the devices:

As you can see, we introduced the disk cache (nvme0n1) on the 7th. After about one day, the cache had filled up and was serving the bulk of requests. As a result, the utilization of the disk arrays (sdb, sdc) dropped rapidly, from “practically always 100%” to “30-50%” during peak hours.

If long-term performance is as good as these first test results suggest, we will permanently equip ftp.fau.de with an SSD for caching to allow faster downloads for you.

Mirrors generating most traffic in 2016

It seems I haven’t posted the traditional annual mirror stats for 2016 yet. Well, let’s fix that: here are the most used mirrors on ftp.fau.de in 2016:

Rank | Mirror | Traffic 2016 in TB | Rank / Traffic 2015 (for comparison)
1 | qtproject (Qt Toolkit) | 264 | 1 / 199
2 | kiwix (offline Wikipedia) | 262 | 2 / 179
3 | osmc (Open Source Media Center) | 172 | – / 35
4 | opensuse | 145 | 4 / 139
5 | mint | 134 | 3 / 179
6 | eclipse | 131 | 6 / 97
7 | fedora | 115 | 5 / 100
8 | cdn.media.ccc.de (Talk recordings from CCC conferences) | 91 | 7 / 83
9 | ctan (Comprehensive TeX Archive Network) | 76 | 8 / 63
10 | tdf (The Document Foundation – LibreOffice) | 58 | 9 / 48

Comparing this list to last year’s, the first thing one notices is the new entry at rank 3: OSMC generates a steady amount of traffic, with visible peaks whenever they do a release. It was not in the 2015 Top 10 because we only started mirroring it in Q4 of 2015.

Linux Mint dropped from rank 3 last year to rank 5 this year, with much less traffic than last year – and that despite a slightly different way of counting that should actually have increased their numbers: we are now summing up the two parts of the mirror, the ISOs and the packages. Not because we want to, but for technical reasons – we cannot always distinguish between the two in the stats, and not summing them up would make the stats even more wrong. Judging from the stats, we must have been dropped from their mirror list for ISO downloads around June 2016: from then to the end of the year, almost no requests for the Mint ISOs hit our server. As to why we were dropped, we haven’t got the slightest clue – we got no notification about the removal. We did get a notice about being re-added at the beginning of 2017, though.

Last year’s rank 10, videolan, dropped off the list – it would be at rank 13 this year.

Ranks 1 and 2, Qt and Kiwix, finished in a really close head-to-head race.

The other mirrors swapped positions here and there, and all of them generated a little more traffic than the year before, but there were no big changes.

Let’s take a look at IPv6 traffic only:

Rank | Mirror | IPv6 Traffic 2016 in TB | Rank / Traffic 2015 (for comparison)
1 | kiwix (offline Wikipedia) | 34.6 | 3 / 14.5
2 | cdn.media.ccc.de (Talk recordings from CCC conferences) | 27.6 | 2 / 15.2
3 | mint | 20.8 | 1 / 22.6
4 | qtproject (Qt Toolkit) | 19.1 | 5 / 11.5
5 | opensuse | 18.4 | 6 / 10.8
6 | debian | 17.3 | 7 / 9.3
7 | fedora | 13.0 | 4 / 11.5
8 | pclinuxos | 12.9 | – / 0.7
9 | ubuntu | 12.6 | – / 3.2
10 | tdf (The Document Foundation – LibreOffice) | 12.3 | 8 / 8.1

With the exception of Linux Mint, where I’ve already explained the reason above, all mirrors had more IPv6 traffic, sometimes significantly more.

This is also visible in the total IPv6 traffic over all mirrors: 13.68% of all traffic in 2016 was IPv6, up from 10.54% in 2015.

There are still huge differences in the IPv6 traffic share between the different projects we mirror, and most of the time it isn’t really clear why. One example where it is clear, though, is cygwin, with an IPv6 share of pretty much 0%: it uses a setup tool that downloads individual packages from the mirrors, and it seems this tool only does IPv4.

Mirrors generating most traffic in 2015

Here are the most used mirrors in 2015:

Rank | Mirror | Traffic 2015 in TB | Rank / Traffic 2014 (for comparison)
1 | qtproject (Qt Toolkit) | 199 | 1 / 172
2 | kiwix | 179 | – / 0
3 | mint/iso (Linux Mint ISOs) | 177 | 10 / 31
4 | opensuse | 139 | 2 / 106
5 | fedora | 100 | 6 / 44
6 | eclipse | 97 | 4 / 86
7 | cdn.media.ccc.de (Talk recordings from CCC conferences) | 83 | 8 / 32
8 | ctan (Comprehensive TeX Archive Network) | 63 | 7 / 40
9 | tdf (The Document Foundation – LibreOffice) | 48 | 5 / 48
10 | videolan | 40 | 3 / 88

Kiwix at rank 2 has only been mirrored since March 2015, so it generated that amount of traffic in just nine and a half months – it might well take over the top spot next year. The Ubuntu release ISOs dropped out of the Top 10 for IPv4. As is to be expected, the CCC conference recordings generate extremely peaky traffic after CCC events – setting a new record high of 11.3 TB in just one day on December 30 (after 32C3). It might even have been a little more, but unfortunately the webserver ran out of threads, because someone apparently distributed web torrent files with a 64 KB chunk size.

The list is slightly different for IPv6.

Rank | Mirror | IPv6 Traffic 2015 in TB | Rank / Traffic 2014 (for comparison)
1 | mint/iso (Linux Mint ISOs) | 22.1 | 10 / 3.1
2 | cdn.media.ccc.de (Talk recordings from CCC conferences) | 15.2 | 6 / 4.7
3 | kiwix | 14.5 | – / 0.0
4 | fedora | 11.5 | 7 / 3.6
5 | qtproject (Qt Toolkit) | 11.5 | 1 / 9.1
6 | opensuse | 10.8 | 3 / 6.6
7 | debian | 9.3 | 5 / 5.4
8 | tdf (The Document Foundation – LibreOffice) | 8.1 | 8 / 3.4
9 | eclipse | 7.4 | 4 / 5.8
10 | ubuntu-releases (Ubuntu CD Images) | 6.0 | 9 / 3.2

IPv6 traffic is up – from 7.86% over all mirrors in 2014 to 10.54% in 2015. Absolute IPv6 traffic is also up in 2015 for all mirrors in the Top 10. The share of IPv6 traffic varies greatly between the mirrors.

Changing filesystems: From XFS to EXT4

Since we moved the FTP to the (then) new hardware in October 2013, we had been using XFS as the filesystem on which we store all our mirror trees. All data resided in one single large XFS filesystem of 35 TB. That worked rather well, until we updated the operating system on the machine in January.

After the update, the machine was extremely unstable – in stark contrast to its 460 days of uptime before the update. Sometimes all file I/O would stop for a few minutes (up to 30), and then suddenly continue as if nothing had happened. Those hangs happened on average twice a day. Sometimes the machine would also lock up completely and need to be reset.
During the hangs, the machine would log many error messages like the following to the kernel log:

    XFS: possible memory allocation deadlock in kmem_alloc (mode:0x8250)

The problems were soon traced to the new kernel 3.13 that came with the update. Simply booting the old 3.2 kernel from before the update would get rid of the weird hangs and lockups. We first tried updating the kernel to 3.16, but to no avail – it showed exactly the same problem.

Apparently, there was a bug in newer XFS versions, so I took the problem to the XFS mailing list. The responses were surprisingly quick and competent: apparently there are situations where XFS needs large amounts of unfragmented kernel memory, and when such memory is not available, it essentially blocks until there is – which might be “when hell freezes over”. Until then, there is no I/O to that filesystem anymore. The suggested workaround – increasing vm.min_free_kbytes and similar settings, so that the kernel would be more likely to have enough memory immediately available for XFS – did not work out; hangs were still happening at an unacceptable rate, and an FTP server that is unavailable for 10 minutes twice a day isn’t exactly my idea of a reliable service for the public. The XFS developers seemed to have some ideas on how to “do better”, but it would not be simple. I did not want to wait for them to implement something – and even after they did, I would still have to run a hand-patched kernel all the time, which I was trying to avoid. So a switch of filesystems was in order. That was probably a good idea, because according to a recent post on LKML the situation is still unchanged and no fix has been implemented yet.
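
For completeness, the suggested workaround is a one-liner (the value here is just an example; we experimented with several):

    # Reserve more free memory so that large contiguous allocations are
    # more likely to be satisfiable immediately (did not help in our case)
    sysctl -w vm.min_free_kbytes=1048576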

We decided to switch to EXT4 in 64bit mode. Classic EXT4 would not be able to handle a filesystem of 35 TB, but in 64bit mode it can. That mode was implemented some time ago, but until recently the e2fsprogs versions shipped with distributions were unable to create or handle filesystems with the 64bit option.
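
Creating such a filesystem is just one mkfs option away (a sketch; the volume path is a placeholder):

    # -O 64bit enables 64-bit block numbers; without it, ext4 with
    # 4 KiB blocks is limited to 16 TiB
    mkfs.ext4 -O 64bit /dev/bigdata/ftpnew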

To switch filesystems with as little downtime as possible, we had to pull a few tricks.
Unfortunately, it is not possible to shrink XFS filesystems, so we could not simply shrink the existing filesystem to make room for an EXT4 one. So first, we temporarily attached another storage box with enough space for the new filesystem, created an LVM volume on it, and made an EXT4 filesystem in it. We then started to rsync all files over. That took about 3 days, but of course happened in the background without downtime. The next step was to temporarily stop all cronjobs that update the mirrors, do a final rsync run, and then unmount the old filesystem and mount the new one. That naturally caused a few minutes of downtime, but still went without too many users noticing. We then killed off the old filesystem and used LVM’s pvmove to move the data from the temporary storage box back to the space previously occupied by the old filesystem. This again happened in the background and completed in about a day, after which we could remove the temporary storage box. So far, we had done the whole move with less than an hour of downtime. The only thing left to do was to resize the EXT4 filesystem to fill all the available space – the filesystem created on the temporary storage box had been smaller, because that box had a smaller capacity than our regular RAIDs.
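
Condensed into commands, the procedure looks roughly like this. It is a sketch under assumptions: the mount points, volume names, and device paths (/srv/ftp, /mnt/new, bigdata/ftpold, bigdata/ftpnew, /dev/sdd) are all placeholders:

    # 1. Bulk copy to the new EXT4 volume while the mirror stays online
    rsync -aH /srv/ftp/ /mnt/new/

    # 2. Stop the mirror update cronjobs, then do a final delta pass
    rsync -aH --delete /srv/ftp/ /mnt/new/

    # 3. Short downtime: swap the filesystems
    umount /srv/ftp
    mount /dev/bigdata/ftpnew /srv/ftp

    # 4. Drop the old XFS volume, then migrate the new volume's extents
    #    off the temporary storage box back onto the regular RAIDs, online
    lvremove bigdata/ftpold
    pvmove /dev/sdd
    vgreduce bigdata /dev/sdd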

This was where we hit the next snag: running resize2fs to do an online resize of the filesystem would just send it into an endless loop. It turned out this is another known bug for large 64bit EXT4 filesystems: apparently nobody had ever tested resizing 64bit EXT4 filesystems that actually use block numbers larger than 2^32, which is why both the online and offline resize functions would try to stick a 64-bit block number into 32 bits and then naturally explode. Lucky for us, it just went into an endless loop instead of corrupting the filesystem by truncating some 64-bit block numbers to 32 bits…

As fixing the online resize would again have meant compiling and running a hand-patched kernel, the only real option was to do an offline resize with a recent (>= 1.42.12) e2fsprogs version. Unfortunately, that meant another downtime of a little over an hour for the resize and fsck. But in the end, it was successful.
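
The offline resize itself is short (a sketch reusing the placeholder names from above):

    umount /srv/ftp
    e2fsck -f /dev/bigdata/ftpnew   # resize2fs insists on a freshly checked filesystem
    resize2fs /dev/bigdata/ftpnew   # with no size given, it grows to fill the volume
    mount /srv/ftp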

After 8 days and numerous obstacles, we have successfully moved from XFS to EXT4 without data loss.

Mirrors generating most traffic in 2014

While we publish daily stats for all of our mirrors at https://ftp.fau.de/cgi-bin/show-ftp-stats.cgi, there is currently no aggregation over a whole year. So here are the most used mirrors over the whole year:

Rank | Mirror | Traffic 2014 in TB
1 | qtproject (Qt Toolkit) | 172
2 | opensuse | 106
3 | videolan | 88
4 | eclipse | 86
5 | tdf (The Document Foundation – LibreOffice) | 48
6 | fedora | 44
7 | ctan (Comprehensive TeX Archive Network) | 40
8 | cdn.media.ccc.de (Talk recordings from CCC conferences) | 32
9 | ubuntu-releases (Ubuntu CD Images) | 32
10 | mint/iso (Linux Mint ISOs) | 31

The list is slightly different for IPv6.

Rank | Mirror | IPv6 Traffic 2014 in TB
1 | qtproject (Qt Toolkit) | 9.1
2 | videolan | 7.4
3 | opensuse | 6.6
4 | eclipse | 5.8
5 | debian | 5.4
6 | cdn.media.ccc.de (Talk recordings from CCC conferences) | 4.7
7 | fedora | 3.6
8 | tdf (The Document Foundation – LibreOffice) | 3.4
9 | ubuntu-releases (Ubuntu CD Images) | 3.2
10 | mint/iso (Linux Mint ISOs) | 3.1

Still, only a small fraction of the traffic is via IPv6: 7.86% over all mirrors and the whole year. CTAN for some reason drops out of the IPv6 list (it would be at #17); its Top 10 slot is taken over by Debian (#16 in the total list), which has an unusually large IPv6 share. That large share can be attributed mostly to the local Computer Science Department, which has large pools of IPv6-capable Debian machines.

Experimental GeoIP statistics

Since the beginning of September, we’ve been creating experimental GeoIP statistics to see where the users of ftp.fau.de come from, and we’re drawing nice maps from the results. You can find these stats via our general FTP statistics page – for days where GeoIP stats are available, you will find a link to them at the bottom of the page. Here is an example:

[Map: worldwide distribution of ftp.fau.de traffic, 2014-12-29]

As you can see, unsurprisingly most of our users come from Germany. Besides the map, there is also a table that shows the respective percentages per country – in the map above, Germany accounts for about 50 percent of all accesses. Note that this counts the number of bytes transferred, not the number of accesses.

We also have a map of the stats per Bundesland within Germany:

[Map: distribution of ftp.fau.de traffic per Bundesland within Germany, 2014-12-29]

However, this second map is really mostly guesswork for a number of technical reasons:

  • To do the actual dirty mapping work, we’re using GeoLite data created by MaxMind, available from http://www.maxmind.com. This is a free-to-use version of their commercial GeoIP database, and since they need to make a living from their database, the free version is of course less accurate than the version available to their paying customers. It just returns “unknown” for the Bundesland in roughly 30% of all cases.
  • Due to technical limitations on our side, we’re currently using a rather dusty version of the GeoIP mapping library, which for example still has problems mapping IPv6 addresses (although it works sometimes). The quality of the mappings should improve significantly after we update to the current version in the first half of 2015.
  • Our logs are normally anonymized, i.e. we do not store the full IP address of a client, but trim (at least) the last octet of an IPv4 address. This further reduces mapping quality – a small sketch of the anonymization and lookup steps follows this list.
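
To illustrate those two steps, anonymization and country lookup look roughly like this on the command line. This is a sketch using the legacy geoiplookup tool; the IP address is a made-up example:

    # Anonymize: zero out the last octet of an IPv4 address
    echo "131.188.12.34" | sed 's/\.[0-9]\{1,3\}$/.0/'
    # -> 131.188.12.0

    # Map the anonymized address to a country using the GeoLite database
    geoiplookup 131.188.12.0
    # -> GeoIP Country Edition: DE, Germany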

Still, we do think it is a rather interesting experiment. While we were expecting most of our users to be from Germany, we found the actual percentage surprisingly high – we had expected somewhat more usage from other European countries.