Why you should shut down an application gracefully even though it’s difficult and you think you don’t need to

UrBackup is a highly threaded application. Currently, approximately five threads are started for every client a server has. Additionally, there is a thread pool for requests to the web interface.
Some people would claim this is a difficulty in itself, because managing resources gets harder the more threads you have. Anticipating this, I built UrBackup using message passing and the implicit synchronization it provides.
Partly because of this I didn’t have that many problems.
What is a problem is how to tear down all those threads again. One solution, which is currently “in use”, is to let the operating system handle that. Let me elaborate on this:

The assumption is that you designed your application in a way that it can be forcefully stopped at any time. You should do that for every application that saves some kind of data, because nobody can guarantee that the procedure which stores your data will never be interrupted and never started again. This interruption can be caused by a shortage of memory, a user killing the process, or a power outage or hardware failure killing the computer.

Now designing an application like this can be hard. For example, you write some settings to a file. You cannot write them directly to that file, because the write could be interrupted, leaving an invalid settings file behind. The common pattern in such a case is to write the settings to a temporary file and then rename that file to your settings file. This works because the POSIX standard defines the rename operation as atomic: your filesystem guarantees that the rename either happens to the whole file or not at all. (There are some different interpretations of this standard, however, which caused some issues, e.g. with btrfs.)
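The temporary-file-and-rename pattern could be sketched like this in Python. This is only an illustration under some assumptions, not code from UrBackup: the function name write_settings_atomically, the JSON format and the ".settings-" prefix are all made up for the example.

```python
import json
import os
import tempfile


def write_settings_atomically(path, settings):
    """Write settings to a temporary file, then rename it over the target.

    Hypothetical helper for illustration; the JSON format is an assumption.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so that the rename
    # stays within one filesystem (a rename across filesystems is not atomic).
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".settings-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(settings, tmp)
            tmp.flush()
            # Ask the OS to push the data to disk before the rename happens.
            os.fsync(tmp.fileno())
        # Replace the old settings file in a single rename step.
        os.replace(tmp_path, path)
    except BaseException:
        # The write or the rename failed: remove the leftover temporary file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process is killed before os.replace runs, the old settings file is still intact and only a stray temporary file remains. Whether the rename itself survives a power loss also depends on the filesystem, as noted above.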

Doing this with a lot of data is of course not efficient enough. You should use some library which does that for you. UrBackup uses SQLite (which is an embedded database).
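To show what such a library buys you, here is a small sketch using Python's built-in sqlite3 module. The table layout and the function names open_db and save_settings are assumptions made for this example and have nothing to do with UrBackup's real schema.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the (hypothetical) settings table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings ("
        "key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_settings(conn, settings):
    """Store all settings in one transaction: either all of them or none."""
    # Used as a context manager, the connection commits when the block
    # finishes and rolls back when an exception escapes from it.
    with conn:
        for key, value in settings.items():
            conn.execute(
                "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
                (key, value),
            )
```

If the program is interrupted in the middle of save_settings, the transaction is never committed, and the database still holds the previous, consistent set of settings the next time it is opened.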

So you painstakingly did everything in a way that your program can be stopped at any time without data being corrupted or too much data being lost. Do you still need to shut down your program gracefully? Assuming that the only data lost is data that can be regenerated: No.

You can approach the problem from the other side as well. Assume that you have time t during shutdown to write out data so that nothing important gets lost. Then you have to write your program in a way that only so much data accumulates that you can still save it within t. Because sometimes you don’t know what t is, or how long it is going to take to save your data, you have to save important data as fast as possible. This means that the data which you would write at shutdown is not important, so saving it is not necessary: You do not need to save anything at shutdown.

So if every application does something like that, why does e.g. Windows not encourage the users to turn off the computer by removing their power source?

The first point here is of course that not every application does that. The second point is that sometimes you do not want them to do it. E.g. your laptop disk spins down, the application saves a little bit of data, and the disk has to spin up again in order to save it.

The third and most important point is that often it cannot be guaranteed with one hundred percent probability that the data is not corrupted by something like this. This can be the fault of the disk’s firmware or hardware, of the filesystem driver or of your program. Do you still remember Windows laboriously checking your filesystem after you shut it down the hard way (it still does that sometimes)? This was in order to restore the integrity of the filesystem. Filesystems have since gotten better at guaranteeing that integrity after a forceful shutdown, but not with one hundred percent probability. They could guarantee it, but only with a performance penalty (that’s why it’s not done).

UrBackup has the same problem. It is guaranteed only with a probability close to one hundred percent that its database is not corrupted after such a shutdown. There would be an option to further reduce this very small probability of corruption, but it comes with a huge performance penalty.

Of course the opportunities for such a corruption to occur should be minimized. This means that, when possible, one should not rely on the assumption that a program can be interrupted at any time.

That means for UrBackup that I will have to sit down and work on a clean shutdown at some point in the future.
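For a message passing design, one possible shape of such a clean shutdown is to send every thread a stop message through its normal inbox and then wait for it to finish. The following Python sketch only illustrates that idea. The names Worker, STOP and shutdown are hypothetical, and this is not how UrBackup is implemented.

```python
import queue
import threading

# Hypothetical sentinel message that tells a worker to finish.
STOP = object()


class Worker(threading.Thread):
    """A thread that only communicates through its inbox queue."""

    def __init__(self, name):
        super().__init__(name=name)
        self.inbox = queue.Queue()
        self.handled = []

    def run(self):
        while True:
            message = self.inbox.get()
            if message is STOP:
                break  # leave the loop so the thread can end normally
            self.handled.append(message)  # placeholder for real work

    def stop(self):
        # The stop request is queued behind pending messages, so work that
        # was already submitted is handled before the thread exits.
        self.inbox.put(STOP)


def shutdown(workers, timeout=5.0):
    """Ask all workers to stop, wait for them, and return the names still running."""
    for worker in workers:
        worker.stop()
    for worker in workers:
        worker.join(timeout)
    return [worker.name for worker in workers if worker.is_alive()]
```

A caller would start the workers, feed their inboxes during normal operation, and call shutdown(workers) when the application is asked to exit. Any names it returns belong to threads that did not stop within the timeout, and for those the fallback is still the forced stop described above.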
