CloseHandle vs. ReleaseMutex

Given that a Win32 mutex is "abandoned" when the owning thread exits without releasing it, you might assume that closing a mutex handle without releasing the mutex has the same effect. It doesn't. Consider the following sequence of events:
  1. Process PA creates and owns named mutex M.
  2. Process PB "creates" a mutex by the same name. (GetLastError returns ERROR_ALREADY_EXISTS, so PB knows it didn't really create M, but that's beyond the scope of this discussion.)
  3. PA has done whatever it needed to, decides it is done with M, and closes its handle to M. M still exists, of course, because PB still has a handle to it. But M has just entered a "limbo" state in which it's useless to everyone and everything.
  4. PB attempts to "own" the mutex by passing it to WaitForSingleObject, and let's suppose for the sake of discussion that PB passes an INFINITE time-out to WaitForSingleObject.
Here's an alternate sequence of events:
  1. Process PA creates and owns named mutex M.
  2. Process PB "creates" a mutex by the same name. (GetLastError returns ERROR_ALREADY_EXISTS, so PB knows it didn't really create M, but that's beyond the scope of this discussion.)
  3. PB attempts to "own" the mutex by passing it to WaitForSingleObject, and let's suppose for the sake of discussion that PB passes an INFINITE time-out to WaitForSingleObject.
  4. PA has done whatever it needed to, decides it is done with M, and closes its handle to M. M still exists, of course, because PB still has a handle to it. But M has just entered a "limbo" state in which it's useless to everyone and everything.
In both sequences, PB is now stuck; as long as PA keeps running, WaitForSingleObject will not return. The only difference is that in the second sequence I expected PB to block after step 3 and then unblock after step 4. It's possible PA could "create" another handle to the mutex and pass that handle to ReleaseMutex. I didn't try that, but it would be consistent with the notion that CloseHandle closes only a handle to the mutex object, which exists in the kernel independently of any handle. Still, PA "creating" the handle again would be weird. I expected CloseHandle to have the same effect as PA exiting, and I was wrong.
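
The practical upshot of the two sequences is that ownership has to be given up explicitly before the handle goes away. What follows is only a minimal sketch of that ordering, assuming the calling thread currently owns the mutex exactly once. ReleaseThenClose is a hypothetical helper name, not something from the experiment described above, and the block is guarded so that it compiles to nothing on non-Windows systems.

```cpp
#include <cassert>

#ifdef _WIN32  // Win32-only sketch; nothing is compiled on other platforms
#include <windows.h>

// Hypothetical helper: release a mutex this thread owns, then close the handle.
// Returns true only if both steps succeed.
bool ReleaseThenClose(HANDLE mutex) {
    assert(mutex != nullptr);

    // ReleaseMutex ends the calling thread's ownership, which lets a waiter
    // (PB in the sequences above) acquire the mutex. It fails if the calling
    // thread does not own the mutex.
    const bool released = ReleaseMutex(mutex) != FALSE;

    // CloseHandle drops only this process's handle. The kernel object lives on
    // for as long as any other handle to it remains open.
    const bool closed = CloseHandle(mutex) != FALSE;

    return released && closed;
}
#endif
```

If the thread acquired the mutex more than once, a single ReleaseMutex call is not enough; it would need one release per successful wait before the close.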

Safari sees through filename extension lies

Here is the main text of a bug report I filed this morning with Apple:

If I use Photoshop to save a PSD file, and then rename that file so that its filename extension indicates it's a GIF file, I've just created a bogus file which most applications will not be able to figure out. Firefox and Internet Explorer display empty boxes. Photoshop itself cannot open such a file, even though internally the file is in Photoshop's native format.

Unlike most applications, Safari figures out that the filename extension is a lie and displays the image correctly. Ordinarily, I would pat Safari on the back for being better than everybody else. However, in this case, it can cause confusion.

Suppose you're learning web publishing and you make the mistake of thinking you can change a filename extension and the file will be magically translated into the corresponding format. This is a pretty common mistake among learners. If you are authoring a site with Safari as your test engine, you can "get away with" this mistake. However, when you test with other browsers, you discover your images don't work. You get empty boxes.

Here's why this is a particularly toxic situation: Part of learning web publishing is learning HTML. While you're learning, you make lots of HTML mistakes, and HTML mistakes resulting in empty boxes instead of images are quite common. When you see the empty boxes caused by the filename extension mistake, you assume there's a problem in your HTML, and you stare at it for hours, scratching your head, getting nowhere, because your HTML is perfect.

So, as annoying as it may seem, I am asking you to make Safari stupider. Please make it stop seeing through filename extension lies. No other app does it, and nobody makes Safari-specific sites, so there's no upside, and the downside is that it confuses people who are trying to learn web publishing.
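
For readers wondering what "seeing through" an extension means in practice, here is a small illustrative sketch. It is not Safari's actual logic, which the bug report does not describe. It only assumes the well-known leading bytes of the two formats mentioned above ("GIF87a" or "GIF89a" for GIF, "8BPS" for PSD), and the names ImageKind, KindFromContent, KindFromExtension and ExtensionLies are made up for the example.

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class ImageKind { Unknown, Gif, Psd };

// Guess the format from the first bytes of the file (content sniffing).
ImageKind KindFromContent(const std::string& bytes) {
    if (bytes.compare(0, 6, "GIF87a") == 0 || bytes.compare(0, 6, "GIF89a") == 0) {
        return ImageKind::Gif;
    }
    if (bytes.compare(0, 4, "8BPS") == 0) {
        return ImageKind::Psd;
    }
    return ImageKind::Unknown;
}

// Guess the format from the filename extension alone.
ImageKind KindFromExtension(const std::string& name) {
    const std::string::size_type dot = name.rfind('.');
    if (dot == std::string::npos) return ImageKind::Unknown;
    std::string ext = name.substr(dot + 1);
    for (char& c : ext) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    if (ext == "gif") return ImageKind::Gif;
    if (ext == "psd") return ImageKind::Psd;
    return ImageKind::Unknown;
}

// True when both guesses are known and they disagree.
bool ExtensionLies(const std::string& name, const std::string& bytes) {
    const ImageKind byName = KindFromExtension(name);
    const ImageKind byContent = KindFromContent(bytes);
    return byName != ImageKind::Unknown && byContent != ImageKind::Unknown &&
           byName != byContent;
}
```

An application that trusts only KindFromExtension will try to decode a renamed PSD as a GIF and fail; one that consults KindFromContent can notice the mismatch and either decode the real format or report the lie.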

endless WM_COMMAND/BN_CLICKED

When the user selects a radio button, the button sends a WM_COMMAND/BN_CLICKED notification message to the button's parent. This lets the program which owns the parent take action based on the new selection.

This works fine as long as nothing could possibly go wrong during said action, but of course that's almost never the case. Suppose the action involves allocating a lot of memory and the allocation fails; what then? A good program will know which radio button was set before the user clicked and restore that button (before or after informing the user of the failure).

Now the fun begins.

If the program restores the old radio button while processing the message, Windows* decides to send an identical message for that very same button as if the button had been clicked again. This doesn't happen right away; it's deferred in some tricky way, and believe me it's not worth trying to second-guess the deferral. A straightforward program is likely to find itself in an endless loop not of its own making. The user clicks the radio button, the program tries to allocate a bunch of memory, fails, restores the old radio button, Windows sends another message as if the user clicked the radio button again, the program tries to allocate a bunch of memory... and hilarity ensues.

The way out of this madness is to observe that the radio button is not set the second time the system informs the program that it was clicked. When the radio button is set, act; when it's not, don't. It's not intuitive that one must make this test, but one must.
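
Here is a rough sketch of that test, assuming auto radio buttons in a dialog whose parent receives WM_COMMAND. RadioGroup, OnRadioClicked and HandleRadioCommand are hypothetical names, and the fallible work is passed in as a callable. The decision is kept separate from the Win32 glue, which is guarded so it only compiles on Windows.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical bookkeeping for one radio group.
struct RadioGroup {
    int current;  // ID of the button whose selection last succeeded
};

// Decide how to react to a BN_CLICKED for button `id`.
// `isChecked` is the button's real state at the time of the notification.
// `apply` does the fallible work for `id` and returns false on failure.
// Returns true when the caller should re-check group.current.
template <class Apply>
bool OnRadioClicked(RadioGroup& group, int id, bool isChecked, Apply apply) {
    if (!isChecked) {
        return false;  // notification for a button that is not set: do nothing
    }
    if (apply(id)) {
        group.current = id;  // success: remember the new selection
        return false;
    }
    return true;  // failure: the previous button must be restored
}

#ifdef _WIN32
// Glue for the parent's WM_COMMAND case. The caller is assumed to have
// checked that the control ID belongs to this radio group.
template <class Apply>
void HandleRadioCommand(HWND dialog, WPARAM wParam, LPARAM lParam,
                        RadioGroup& group, int firstId, int lastId, Apply apply) {
    if (HIWORD(wParam) != BN_CLICKED) return;
    const int id = LOWORD(wParam);
    const HWND button = reinterpret_cast<HWND>(lParam);
    const bool checked =
        SendMessageW(button, BM_GETCHECK, 0, 0) == BST_CHECKED;
    if (OnRadioClicked(group, id, checked, apply)) {
        // Restoring the old button is what provokes the repeat notification;
        // by then `id` is no longer checked, so the repeat is ignored.
        CheckRadioButton(dialog, firstId, lastId, group.current);
    }
}
#endif
```

With this shape, the failing work runs once per real click: the repeat notification arrives for a button that is no longer set and falls through the first test.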

* My testing was on Vista only.

Mac OS X 10.4 parental controls vs. Microsoft Office 2004

Tonight, I set up Microsoft Office 2004 to run in a Simple Finder configuration under the parental controls of Mac OS X 10.4 (Tiger), but it took a lot longer to get running than it should have. The first application I tried to enable was Microsoft Word. Although the parental controls window clearly showed Word as having been enabled, Word did not appear in the applications window when I logged in as the user in question. After many frustrating attempts to rectify the situation, I finally got lucky. I enabled the full Finder and launched Word, which went through its "first run" ritual for the user in question. Office applications, of course, need to perform this ritual once for each user. After I logged out and back in again, Word appeared in the Simple Finder window. My theory is that the parental controls in Tiger know that this configuration is not capable of supporting the first run of an Office application, and so they quietly hide the app from the user. This is a user-hostile way to handle the issue; I would have preferred to see Word try and fail to perform its first run ritual.

turn off Visual Studio 2005 automatic manifest resource generation in converted projects

When Visual Studio 2005 upgrades an old project, it turns on a setting which tells the linker to create a manifest resource in your program file. This resource seems to express a dependency on a DLL version of the C/C++ standard libraries. This causes the system to refuse to load your program unless said DLL is present. Your users will get an error indicating something about an invalid configuration and suggesting they reinstall. Apparently, the relevant libraries are not present on Windows XP SP2 (I haven't checked Vista), which causes the system to believe it should not even try to run some otherwise perfectly good programs.

This is particularly vexing because some applications, including some bundled with Windows, install these libraries, and it can be difficult to figure out why some systems fail and others do not. One such application is, you guessed it, VS2005, so you will never be able to reproduce this problem on the system with which you built your program. Another seems to be Internet Explorer 7, so anybody who's got automatic updates turned on won't be able to reproduce this problem, either. Anybody who's still running IE6 (or earlier) without VS2005 probably will not be able to run your program.

Adding insult to injury, my programs already contain carefully constructed manifest resources and/or have carefully chosen linker options so that they don't take external dependencies which may or may not be present on deployment systems.

Fortunately, the fix was simple. There's a new linker option which controls generating this manifest resource. I turned it off and things went back to normal. It seems I could also have directed users to the Microsoft web page where they may download the C/C++ runtime libraries from VS2005. (Perhaps Microsoft ought to add these libraries to Windows Update.) Or, it seems, I could have redistributed these libraries with my programs (after deciding I am compatible with the relevant legalese).

I understand why Microsoft would want to add the capability of generating a manifest resource to the linker, but I don't understand why they would want to turn it on by default in such an annoying way. This cost me hours. It's partly my fault for not reading the upgrade report carefully; it did mention something about manifests. Had I been super-careful, I would have looked deeper into this potential problem and perhaps found the actual problem earlier. But since it didn't cause any trouble right away, I was lulled into a false sense of security that Microsoft had not done anything risky to my project.

VS2005 and vswprintf

When you upgrade an existing C++ program to Visual Studio 2005, the compiler will tell you that any calls to vswprintf lack an argument specified by the C++ standard. Through the magic of C++ function overloading, your code still works, because Microsoft provides both the old-style proprietary three-argument function and the new-style standard four-argument function. The compiler tells you about the new function by means of a warning. If you're a smart programmer, you've told the compiler to treat warnings as errors, so that they prevent an object module from being generated, and that gives you a fair amount of motivation to adopt the new function and make the warning go away.

The new parameter is the count of characters in the output buffer. After reading the documentation for this parameter, you might assume that, like the return value, it does not include the terminating 0 character. That assumption would be wrong: if you pass the number of characters you expect as a return value, the function returns -1 and sets errno to ERANGE. You need to pass that number plus 1 (after, of course, making sure your buffer is large enough for the terminating 0). MSDN does not document this, and after having guessed at it myself, I searched the web until I found a Linux man page which told me my guess actually coincided with somebody's reality. I've chosen to assume Microsoft plays by the same rules.
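
To make the off-by-one concrete, here is a minimal sketch using the standard four-argument vswprintf. FormatWide is a hypothetical wrapper name. It assumes the caller passes the full capacity of the buffer in wchar_t units, which includes the slot for the terminating 0.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cwchar>

// Hypothetical wrapper: format into `buffer`, which can hold `bufferCount`
// wide characters INCLUDING the terminating 0.
// Returns the number of characters written, NOT counting the terminating 0,
// or a negative value if the result does not fit.
int FormatWide(wchar_t* buffer, std::size_t bufferCount,
               const wchar_t* format, ...) {
    assert(buffer != nullptr && bufferCount > 0);

    va_list args;
    va_start(args, format);
    // The second argument is the buffer capacity, terminator included,
    // so it is one more than the largest return value that can succeed.
    const int written = std::vswprintf(buffer, bufferCount, format, args);
    va_end(args);
    return written;
}
```

For example, formatting L"%d-%d" with 42 and 17 yields the five characters "42-17". A six-element buffer passed with a count of 6 succeeds and returns 5, whereas passing 5 as the count fails with a negative return value even though 5 is the value you expect back.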

Microsoft hell

This story confirms the sense I have gotten over the past few years that the Microsoft work environment has descended into hell. I personally know demonstrably bad people who've been hired there, and there is of course the ongoing suppurating sore that is Mini-Microsoft, and Ballmer is a blatant loser dork who desperately cries out for amputation. But the story puts it in terms geeks can understand.