I really dislike it when ill-informed people write blogs, articles, and even books about things they know really nothing about. In the C# world, there a couple of topics that are the subject of literally thousands of ill-informed blog and forum posts, for which the real, correct advice is buried and lost in the noise and chatter of supposed “experts” offering words of “wisdom” to newbies out there. The couple of topics that I see come up again that I find particularly upsetting are whether or not it is okay to ever call “lock(this)”, and whether or not, and how-often, anyone should call “GC.Collect()”.
Warning! I’m going to completely come as an asshole here. I’m just in a mood, because I don’t like it when people walk into my office trying to claim that I’m doing something wrong, simply because some bad advice exists in the official documentation…. and the misinformation out there regarding these two topics seems to have been parroted thousands and thousands of times all over the internet, so I feel like I need to shout a little louder than the rest of the people out there.
1) “lock(this)” basically imposes a critical section lock on the current object, however the advice out there, including official documentation from Microsoft, claims that you should “never” do this. I will call out the bullshit on this advice.
2) “GC.Collect()” forces the garbage collector to collect the junk that is laying around. This official MSDN blog claims that the 1st rule of when to call GC.Collect() is “Don’t”, but then proceeds to completely contradict itself in rule #2. Again, I will call out the bullshit on this advice.
When reading stupid advice like this, it is important to understand who the authors’ audiences are. Only by deducing who the intended audience is can you truly understand the authors’ intended meaning of the words “never” and “don’t”. If I followed my mother’s advice when I was a child when she said “never ever ever touch the stove!”, then as an adult, I’d have one gross, disgusting stove, because I would never know that is okay to touch the stove when it is not hot, and sometimes necessary when you want to clean it.
In the case of #1 and #2 above, I’m fairly certain that the target audience is corporate hacks who spend their days churning out boring form after form, probably in HTML format, churning out XML documents, and checking off todo-items in their CRUD matrices (don’t get me started on CRUD). These are the simplest of programming tasks, intended for simpletons who have no taste for software architecture and leave their desks promptly at 5:00pm to go home and watch sports and/or reality television. To those people who have no understanding of the innards of what it is they’re building, you should never trust them to do any kind of multi-threaded code, or build anything other than the most basic of boiler-plate class libraries, therefore, whether or not to use lock(anything) should never cross over into their pay-scale. They follow simple instructions outlined in corporate training documents that read something like “Step 1). Choose add new->C# Class Step 2)…. ”
This blog post is not intended for that audience, but for the people out there that actually want to understand why and when to do these things that everyone says NOT to do. If you read the most accepted answer as to why lock(this) is bad on Stack Overflow, you’ll find that it boils down to a design philosophy discussion. In some designs, even good designs, I will argue that there are many situations where lock(this) is a good idea and the words “never” should not have entered any official Microsoft documentation. The official documentation should contain only facts, not opinions, and putting opinions among a factual document sends the wrong message. I’ve worked on C# projects in the past where my coworkers would actually come in to my office and complain that I was “doing it wrong” referring to some code that I wrote containing “lock(this)”. I promptly put them in their place.
Take this simple hypothetical class, for example. This example is basically a shared list of objects, useful in very common situations where you might have two threads competing for a list of stuff without stepping-on each other or crashing the app. Maybe the list is too big to be copied among threads (private copies are typically more ideal). There are all kinds of scenarios where this list might actually need to be truly shared and in shared memory, so don’t go flaming me like (idiot below) did simply because I’m using a simple example. Maybe in real life the actual implementation of this list isn’t just a List<> but is something more complex like a B-Tree, AVL-Tree, Maybe a hybrid AVL/Linked-list supporting duplicate keys and multiple roots allowing 64-cores to independently sort the tree roots.. efficiently maintaining a 192GB RAM cache (which is my favorite list)… but for the sake of simplicity… let’s just call it a friggen List<>… but one that is shared.
public class SharedList { List<SomeObject> _list; public int Count { get { lock(this)//<---coding standards clearly violated { return _list.Count; } } } public SomeObject this[int idx] { get { lock(this)//<---I'm a bad bad man { return _list[idx]; } } } public void Add(SomeObject obj) { lock(this)//<--- So bad, horrible. { _list.Add(obj); } } public void Clear() { lock(this)//<--- Breakin' the law { _list.Clear(); } } }
The advice out there is “You should never call lock(this). Instead call lock against a private variable.” In the above example, I have clearly violated this advice. Why did I do this? Why didn’t I call lock(_list) instead of lock(this). Well, the simplest answer is if I were using an instance of SharedList, I might want to impose an external lock on it. I might declare a variable, possibly a singleton that is of type SharedList and then be able to perform multiple operations on this object without it being corrupted by another thread.
The above example is thread-safe within a single operation. Add() is thread safe, Delete() is also thread-safe, and the this[] array property is also thread-safe. In this design, if you, as the user of this class, want to add 5 objects to the list, uninterrupted by other threads, you can call lock on the object yourself before manipulating the list to keep others out. In this scenario, lock(this) is clearly the best choice, because calling lock on an internal, private object cannot be done externally, thereby offering the user no interface to add multiple items without writing some unnecessary additional code. For these scenarios, it is good to have the option to lock the list externally. You could argue that maybe I should create a version of Add that accepts and array[] of SomeObject… that may work for certain designs, but it might not work for other designs… or it might just be a design preference. In my tool belt, I have lots of tools for doing the right thing at the right time… and different things are appropriate at different times due to the way the calling code might be structured.
But assuming that this class is the most appropriate form for what I need… If I called lock(_list) instead of lock(this), I would have no way of externally performing multiple subsequent operations on this object without risking potential corruption from other threads unless I exposed public void Lock() and public void Unlock() functions that internally called monitor.Enter(_list) and monitor.Exit(_list) internally… which in a nutshell is exactly the same damn thing.
Avoiding calls to lock(this) does not actually resolve any technical problems. My programmer intuition tells me that the official facts offered by Microsoft on this subject are total bullshit, and I have never seen any evidence of instability, or deadlocking due to my use of lock(this). There is no difference between calling lock(whatever) from the outside the class and calling lock(this) from inside the private scope of “whatever”. Literally calling lock(whatever) is in essence the same as calling lock(this)… the only difference is the context, therefore no one can try to convince me that calling lock(this) is somehow unsafe.
I’ve seen other arguments out there like, “you’re not supposed to call lock(this) because it is ‘bad software design'”. To that I say, “who the hell are you to tell me what is or isn’t good software design?” I think that when you’re designing software you take different design approaches into account depending on the context and requirements of each particular class you’re designing.
Take, for example a situation where you have a class with two private references to shared objects which we’ll call “A” and “B” for simplicity. The whole class and therefore A and B are both accessed by many threads. In an ideal world, you would lock(A) when you need access to “A” and lock(B) when you need access to “B”… and should only keep them locked during the hopefully-brief amount of time you need to make the function calls you need.
First of all not all scenarios are quite so “ideal” as this one. What-if the changes made to B need to coincide with the changes made to A? Well… then you’d have to lock A… then B. Or is it B then A? Be careful to always lock them in the same order… always! Because if you don’t, there might be a rare case in which two different threads cause a deadlock. You might want to add some comments, issue some corporate memos, or email “everyone@yourcompany” to warn them not to do it in the wrong order.
But what if some new-guy comes around and doesn’t get that memo and writes a function that locks B then A? Well… then you have a deadlock… and if you’re lucky, it deadlocks often and predictably, else have fun staying up all night while your data-center operator calls you complaining about your newest deployment crashing all the time.
In this scenario, it might be a “safer” design for you to have just ONE lock that covers both A and B, so why not lock(this)? It may be more important for my clients to have STABLE code, rather than fast code, and readability often equals maintainability.
The Control Data engineers I used to work with used to always say “First write code that works, then make it work fast.” In agile software development, this, in my opinion, is good advice. Agile refactoring is standard practice. To paraphrase my wise colleagues, I say, “Write something that works, that is safe, that is easy to read, easy to maintain, and meets all the requirements, then refactor it for speed only if it is determined that it totally necessary.” With this advice in mind, I typically would choose the broader lock over the finer one, if only for safety reasons. I write multi-threaded apps that can get quite complex, with many gears moving at the same time. I don’t write corporate accounting software…. I write complex engines for audio and video mixing, effects, intelligent lighting control, massive disk storage arrays, and data warehousing….. and no matter how careful I am, the occasional deadlock still comes up. The best advice I can give to a programmer doing typical, corporate tasks is, keep it simple, put a single lock on the whole object to make it less prone to deadlocks and easy to maintain next year when you’ve forgotten about everything you wrote. Only grease the gears if it is part of backbone of the performance of your application.
In today’s reality, the only time you need to speed up anything is if it is something that will be called thousands of times, or even 10s of thousands of times in a short time frame. Try writing an iSCSI server for example, which needs to simulate a disk-drive over the network… then you’ll need to optimize the hell out of everything… but your piddly little corporate IT business app… not so much. Unless you need the performance boost, you should write easy-to-read, simple code, and most performance optimizations can be accomplished by simply executing LESS code to do a particular task… or in the case of a giant corporation, throw money and servers at it, design your system with isolated, scalable lanes (the bowling alley approach). As far as I can tell, there is simply nothing technically wrong with lock(this).
From what I can tell, the architects of the C# language are clearly not all that enthusiastic about threads. The C# classes for dealing with Threads are just as poorly thought out as the old Microsoft C++ APIs, and I’m pretty sure the Microsoft official line regarding “lock(this)” came from someone other than their most brilliant engineers… and probably came some documentation hack who doesn’t actually touch the architecture.
Let’s face it, C# .Net was not designed to be a multi-threading enthusiast’s wet dream. Automatic reference counting is a performance nightmare for multi-core CPUs, and the language as a whole is unfortunately heap-intensive. Thankfully, the “unsafe” keyword and a good AOT compiler can get around some of the bullshit automation.
2) Regarding GC.Collect(). They say “don’t” call GC.Collect(). That would be just great if the garbage collector had half a brain. The C# garbage collector is pretty horrible and overbuilt, however, and doesn’t really know how or when to clean up really important things. All the optimization of memory usage would all be better accomplished with a fancy multi-threaded heap manager like Big Brain rather than working with entire objects in the way it does. For example, what if I bury a UDP listener inside a whole mess of intertwined objects and then simply dereference the outermost object. In any automatic reference counting (ARC) scenario, once that outermost object’s reference count hits zero, it is supposed to destroy itself, also removing references from all the inner objects as well, correct? Well, the C# garbage collector, at times, decides to take its sweet merry time and may not release that object for a very long time… as-in… never… and that listener, therefore, might still actually be listening when I don’t want it to be. It doesn’t even seem to be smart enough to know that maybe it should pick up the pace during application shutdown. It isn’t smart enough to understand that objects that consume external resources, like file-handles or sockets, probably should get cleaned up right away without delay.
I see problems all the time where applications refuse to shutdown because the garbage collector won’t destroy certain objects, leaving them in a waiting state, Dispose() sometimes doesn’t help or cannot help you. You might try and suggest that maybe I didn’t set all my instances to null, properly maintain my [weak] references… or verify the non-existence of circular references. Maybe, you think I’m just a dufus. But GC.Collect(), by law, cannot help me if there are circular references, so all those possible scenarios would be ruled out if CG.Collect() actually helps.
I feel like it should at least be smart enough to know that it should pick up the pace when the application is shutting down, but GC.Collect() is particularly poor at cleaning up objects when there are background threads involved, even if I carefully stop and cleanup all references to my background threads, I find that it often takes a GC.Collect() to actually finalize the cleanup.
Essentially the C# Garbage collection system is only marginally useful for cleaning up local variables in a function. It is a poorly thought-out, poorly implemented, homage to Java that doesn’t really even need to exist.
Okay… I’ve got that off my chest… back to work!
all this ego and cringe walls of text just to say that you made a wrapper for a thread safe list?
I can’t believe I actually read this post, you should consider life choices and stop being a waste of space, leave oxygen for people who don’t waste it
Well, clearly whatever demographic you belong to was a target of this post! Clearly I make lots of things, a thread-safe list seemed a an example good enough for whoever it is that I believe needs some convincing. Happy coding!
I appreciate your robust defense of “lock(this)” and “GC.Collect()”. As you rightly argued, there’s often a discrepancy between the theory promulgated through official documentation and the practical realities we face when programming. Understanding the intended audience of these instructions is key. Still, it might save a lot of heartache for developers like us if it was explicitly stated that these are guidelines rather than hard rules. After all, one size doesn’t fit all in software design – wouldn’t you agree?
Oh, the irony, Giselle! You hit the nail right on the head. They oughta slap a “guideline, not gospel” sticker on those docs. Flexibility’s the name of the game in software land. Anyone claiming absolute dos and don’ts tends to be a bit outta touch, I’d say.