Similar to the thought process for debugging a performance problem, here’s one for debugging a threading issue:
- In production, we have a problem where the threads in a forum listing disappear, leaving only one or two threads in their place. We found a workaround pretty quickly (clear the cache), but we want to stop the problem from occurring.
- For the better part of a week, we found it puzzling. A couple of moderators noted that the problem seemed to crop up after moving a thread from one forum to another.
- We still haven’t been able to reproduce the bug on demand. It sounds like something threading-related. These tend to be the mysterious-looking bugs.
- All right. I’m going to sit down and look at the relevant code now.
- I know the problem has something to do with the TopicRepository class which is where this data is stored.
- I see that the clearCache method, which empties the cache, is in fact called when we move threads.
- I also see addAll resets the forum list. This doesn’t seem to be called, since we only see a thread or two. Hmm. Let’s see where this is called. I see; it’s called when someone requests to view the list of topics in a forum. The code gets the topics from the cache. If there aren’t any topics cached or the cache reports as “not loaded” for the forum, it gets them from the database and loads the cache. Clearly this isn’t happening.
- I also see a method that adds a single topic to the cache. It is called when we post announcements. This looks promising.
- I tried locally to reproduce the bug and got it! We weren’t posting announcements though. Then I searched the code for other calls to this method. There is another call when we “update board status” which happens after someone replies to an existing thread. Eureka! This must be what’s happening.
- Now I can reproduce it on our sandbox server on demand. (See the path below.)
- How to fix it? Well, the topics being non-empty doesn’t seem like something I can control. The other part – the loaded flag – seems more promising.
- I see the flag isn’t cleared in the clearCache method. I’ll add that in and see what happens.
- The bug goes away! Great news!
The path that causes the bug to manifest:
- open two threads in the Test Forum
- move a thread from Test Forum to Another Test – this clears the cache
- in Test Forum, add a post to an existing thread – this adds one topic to the list
- reload list of threads for Test Forum – now it thinks there is one topic in the list and doesn’t look for more
- you only see the one edited thread in the list
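
To make the mechanics concrete, here is a rough sketch of how I picture the cache behaving. It is a toy model, not the real TopicRepository: the class name `TopicCacheSketch`, the `resetLoadedFlagOnClear` switch, the in-memory "database" and the thread names A, B and C are all made up for illustration. I also assume the forum starts with three threads and has been viewed once before the move, so the loaded flag is already set.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the topic cache; not the real TopicRepository. */
class TopicCacheSketch {
    // Stand-in for the database: forum name -> thread titles.
    private final Map<String, List<String>> database = new HashMap<>();
    // The cache itself, plus the per-forum "loaded" flag.
    private final Map<String, List<String>> cachedTopics = new HashMap<>();
    private final Set<String> loadedForums = new HashSet<>();
    // false models the behavior before the fix, true models it after.
    private final boolean resetLoadedFlagOnClear;

    TopicCacheSketch(boolean resetLoadedFlagOnClear) {
        this.resetLoadedFlagOnClear = resetLoadedFlagOnClear;
    }

    void saveToDatabase(String forum, String topic) {
        database.computeIfAbsent(forum, f -> new ArrayList<>()).add(topic);
    }

    /** Read path: reload if nothing is cached OR the forum is not marked loaded. */
    List<String> getTopics(String forum) {
        List<String> topics = cachedTopics.getOrDefault(forum, List.of());
        if (topics.isEmpty() || !loadedForums.contains(forum)) {
            topics = new ArrayList<>(database.getOrDefault(forum, List.of()));
            cachedTopics.put(forum, topics); // plays the role of addAll
            loadedForums.add(forum);
        }
        return List.copyOf(topics);
    }

    /** Adds a single topic to the cache, as happens after a reply. */
    void addTopic(String forum, String topic) {
        List<String> topics = cachedTopics.computeIfAbsent(forum, f -> new ArrayList<>());
        if (!topics.contains(topic)) {
            topics.add(topic);
        }
    }

    /** Empties the cached list; only clears the loaded flag when the fix is on. */
    void clearCache(String forum) {
        cachedTopics.remove(forum);
        if (resetLoadedFlagOnClear) {
            loadedForums.remove(forum);
        }
    }

    /** Moving a thread updates the database and clears the cache for both forums. */
    void moveTopic(String topic, String from, String to) {
        database.getOrDefault(from, new ArrayList<>()).remove(topic);
        saveToDatabase(to, topic);
        clearCache(from);
        clearCache(to);
    }

    /** Replays the path above and returns what the final forum listing shows. */
    static List<String> replayPath(boolean resetLoadedFlagOnClear) {
        TopicCacheSketch cache = new TopicCacheSketch(resetLoadedFlagOnClear);
        for (String topic : List.of("A", "B", "C")) {
            cache.saveToDatabase("Test Forum", topic);
        }
        cache.getTopics("Test Forum");                      // earlier view sets the loaded flag
        cache.moveTopic("C", "Test Forum", "Another Test"); // clears the cache
        cache.addTopic("Test Forum", "A");                  // reply puts one topic in the cache
        return cache.getTopics("Test Forum");               // reload the list of threads
    }

    public static void main(String[] args) {
        System.out.println("flag not reset: " + replayPath(false)); // prints [A]
        System.out.println("flag reset:     " + replayPath(true));  // prints [A, B]
    }
}
```

With the flag left alone, the final read finds one cached topic plus a stale “loaded” marker and never goes back to the database. Once clearCache resets the flag, the same read falls through to a reload and both remaining threads come back.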