Friday, December 4, 2009

WeakReference for lazy loading and releasing

One thing that has always bothered me when working with NHibernate and lazy objects is the following problem:
NHibernate automatically generates a wrapper proxy for your objects and fills them only when their data is requested. So far so good, but what about when I want to get rid of them?

Let's start at the beginning.
Many applications need to process large datasets, and these datasets are often far too large to fit in memory.
It is also common for an application not to need all of the records all of the time.
Because of this, ORMs like NHibernate support lazy loading: you pay with an extra fetch for the benefit of a faster initial query and lower memory usage.
When a proxied object is accessed, its data is loaded into memory and remains there until some explicit action is taken.
In any case, a proxied object will remain in memory as long as its session is active and the object is attached to it.

Now, let's look at the case of an application like the one I am currently implementing. I have a very large dataset that I wish to present to my users.
They can only see and use a small portion of the dataset at a time, but they need to be able to access all of it.
Our interface of choice is a single data table, and our record counts are in the hundreds of thousands for the average case and in the millions for the larger sets.

Loading 100k records into memory, while possible, takes about 24 seconds on my machine and consumes about 100 MB of RAM.
Obviously 24 seconds is too long for a user to wait for their data (there are several sets of 100k, which we switch between), and that much memory per set is way too much, as it doesn't scale to our worst case on an average user's machine.

It's not hard to imagine that many other applications have similar requirements - handle large volumes of data without requiring an insane amount of memory and ages to load.

The classical solution to this problem is pagination - we divide our dataset into segments and execute a query to load the required page.
This solution works pretty well in many cases, but in my particular case, unless the pages are very small, page loading will be visible (and annoying) to the user.

So the problem is this - how do I load data only when it's required, and also allow it to be released when it's not currently being used?
I would also like the solution to be as simple as possible, and require as little management as possible.

So, after thinking about it for a while, I came up with this:
Create an object that holds some way of retrieving the full (non-lazy) record.
Hold a WeakReference in that object, pointing to the actual object.
Expose a property that checks the reference and returns the instance, reloading it if it's no longer there.

The solution looks like this:

class WeakWrapper<WrappedData> where WrappedData : class
{
    long id;
    WeakReference reference;

    public WrappedData Data
    {
        get
        {
            WrappedData result;
            //check if something is set in our reference
            if (reference != null)
            {   //we have - let's see if it is still valid
                result = reference.Target as WrappedData;
                if (result != null)
                    return result; //return the data.
            }
            //no live reference, load our data from the db.
            result = LoadData();
            //set the reference
            reference = new WeakReference(result);
            //return the result
            return result;
        }
    }

    /// <summary>
    /// This method loads the data from an nhibernate session
    /// using the record id.
    /// </summary>
    /// <returns></returns>
    private WrappedData LoadData()
    {
        //open a session
        //(the holder, List and ReleaseSession calls are assumed here)
        ISession session = ActiveRecordMediator
            .GetSessionFactoryHolder()
            .CreateSession(typeof(WrappedData));
        //load all the entities
        IList entities =
            session.CreateCriteria(typeof(WrappedData))
                .Add(Expression.Eq("Id", id))
                .List();
        //release the session
        ActiveRecordMediator.GetSessionFactoryHolder().ReleaseSession(session);

        return (WrappedData)entities[0];
    }
}
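
For anyone who wants to try the idea without NHibernate, here is a minimal sketch of the same wrapper with the database call swapped for a caller-supplied delegate. The names WeakLazy and loader are made up for this sketch and are not part of the implementation above.

```csharp
using System;

// Hypothetical, dependency-free variant of the wrapper:
// the database fetch is replaced by a delegate supplied by the caller.
public class WeakLazy<T> where T : class
{
    private readonly Func<T> loader;   // how to (re)load the full object
    private WeakReference reference;   // weak link to the loaded object

    public WeakLazy(Func<T> loader)
    {
        this.loader = loader;
    }

    public T Value
    {
        get
        {
            // try the weak reference first
            T result = reference != null ? reference.Target as T : null;
            if (result == null)
            {
                // never loaded, or already collected: load again
                result = loader();
                reference = new WeakReference(result);
            }
            return result;
        }
    }
}
```

As long as someone else holds a strong reference to the value, repeated reads return the same instance and the loader is not called again. Once nobody holds it, the garbage collector is free to reclaim it and the next read reloads it.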

This solution works relatively well, but when the DataGrid that displays this data is scrolled down, it takes a noticeable amount of time for the elements to load.
The main reason for this is that the elements are loaded individually.

To avoid loading them individually, we group the calls and then load all the elements in a single statement.
To achieve this I used a nice trick I found on Tomer Shamam's blog.
Tomer implemented a similar mechanism using a custom collection and a caching mechanism.
He used the Dispatcher's prioritized BeginInvoke method to defer the fetch until several calls have been made, and only then load the data.
This works much better than fetching one record at a time.
My version, using the WeakReference mechanism, is implemented via a modified version of the WeakWrapper and a bulk loader that fetches the records via the deferred loading operation.

My particular implementation uses NHibernate and an "int" id.
These can easily be changed to any kind of fetch (in the loader's load method) and any kind of "id".

The modified wrapper looks like this:

/// <summary>
/// A class that holds a weak reference to an entity.
/// It requires an entity that implements IEntity
/// (which contains a single property, int Id).
/// </summary>
/// <typeparam name="EntityType"></typeparam>
public class WeakEntityWrapper<EntityType> : INotifyPropertyChanged
    where EntityType : class, IEntity
{
    private static BulkEntityLoader<EntityType> loader
        = new BulkEntityLoader<EntityType>();
    public event PropertyChangedEventHandler PropertyChanged;
    private WeakReference _objectReference = null;
    private int _objectID;

    /// <summary>
    /// create a new wrapper with the provided id.
    /// </summary>
    /// <param name="objectID"></param>
    public WeakEntityWrapper(int objectID)
    {
        this._objectID = objectID;
    }

    public WeakEntityWrapper(EntityType entity)
    {
        //get the id
        this._objectID = entity.Id;
        //save the reference
        this._objectReference = new WeakReference(entity);
    }

    public EntityType Entity
    {
        get
        {
            //check if we have an initialized reference
            if (_objectReference != null)
            {
                EntityType result =
                    _objectReference.Target as EntityType;
                //check if we have a valid target
                if (result != null)
                    return result; //return the target.
            }
            //ask the loader to fetch us later (assumed call)
            loader.LoadEntity(this);
            return null; //nothing for now, but soon! muahahahah
        }
        set
        {   //the loader got back to us -
            //set the entity and tell whoever is listening
            _objectReference = new WeakReference(value);
            //notify anyone who is listening
            if (PropertyChanged != null)
                PropertyChanged
                    .Invoke(this, new PropertyChangedEventArgs("Entity"));
        }
    }

    /// <summary>
    /// the persistent id of the object.
    /// </summary>
    public int ObjectId
    {
        get { return _objectID; }
        set { _objectID = value; }
    }
}


As you can see, the main difference is that the Load method has been externalized
and that we now have a "set" method (that will be called by the loader).
This method notifies the viewer that it needs to update itself again, and this time it will get a non-null reference.

The BulkEntityLoader uses ActiveRecord and NHibernate and looks like this:

/// <summary>
/// Uses deferred actions to group loading operations.
/// This class is NOT thread safe.
/// It expects to be called by a single (UI) thread.
/// </summary>
/// <typeparam name="EntityType"></typeparam>
class BulkEntityLoader<EntityType>
    where EntityType : class, IEntity
{
    protected static readonly ILog Logger =
        LogManager.GetLogger(typeof(BulkEntityLoader<EntityType>)); //assumed: log4net
    //here we save all the deferred wrappers for later
    private Dictionary<long, WeakEntityWrapper<EntityType>> deferredWrappers
        = new Dictionary<long, WeakEntityWrapper<EntityType>>();

    //flag to mark that the deferred action should only happen once
    private volatile bool _isDeferred = false;

    /// <summary>
    /// mark an entity for future loading.
    /// </summary>
    /// <param name="wrapper"></param>
    public void LoadEntity(WeakEntityWrapper<EntityType> wrapper)
    {
        //check if a deferred action was already set in place
        if (!_isDeferred)
        {
            //mark our flag
            _isDeferred = true;
            //no deferred action, create a new one.
            //(the exact call and its priority are assumed here)
            Dispatcher.CurrentDispatcher.BeginInvoke(
                DispatcherPriority.Background, new Action(LoadEntities));
        }

        //check if the wrapper is in our collection, if not - add it.
        if (!deferredWrappers.ContainsKey(wrapper.ObjectId))
            deferredWrappers.Add(wrapper.ObjectId, wrapper);
    }

    /// <summary>
    /// Load all the entities from the wrapper dictionary.
    /// </summary>
    public void LoadEntities()
    {
        Logger.Debug("Starting deferred action.\n"
            + " deferring " + deferredWrappers.Count + " elements");
        using (new SessionScope(FlushAction.Never))
        {
            //open a session (the holder call is assumed here)
            ISession session = ActiveRecordMediator
                .GetSessionFactoryHolder()
                .CreateSession(typeof(EntityType));
            //load all the entities
            IList entities =
                session.CreateCriteria(typeof(EntityType))
                    .Add(Expression.In("Id", deferredWrappers.Keys))
                    .List();
            //release the session
            ActiveRecordMediator.GetSessionFactoryHolder().ReleaseSession(session);

            foreach (EntityType entity in entities)
            {
                //set the values
                deferredWrappers[entity.Id].Entity = entity;
            }
        }
        //remove all entities that were set
        deferredWrappers.Clear();
        //mark the end of the deferred action.
        _isDeferred = false;
    }
}

So basically we collect all the calls we get until the Dispatcher decides to invoke us.
Once the Dispatcher has time to invoke us, we process all the calls in a single query and set the property for everyone that requested an update.
Once set, the property fires a PropertyChanged event that informs the UI that it needs to reload these properties.
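
The collect-then-flush idea can also be sketched without WPF or NHibernate. In the sketch below, DeferredBatcher, schedule and loadBatch are hypothetical names: schedule stands in for whatever defers the work (a Dispatcher call in the real application) and loadBatch stands in for the single bulk query.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, framework-free sketch of the batching idea.
// Like the loader above, it is NOT thread safe.
public class DeferredBatcher<TKey>
{
    private readonly HashSet<TKey> pending = new HashSet<TKey>();
    private readonly Action<Action> schedule;             // defers an action for later
    private readonly Action<ICollection<TKey>> loadBatch; // loads many keys in one go
    private bool isScheduled;

    public DeferredBatcher(Action<Action> schedule,
                           Action<ICollection<TKey>> loadBatch)
    {
        this.schedule = schedule;
        this.loadBatch = loadBatch;
    }

    // Remember the key; schedule a single flush for the whole group.
    public void Request(TKey key)
    {
        pending.Add(key); // duplicates are ignored by the set
        if (!isScheduled)
        {
            isScheduled = true;
            schedule(Flush);
        }
    }

    // Runs when the scheduler gets around to us: one load for all keys.
    private void Flush()
    {
        List<TKey> batch = new List<TKey>(pending);
        pending.Clear();
        isScheduled = false;
        loadBatch(batch);
    }
}
```

With a scheduler that simply queues actions, four requests for three distinct keys result in one queued action and one batch of three keys when that action runs.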

So now all that remains is loading a list of ids from the database and setting the collection in a DataGrid.

Friday, November 27, 2009

The Strong Type Inheritance Pattern

Generics were originally designed to help avoid very common mistakes when coding with various simple data structures. The classic examples are Lists and HashMaps.
Using the non-generic versions of these objects would result in confusion in the best case, and abuse in the form of multi-type elements in the worst case.
Generics are an elegant solution to this problem - a simple List<MyClass> makes a world of difference in both the readability and the usability of the code. It will save you both a cast and the occasional ClassCastException.

So what else are generics good for?
I think they open up new and interesting design patterns. One of my favorites is the Strong Type Inheritance Pattern (I am unaware of a different name for it).

The Strong Type Inheritance pattern is a simple pattern that allows you to make use of an inheriting type's "Type" to create typesafe methods that would otherwise require more complex, less efficient and more error prone reflection.

The example below shows a simple serializer class that provides serialization to inheriting types in a typesafe way. It also uses the type name as the file name.

/// <summary>
/// This is a save/load from xml facility.
/// It uses the class type as the file name,
/// and so can only persist a single instance of each type.
/// </summary>
/// <typeparam name="InheritingType">
/// The type of a class inheriting from this base class
/// </typeparam>
public abstract class SelfSerializingBase<InheritingType>
    where InheritingType : SelfSerializingBase<InheritingType>
{
    /// <summary>
    /// Load the inheriting type from an xml,
    /// use the type name as the file name
    /// </summary>
    /// <returns></returns>
    public static InheritingType LoadFromXml()
    {
        //create a serializer for the inheriting type
        XmlSerializer serializer =
            new XmlSerializer(typeof(InheritingType));
        //get a stream to load from, and close it when we are done
        using (FileStream stream =
            new FileStream(typeof(InheritingType).Name, FileMode.Open))
        {
            //deserialize and return the result
            return (InheritingType)serializer.Deserialize(stream);
        }
    }

    /// <summary>
    /// Save the current instance to an xml.
    /// Use the type name as the file name.
    /// </summary>
    public void Save()
    {
        //create a serializer for the inheriting type
        XmlSerializer serializer =
            new XmlSerializer(typeof(InheritingType));
        //create the file (overwriting a previous save, if there is one)
        using (FileStream stream =
            new FileStream(typeof(InheritingType).Name, FileMode.Create))
        {
            //serialize ourselves to the file.
            serializer.Serialize(stream, this);
        }
    }
}

/// <summary>
/// A freeloading child that gets everything for free
/// and doesn't do anything itself.
/// The code for the child is even shorter than our summary!
/// :)
/// </summary>
public class SerializingChild : SelfSerializingBase<SerializingChild>
{
    public int SomeProperty { get; set; }
}

static class TestClass
{
    /// <summary>
    /// An example of how we would use the methods from the parent.
    /// </summary>
    public static void TestSerialization()
    {
        //create an instance
        SerializingChild child = new SerializingChild();
        //set our property
        child.SomeProperty = 10;
        //serialize the instance
        child.Save();
        //load the instance
        child = SerializingChild.LoadFromXml();
    }
}


Let's examine the base class definition, which is what this pattern is all about:

public abstract class SelfSerializingBase<InheritingType>
where InheritingType : SelfSerializingBase<InheritingType>

So what we have is a base class that has a seemingly cyclic definition.
It expects a generic type, and it also expects this generic type to be an extender of itself.
The intention here is to say that inheriting types should pass their own type if they wish to use these facilities.
An important part of this pattern is the limiting "where" clause, which requires that the generic type passed to this base class is itself a child inheriting from this base class. This limit is not required by the functionality - this class could easily have been a "serializer" facility that provides load/save functionality - but in our design we wish to mandate that ONLY inheriting types are allowed to have access to this functionality.
The generic magic here is that we get a static method on any extending type which is type-safe and implemented just once.

Ok, so we can use the type of an inheriting class and we force this type to inherit from the base - but what is it really good for?
A good example of this pattern in use is Castle's ActiveRecord. ActiveRecord provides various database access facilities to inheriting classes; in this implementation it is effectively a thin wrapper around NHibernate's functionality (ActiveRecord itself is an interesting and useful pattern for persistent objects).
This is only a partial example, however, because Castle's ActiveRecord source code does not include the "where" clause, which should be a part of the pattern - the purpose of the clause is to inform (and force) the developer using this class that the intention of the framework is that you extend this class.
If a developer misunderstands the design of a base class implementing Strong Type Inheritance, they will get a compiler error, and hopefully read the documentation and change their implementation to comply with the intended design.

One important point we must pay attention to when we use this pattern is that the generic type is expected to be the FINAL type of the entity. If we wish to extend the base class with another class that adds functionality but is still abstract, we must add the same InheritingType to its class signature. Once we set an actual type in the generic definition, we have sealed off all future generations from using the base class's generic facilities.
For example - if we happen to extend SerializingChild with, say, YetAnotherChild, we are exposing a "save" method that can only save the components of a SerializingChild. This is strange and unexpected behavior for a developer using YetAnotherChild, and it will likely lead to abuse and inevitably to an increased cost of maintenance.
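
To make the pitfall concrete, here is a small file-free sketch. All the names in it (SelfDescribingBase, Child, GrandChild, Middle, Leaf) are invented for illustration, and it reports the type name instead of serializing anything.

```csharp
using System;

// Hypothetical stand-in for SelfSerializingBase: same "where" clause,
// but it only reports which type the base class thinks it is working with.
public abstract class SelfDescribingBase<TSelf>
    where TSelf : SelfDescribingBase<TSelf>, new()
{
    public static TSelf Create() { return new TSelf(); }
    public string Describe() { return typeof(TSelf).Name; }
}

// The generic type is fixed to Child here...
public class Child : SelfDescribingBase<Child> { }

// ...so a further descendant still gets the Child facilities.
public class GrandChild : Child { }

// The way around it: keep the intermediate class generic (and abstract)
// and let the final type pass itself in.
public abstract class Middle<TSelf> : SelfDescribingBase<TSelf>
    where TSelf : Middle<TSelf>, new()
{
}

public class Leaf : Middle<Leaf> { }
```

Here new GrandChild().Describe() returns "Child" and GrandChild.Create() hands back a plain Child, while new Leaf().Describe() returns "Leaf".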

Friday, July 31, 2009

Typesafe abstractions with .net generics

The common example for generics in .NET (and Java) is usually typesafe collections and typesafe operations. Safety in the workplace is very important, so it's a good idea to use generic collections whenever possible. However, generics, especially in .NET (Java generics are mostly fake), can be very powerful things to incorporate as part of our design. (Made-up fact - did you know that catching a ClassCastException is the no. 2 cause of head injury among developers? The no. 1 cause is NullPointerException!)

So the first example I want to talk about is using generics to help us define a clear relationship between an abstract base class and its children.

Generally speaking, we use an abstract base class when we have some common responsibilities for all the children, but the base class itself lacks some functionality and cannot be instantiated.

We can use a generic type definition for the abstract base to further refine this relationship and provide the inheriting classes with a type specific method signature they need to implement.
Without generics we'd be stuck with the lowest common denominator (just like on the tonight show).

So here is an example of some abstract base class that does some work, and delegates the type specific work to the children:

public abstract class AbstractProductCreator<SourceType, TargetType>
    where SourceType : Material
    where TargetType : Product
{
    public TargetType CreateProduct(SourceType material)
    {   //let's call this "business logic" :)
        PreprocessMaterial(material);
        //get the product from the inheriting child
        TargetType product = ProcessMaterial(material);
        //do some more "business logic"
        FinalizeProduct(product);
        //return the completed, type specific product!
        return product;
    }

    private void PreprocessMaterial(Material material)
    {
        //umm..let's say we melt it...
    }

    private void FinalizeProduct(Product product)
    {
        //put it in a shiny box?
    }

    //the only method our children will have to implement
    public abstract TargetType ProcessMaterial(SourceType source);
    //without generics this method signature would be:
    //public abstract Product ProcessMaterial(Material material);
}


So our AbstractProductCreator does the general work at the level of abstraction it knows about, hands the material to the child for the type specific part, and then finalizes the product before returning it.

Our child only needs to do what it is supposed to do, and to define its own rule (what gets converted into what).
Here's an example of a famous inheritor of this base class:

class IPhoneCreator : AbstractProductCreator<RainbowsAndBunnies, IPhone>
{
    public override IPhone ProcessMaterial(RainbowsAndBunnies rnb)
    {
        //this is the actual implementation of the IPhone creation process.
        //it's, um.. copyrighted, so don't tell anyone you saw it here..
        return new IPhone(rnb.GetBunnies(), rnb.GetBunnies());
    }
}


Once we define our child, we completely define the scope of our responsibility.
Visual Studio will also generate the skeleton of the method for us if we perform the right ceremony (right click on the abstract class name + implement methods, if I recall correctly).

The downside of this pattern, and it's an unfortunately common pitfall of generics in general, is that once you start defining something in generic terms, it tends to infect your application.
Anyone who uses an instance of our children doesn't have to worry about the generic type definition, but usually some other layer of our application, one that orchestrates a more fundamental mechanism, doesn't really care about the specific type and wants to treat everything in more abstract terms.

This layer may want to have access to the AbstractProductCreator, but can't, because it would have to define a variable that looks something like:

AbstractProductCreator<SomeMaterialType, SomeProductType> creator;

This sort of thing is really very far from what we originally intended.
So what we want is some way of abstracting things for the lower layers of the application into the most general terms they need to know about while maintaining type specific implementations in the higher layers.

So, how do we solve this problem?

There's a nice method I came up with after thinking about this problem for a few days. (Challenge: without reading on, try to think of a solution. Code it (the first few attempts are likely to reach some dead end). If it compiles and runs, let me know how long it took you to solve it.)

The solution:
Our abstract base class can already invoke the typesafe methods of its children. Now we only need some way of invoking the abstract base without all the generic additions.
To do that we add a new interface!

public interface ProductMaker
{
    Product MakeProduct(Material material);
}

//Our base class can implement this interface and

public abstract class AbstractProductCreator<SourceType, TargetType> : ProductMaker
    where SourceType : Material
    where TargetType : Product
{
    public Product MakeProduct(Material material)
    {
        //make sure someone is not trying to hand us the wrong kind of material.
        if (!material.GetType().Equals(typeof(SourceType)))
            throw new Exception("Type " + material.GetType() + " not supported by " + this.GetType());
        //the only cast ugliness we introduce.
        return this.CreateProduct((SourceType)material);
    }

    public TargetType CreateProduct(SourceType material)
    {
        //same as before..
    }
}


Now the lower layers can refer to the abstract base through the interface, and not worry about the Generic type definitions at all!
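
Here is a compact, self-contained sketch of how the pieces fit together from the point of view of a lower layer. Wood, Chair and ChairCreator are hypothetical types invented for the example, and the pre/post-processing steps are left out. Unlike the exact type check in the snippet above, this sketch uses "as", so subclasses of the expected material are accepted too.

```csharp
using System;

public class Material { }
public class Product { }
public class Wood : Material { }   // hypothetical material
public class Chair : Product { }   // hypothetical product

public interface ProductMaker
{
    Product MakeProduct(Material material);
}

public abstract class AbstractProductCreator<SourceType, TargetType> : ProductMaker
    where SourceType : Material
    where TargetType : Product
{
    // Non-generic entry point for layers that only know Material/Product.
    public Product MakeProduct(Material material)
    {
        SourceType typed = material as SourceType;
        if (typed == null)
            throw new ArgumentException(
                "Expected a material of type " + typeof(SourceType).Name);
        return CreateProduct(typed);
    }

    // Type specific entry point for callers that know the concrete types.
    public TargetType CreateProduct(SourceType material)
    {
        return ProcessMaterial(material);
    }

    public abstract TargetType ProcessMaterial(SourceType source);
}

public class ChairCreator : AbstractProductCreator<Wood, Chair>
{
    public override Chair ProcessMaterial(Wood wood)
    {
        return new Chair();
    }
}
```

A lower layer can now hold a plain ProductMaker (for example `ProductMaker maker = new ChairCreator();`) and call `maker.MakeProduct(new Wood())` without ever mentioning the generic parameters, while code that knows the concrete creator still gets a Chair back from CreateProduct without a cast.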



Without generics we'd be lost. No generics means we can either do the type specific implementation and lose the abstraction, or use the abstraction, but then no one can use the type specific methods and all the inheriting classes have to do the casts themselves.

I hope you enjoyed this article. Any comments/questions/criticism/praise/award nominations are welcome as usual.

In the next episode I'll give an example of how to extend this concept into constructing a binding layer that allows both the lower and upper layers to be separate and type specific.
I will also be exposing the manufacturing process of the MacBook Air.

Tuesday, July 28, 2009

Thinking outside the checkbox

I was talking to Dan, a good friend of mine, and he was telling me a horror story about purchasing a shiny new netbook. He was overall very pleased with the hardware: the size, noise level, price and performance all suited him perfectly for his application - a battery-backed web server he can put in a closet and forget about. The problem was with the operating system, or rather the flood of popups, warnings, wizards and welcome messages that took a good 15 minutes to clear.

You see, Dan is by no means a novice when it comes to using computers. He spent a few years doing system administration and is one of the best developers I know. But he has spent the last couple of years using a MacBook, and it has probably been longer than that since his last Windows installation. He has long forgotten what it feels like to configure a Windows machine after it has been installed.

When you install Office 2007 there is some live plugin that can be installed. If it is installed, you are faced with a welcome screen. You have a checkbox you can check if you don't want to see this message again. Then you have two options - click OK, which supposedly also takes you through some configuration, or click Cancel.

What do you do?

If I check the checkbox and cancel, will it show up next time? If I click OK, do I have to go through another annoying popup?
In my case it turns out neither option works to disable this popup, because a bug in the installation process failed to create the registry entry for this plugin. Since the checkbox state was never saved, it kept popping up no matter what combination of checkbox + OK/Cancel I used.

I am a great believer in keeping things simple. It doesn't matter if you are designing a framework, designing your UI or writing a document. Things should be as simple as possible.
Solving a complicated problem in a complicated way is easy. Solving a simple problem in a complicated way is easier. As a common rephrasing* of H. L. Mencken's quote goes - "Complex problems have simple, easy to understand, wrong answers". This is why I try to keep things as simple as possible. The challenge is to keep the simple things simple and the complex things as simple as possible.

Why am I telling you all this? Because I think that it's the lazy and indecisive designer that leaves all the options open and visible. I think they are lazy because they didn't take the time to think about how they can simplify things down to their core essentials. I think they are indecisive because they make someone else choose instead of researching and thinking about the problem enough to reach a decision themselves.

If I were designing an application with a "welcome" or "splash" or any other annoying popup screen, and 99% of the users would - as their very first experience with my application - click on a checkbox that means "stop annoying me" and cancel, then I didn't go that extra mile. Or let's be honest, that extra inch. I think that if every developer who creates such a page received a dollar whenever someone actually read the welcome screen, and lost one whenever someone checked the box and cancelled, we'd have some very poor developers and very few of these messages.

I think the key is to design for simplicity. Try not to pop up too many windows, and make as many decisions as you can for your users. Use everything at your disposal to limit the information presented to the user to the minimum that is relevant to the particular view/function of the frame they are looking at. If a view deals with more than one "topic" or "issue", it should probably be separated. I also think one must always consider the limited nature of a checkbox before using it as the means to answer a question.
Oh, and try to avoid the "we'll make it floating/detaching/docking/toolbox" type solutions - these are usually the lazy, indecisive type solutions. Most users will never change your default setup. They don't care that they can configure and change almost anything. They just want to use it without spending hours configuring it to be usable.

Another thing I think desktop application developers should do is take some notes from the online world. These days many websites track very carefully how users get to their site, where they go and where they exit. They follow every user's click path through the website, and they track ad clicks and any other statistic they can get. This information is easy to get and can be very useful.

In the offline world this sort of thing is a bit more complicated. I did a quick search to see if there are any common frameworks or any common methodology for tracking this sort of thing, and I couldn't find one.
Wrapping all your controls in UI usage statistics gathering can be a bit of a hassle, but I would expect the leading providers of UI controls to pick up the gauntlet and implement such mechanisms, along with their analysis tools. Frameworks that help create clean and usable UI will add great value for application developers.
By the way, I think developers of desktop applications should use their QA and beta testers to gather this sort of statistics - don't ask your users. It annoys them, and it makes them wonder how exactly you will be tracking them.

I'd like to add that usage tracking is to UI usability as profiling is to performance - it helps you solve problems and improve the situation, but it only solves a problem you already created.

As for Dan, well - he did what he had planned from the start - he installed Ubuntu. You can visit his netbook here.

* - the actual quote is "There is always an easy solution to every human problem—neat, plausible, and wrong."

Tuesday, March 31, 2009

Blast From The Past

A long long time ago, in the days before the Weird World Web. In a time of BBS’s and modems. A time when viruses roamed every floppy disk and computers were IBM Compatible.
In that faraway past just before the last decade of the 20th century a game was created by Activision.

This game was far ahead of its time. So far, in fact, that since its creation it was never again duplicated. Many tried, none succeeded.

I am speaking of course about Death Track (cue insane laugh).

You may find yourself thinking: "Death Track?... Death... Track? Really? I mean Death and Track?? "
Yes. Really!

I know the name is pretty lame (lame is also pretty lame)* but the game is not. You see, Death Track had everything. It had cars with cool and exciting weapons, it had contract killing and it had upgrades.

Ok, let me explain:
Death Track was the first Racing+Shooting game I ever played. I have played every racing+shooting game I could get ever since. No game was ever like Death Track. It may just be the nostalgia talking, but I really think that, at least until this point in time, it has never been surpassed in its genre.

Many games since have added some element of weaponry or destruction, but no racing game I know of was solely dedicated to killing off your opponents. To facilitate this the game offered a wide array of weapons:
Machine guns, Lasers, Mines, Missiles, Ram Spikes and more. All the weapon systems in the game were upgradable. This was a very important element in the gameplay. You always wanted more: better weapons, better armor, a faster engine. There were plenty of things you wanted to upgrade. Some helped you make money by winning races, while others helped you win races by killing off the competition.

Of course, the competition didn't wait around for you to kill it. They would go after you as well. The game also featured contract eliminations that won you large sums of money if you succeeded. These were substantial enough to make it worth focusing on killing your target even if you didn't get 1st prize for winning the race.

I think that the upgrade elements were an important part of what made it so addictive. The game managed to always keep you wanting more, and you got to feel a real sense of achievement when you got that next upgrade. And it paid off too. There is nothing more satisfying than blowing up your nemesis with your new tracking missiles.

I wonder why this has never been done since. There are plenty of racing games around, and plenty of racing game engines, many of which feature damage systems ranging from arcade to ultra-realistic. I would really love to see an arcade-style, Need For Speed based game combining weapons, upgrades and total destruction.

On that note, I would like to say that many of the console racing games have this element of certain upgrades opening up only after you have played part of the game. I think this is really the wrong way to keep players interested. Money is a great way to achieve this: it's clear, and it's a simple way to give you multiple alternatives. Feeling that the only thing stopping you from getting that next upgrade is X amount of $ is a whole other thing than knowing you need to complete the next stage or circuit. The fact that you have to compromise and think about what to upgrade adds certain elements of longer term strategy.

In many games you don't even know what's available. Give me the option to buy things and make the purchase worthwhile. Let me feel I get an edge, a sort of "cheating" advantage over my opponents that really makes a difference in the game. The best way to keep gamers interested is by dangling the possibility right in front of them, and allowing them to choose how they get it.
If I want to play that same level 10 times on an easier setting to get the money for the upgrade I want - give me that option.

It's now twenty years since Death Track came out, and it seems like I should maybe let go of my dream of playing a game like that again with modern hardware, engine and graphics. But if there ever was an industry that keeps recycling the classics, it's the gaming industry.

My only fear is that it comes out exclusively for PS3...


Mmmm.... I have just been informed that I am an idiot. It turns out that the very link I added contains the link to the new version of Death Track!
I am downloading the demo now. Thanks Dan, I can't tell you how exciting it is to finally play Death Track with modern graphics.

* - That's a triple lame sentence. This is the first time such a feat has been attempted. Do not try this at home.

Monday, March 30, 2009

Trivial Statistics

So we got the new Scene It? Box Office Smash for the Xbox 360.
We really like trivia games and we especially enjoy the Scene It? series (we also have Lights, Camera, Action).
We like the subject, but much more than that we enjoy the presentation: questions like anagrams, pictograms, skewed images that gradually clear, sound bites that require you to pay attention to the details, etc. All in all it's a well made game with original question presentation. It also has the advantage of being one of the too few 4 player games (on the same Xbox).

However, there is a serious downside to the game and it has to do with the way questions are selected.
In Scene It? Lights, Camera, Action there were 1,800 new questions (for some reason this information is hard to track down for Box Office Smash). Unfortunately we never got to experience all these questions, because relatively quickly we started noticing questions repeating. This got me thinking: there are 1,800 questions, and there's no way we played long enough to see 1,800 questions, so why are we seeing repeats?

The cause, as it turns out, is statistics. Specifically, a phenomenon similar to the birthday paradox.
Quick review for those who are not familiar with the birthday paradox and can't be bothered to read the wiki page:
If you sit in a room with a group of 23 randomly selected people (like a classroom), there's a 50 percent chance that 2 people in the room have the same birthday.
For 57 people there's a >99% chance that two people have the same birthday.
To get the general idea of why your intuition (that this sounds wrong) is wrong, think of the 2 kids in your elementary school or high school class who had the same birthday.
Oh, and think of how many people you know who also had 2 kids with the same birthday in their class (hint: you went to school with them :) )*.
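For anyone who would rather check the 23 and 57 figures than take them on faith, here is a minimal sketch. It assumes 365 equally likely birthdays and ignores leap years; the class and method names are made up for illustration.

```csharp
using System;

public static class BirthdayParadox
{
    // Probability that at least two of `people` share a birthday,
    // assuming `days` equally likely birthdays and independent people.
    public static double SharedBirthdayProbability(int people, int days = 365)
    {
        if (people > days) return 1.0; // pigeonhole: a match is certain

        double allDistinct = 1.0;
        for (int k = 0; k < people; k++)
        {
            // The (k+1)-th person must avoid the k birthdays already taken.
            allDistinct *= (double)(days - k) / days;
        }
        return 1.0 - allDistinct;
    }
}
```

Calling SharedBirthdayProbability(23) gives a value just over 0.5, and SharedBirthdayProbability(57) gives a value just over 0.99.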

How is this the same for a game that randomly selects questions?
Without going into the math too much, let's assume you played the game for a little while and you saw 180 questions (10%).
At this point, the probability that the next question is one you already answered is ~1/10. This is already high enough to be discouraging. Worse yet, by the time you reach 180 questions you can expect ~8.7 of them to have been repeats (because of the birthday paradox)!
Now let's assume you played long enough to answer 600 (~33%) questions in the game. Out of the 600 questions ~90(!!) are expected to be repeats. Needless to say, once you have seen a third of the questions (you need to answer more than 600 questions for that), one in every three questions will be one that you saw already.
At 1/10 it's annoying. At 1/3 the game is unplayable. So everybody loses. You enjoy only about a third of the value you thought you were getting from the game, and the game developer works very hard to create 1,200 more questions that you will never see.
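The repeat counts can be estimated with a short calculation. The sketch below assumes the game picks every question independently and uniformly from the whole pool (I don't know how the real game selects them); the names are hypothetical. Each question is missed by a single pick with probability 1 - 1/N, so after n picks the expected number of distinct questions seen is N * (1 - (1 - 1/N)^n), and everything beyond that was a repeat.

```csharp
using System;

public static class RepeatEstimate
{
    // Expected number of distinct questions seen after `asked` independent,
    // uniformly random picks from a pool of `poolSize` questions.
    public static double ExpectedDistinct(int poolSize, int asked)
    {
        // Chance that one particular question was never picked in `asked` tries.
        double neverPicked = Math.Pow(1.0 - 1.0 / poolSize, asked);
        return poolSize * (1.0 - neverPicked);
    }

    // Expected number of picks that showed a question seen before.
    public static double ExpectedRepeats(int poolSize, int asked)
    {
        return asked - ExpectedDistinct(poolSize, asked);
    }
}
```

Under this model, a pool of 1,800 gives roughly 8.7 expected repeats after 180 questions and roughly 90 after 600.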

The good news is that there is a pretty simple solution for this problem that works out well for everyone.
You see, the problem stems from the method of randomly selecting the questions. Fortunately there is a different method that guarantees both randomly selected questions and no repetitions. Instead of "rolling the dice" every time we need to select a question, we select a permutation of the sequence of questions (in simple terms - we randomly rearrange the sequence of questions) and save this permutation. This guarantees that you will see all 1,800 questions before you see any question a second time.
Since 1,800 is a decent number of questions, there's a good chance that you will have forgotten the earlier questions by the time the sequence starts over. But even if you don't - at least you get to enjoy all of the questions in the game.
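The permutation approach can be sketched in a few lines. This is only an illustration, not how any actual game does it: QuestionDeck is a made-up name, questions are represented by their index, and saving the permutation between sessions (the order array and the position) is left out.

```csharp
using System;

// Hypothetical helper: hands out question indices in random order,
// with no repeats until every question has been shown once.
public class QuestionDeck
{
    private readonly int[] order;    // the saved permutation of question indices
    private int position;            // how far into the permutation we are
    private readonly Random random;

    public QuestionDeck(int questionCount, Random random)
    {
        if (questionCount <= 0) throw new ArgumentOutOfRangeException("questionCount");
        if (random == null) throw new ArgumentNullException("random");

        this.random = random;
        order = new int[questionCount];
        for (int i = 0; i < questionCount; i++) order[i] = i;
        Shuffle();
    }

    // Fisher-Yates shuffle: every permutation is equally likely.
    private void Shuffle()
    {
        for (int i = order.Length - 1; i > 0; i--)
        {
            int j = random.Next(i + 1); // 0 <= j <= i
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        position = 0;
    }

    // Questions left before the deck starts over.
    public int Remaining
    {
        get { return order.Length - position; }
    }

    // Returns the next question index; reshuffles only after a full pass.
    public int Next()
    {
        if (position == order.Length) Shuffle();
        return order[position++];
    }
}
```

With new QuestionDeck(1800, new Random()), the first 1,800 calls to Next() return every index exactly once. The Remaining property is the kind of counter a game could use to tell you that you are running low on fresh questions.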

BTW: one feature listing for the game mentions a "minimal repeat" feature, which keeps track of questions answered to minimize repeats.
I don't know how that feature is implemented, but I can tell you from experience that it doesn't seem to work very well, as we saw plenty of repeats. Besides, there's no reason the number of repeats should be anything other than zero.

Oh, and as for the issues of online play, playing against different players, etc.: there are solutions to all of these problems (generate a permutation of the subgroup of questions that is unseen on every player's list; you can also add a timestamp to each question to ensure a 'long' time between repeats). I know it's not ideal and I know that there are details to work out. But I believe that this is a core issue for any trivia game, and a great game needs to give the very best possible solution.
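One way the multi-player idea could look, as a rough sketch: everything here (the GroupQuestionPicker name, storing each player's history as a dictionary from question index to the time it was last shown) is an assumption made for illustration. Picking at random from the questions nobody in the group has seen is equivalent to walking a random permutation of that unseen subgroup; when nothing unseen is left, the timestamps choose the question that has gone longest without being shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: picks questions for a group of players so that
// nobody in the group gets a repeat while unseen questions remain.
public class GroupQuestionPicker
{
    private readonly int questionCount;
    private readonly Random random;

    public GroupQuestionPicker(int questionCount, Random random)
    {
        if (questionCount <= 0) throw new ArgumentOutOfRangeException("questionCount");
        this.questionCount = questionCount;
        this.random = random;
    }

    // Each player's history maps a question index to the time it was last shown.
    public int Pick(IList<Dictionary<int, DateTime>> histories, DateTime now)
    {
        List<int> unseen = new List<int>();
        int stalest = 0;
        DateTime stalestTime = DateTime.MaxValue;

        for (int q = 0; q < questionCount; q++)
        {
            // Most recent time anyone in the group saw question q (MinValue = never).
            DateTime lastSeen = DateTime.MinValue;
            foreach (Dictionary<int, DateTime> history in histories)
            {
                DateTime t;
                if (history.TryGetValue(q, out t) && t > lastSeen) lastSeen = t;
            }

            if (lastSeen == DateTime.MinValue)
            {
                unseen.Add(q);
            }
            else if (lastSeen < stalestTime)
            {
                stalest = q;
                stalestTime = lastSeen;
            }
        }

        // Prefer a random question nobody has seen; otherwise fall back to
        // the one whose latest appearance is furthest in the past.
        int choice = unseen.Count > 0 ? unseen[random.Next(unseen.Count)] : stalest;

        // Record the question as seen by everyone in the group.
        foreach (Dictionary<int, DateTime> history in histories) history[choice] = now;
        return choice;
    }
}
```

Scanning every question on every pick is fine for a pool of a couple of thousand questions; a real implementation would probably keep the unseen set around instead of rebuilding it.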

One final point, and I may be way off on this one. I think if I knew that I was running out of questions, or if there was some way of indicating to me that I had seen most of what's available, it would encourage me to get the question packs. It might be nice if this was done automatically (you only have 200 questions left, why not try the XYZ expansion?).

* - yes, yes, I know *technically* it's a statistical lie. But what better way to fight faulty statistical intuition than faulty statistical intuition?

Monday, March 23, 2009

Amazing!!! The Future of cell phones is here now!!!!

Check this out, it’s this amazing invention! The MODU!
It’s a cell phone, right?

And, get this – it comes in different shapes!!!!

It's so cool and innovative!!!!
I mean, think of the possibilities: you can have like, a yellow slider phone, or like an angular little phone, or this tiny thing with no numbers on it, or like a crazy psychedelic phone with like really small buttons!!!

They will make millions! Billions even! Of possibly dollars!
And you will never believe the specs! It's 2.5 Gen (like the amazing non 3g version of the iphone 1.0)
It has Bluetooth (like the iphone)
It doesn’t have a built in GPS (like the iphone non-3g)
And it has 2 Gig of flash (less (thus lighter) than the cheapest version of the first iphone)!!!
You know, it amazingly also costs like half of what the iphone costs, plus you get 2 jackets! So it’s like 4 times better! Actually you can use the modu without the jacket, so it's like 6 times better! So it's even mathematically proven to be superior! With numbers!

And if that’s not enough, they have a touch screen. But unlike the silly iphone, it’s not going to do things when you touch the screen, only when you touch the buttons. I mean, the iphone hardly HAS any buttons, the modu has 7 buttons on the tiny version alone! It’s like, obviously if you want to do something you click the BUTTONS.

To show how amazing and cool the modu is, I have made this table of comparison.
Using this table you can clearly see the superiority of the Modu!

Amazing Modu vs. Silly Iphone, compared on:
Guinness world records
Waste battery on WiFi
Waste battery on GPS
Can become a car radio
Total awesomeness

* - math is for silly iphone geeks.

Oh, full disclosure - I have one of those silly iphone things.

Thursday, March 19, 2009

War of the Worlds

H.G. Wells got the title right. Indeed there is a war going on. A War between Worlds. A war between the world of man and the aliens that live amongst us.

This is the story of one man’s struggle in a battle against the odds to defeat the invaders from another world.

It all started years ago when I was living in an apartment not very far from the botanical gardens in Jerusalem. I suspected nothing when I moved in. There was no sign. No declaration of war. No warning shot. They always came at night, just as I was getting to sleep. They waited, sticking to the walls, biding their time. Suddenly – bang! One of them would create a diversion. I did not realize this at the time, but they wanted to get me out of the protection of the covers. They wanted me to bring back the light so they could mark their target.

As soon as the light came on they would start flying around it, circling it in widening and narrowing circles. Looking for an opening to strike. When the light was off again they would send their kamikaze fighters to crash into me as I was trying to hide. There was nothing I could do, they were too many.

I decided to beat them at their own game, they will not outwit me with their mind games. I will prepare a trap for them! I opened a window, and lured them out with the porch light. As soon as they got out I would shut the window and turn off the light.

This worked for a while, but they finally figured out my little ruse. Instead of all going at once they would send a scout. If he got trapped they would immediately counter attack, smashing into everything!

Then I tried keeping the window shut all the time. They really didn’t like that. I was sitting in my room one day, minding my own business, when suddenly I was hit in the back of the head. In the middle of the day no less!

I thought maybe I should try and reason with them: “Try to see it my way” I said, “Do I have to keep talking ‘till I can’t go on?”

They circled the lamp to show their agreement, so I continued:

“Think of what I’m saying, We can work it out, and get it straight or say good night”.

They obviously didn’t like the idea of night, as one of them did a flyby as a warning. I figured they must be angry about something. Perhaps because they were only here for a short time. Maybe they feared night time as every night brought them closer to their inevitable demise. I thought it was worth a shot, so I looked them straight in the eyes and in a calm and direct manner I told them:

“Life is very short, and there’s no time for fussing and fighting my friend”.

“This has to stop”. I was shocked! They can talk! Moths can talk!

“What do you mean?” I said hesitantly.

“It’s bordering on copyright infringement”.

“Huh?” I was baffled.

“The Beatles song. Listen, we’re from the RIAA”.

“The RIAA??”

“We are here to monitor file-sharing activity”.

“Shouldn’t you be monitoring internet traffic then?” For some reason this seemed like the proper response.

“We tried that for a while, but they told us we should get out more, do some field work”.

“Field work?” This was getting stranger by the minute.

“Packet sniffing”.

“Packet sniffing?”

“We have a very keen sense of smell, we are very small, and we can get into many places. Naturally, we are the best candidates for the job”. They sounded very proud.

"I see." I really didn't. "Well, why here then?"

"We heard reports about music coming from this location"

"Mmmmm... OK, but music is not illegal, right?"

"You are replicating copyrighted materials"

"No I am not, I am just listening to music!"

"So you admit it! The analog replication of digital music is prohibited under copyright law"

"I am pretty sure listening to music is protected under fair use"

"Many do. They are wrong."

"So you're saying I am not allowed to listen to music?"

"Under U.S. copyright law making ten copies or more is considered a felony"

"So I can listen to a song ten times?"

"Technically nine, ten would be a felony."

"And this doesn't seem ridiculous to you?"

"We are talking Moths working for the RIAA, what do you think?"

Tuesday, March 17, 2009

On the importance of Garbage Collection

Update (21/10/2011):
It's been more than 2 years and it seems Infragistics has done nothing about this. They made a very idiotic design decision to sacrifice a fundamental behavior of managed code, and then insisted on not fixing it. Reader Jarrett posted an update that extends the fix below for Win7 and Infragistics v10.3 and possibly 11.1 - I've added it to the code below.
Thanks Jarrett!


About two years ago we found a nice UI package for WinForms called NetAdvantage by Infragistics.
A very nice pack, lots of useful controls and not too expensive. Maybe a bit slow, but this is not a problem in most applications.

For a while all was good and well, we built our application and our clients were happy.
Then one day - disaster!
Our application started crashing!
This was after a rather long development cycle where we added support for many new features in the hardware (this was a management application for specialized hardware).
Worse yet, we only started noticing it in QA. The reason was that it only happened after sending configuration to the hardware about 10-15 times without restarting the app - something we almost never do in dev.

Well, when your app crashes after repeated anything, it's very natural to suspect a memory leak. So I carefully watched memory consumption while calling send configuration. Memory consumption was increasing in Task Manager, but this is to be expected in managed environments (the simple explanation is that new memory keeps being allocated for as long as possible before old memory is reclaimed; the actual explanation is not very simple :) ). Since the machine had 2 gigs of memory and the app only grew from about 150 MB to 200 MB before crashing, I figured it was something else.

What I did notice during my exhilarating time with Task Manager was that User Objects were growing rapidly. Being originally of the Java persuasion, I was unfamiliar with these User Objects. Turns out they have to do with window handles and the silly way the Win32 API does UI.
It also turns out that Windows has a limit of about 10,000 such handles per process. That seemed like a lot to me and I didn't understand how we could even approach this limit so fast. Sure, we have a lot of complex UI elements, but still - Firefox was no. 2 in the object count on my machine and only had 600 handles. Our app had ~3,000 just after startup. This didn't seem likely.
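To watch the same counter from inside the app instead of Task Manager, something along these lines could log it after every send configuration. This is a sketch that assumes Windows: it calls the Win32 GetGuiResources function through P/Invoke, and the HandleWatch class name is made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical helper for logging GUI resource usage of the current process.
public static class HandleWatch
{
    private const uint GR_GDIOBJECTS = 0;  // GDI objects (pens, brushes, bitmaps, ...)
    private const uint GR_USEROBJECTS = 1; // USER objects (windows, menus, cursors, ...)

    [DllImport("user32.dll")]
    private static extern uint GetGuiResources(IntPtr hProcess, uint uiFlags);

    // Number of USER objects currently held by this process (Windows only).
    public static uint UserObjectCount()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return GetGuiResources(current.Handle, GR_USEROBJECTS);
        }
    }

    // Number of GDI objects currently held by this process (Windows only).
    public static uint GdiObjectCount()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return GetGuiResources(current.Handle, GR_GDIOBJECTS);
        }
    }
}
```

Logging UserObjectCount() before and after the suspect operation makes a steadily climbing count obvious long before the app gets anywhere near the limit.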

The thing is, a handle leak and a memory leak are very similar in a managed environment. We used the excellent dotTrace to locate the source of our leak. After a few hours of staring at rather complicated object graphs it turned out that the leak came from... well... I guess my opening was a dead giveaway... - Infragistics controls!

Anyone who is a developer in a managed environment is well aware of the crimes associated with keeping references to one's objects after they have gone out of scope. It's very easy to do it by accident: a hash map that wasn't cleared correctly, an errant thread that keeps holding your objects, or a misunderstood relationship with an event handler. All these and more were explored during my search for the leak. None of these was the case with Infragistics.

No - they did it on purpose.
Huh? you say. How, or why, would anyone do this on purpose?
The short answer is "I don't know". The longer answer is that they wanted a mechanism that would allow them to change the look and feel if the user changed the Windows theme while the program was running. To do that they hooked into some theme change events and maintained links to every control. The short answer still holds though, because it's a really bad excuse. As I pointed out to Infragistics, you can easily use WeakReference to completely solve this problem.
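To show what I mean by using WeakReference, here is a minimal sketch. It is not Infragistics code and not their API: WeakThemeNotifier, IThemeListener and CountingListener are hypothetical names, and a plain interface stands in for their static events. The point is only that a static notifier holding weak references can still reach every live control without keeping any of them alive.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical listener interface; a control would implement this.
public interface IThemeListener
{
    void OnThemeChanged();
}

// Hypothetical static notifier that keeps only weak references,
// so registering never prevents a listener from being collected.
public static class WeakThemeNotifier
{
    private static readonly List<WeakReference> listeners = new List<WeakReference>();
    private static readonly object sync = new object();

    public static void Register(IThemeListener listener)
    {
        lock (sync)
        {
            listeners.Add(new WeakReference(listener));
        }
    }

    // Notifies every listener that is still alive and drops dead entries.
    // Returns the number of listeners that were notified.
    public static int RaiseThemeChanged()
    {
        List<IThemeListener> alive = new List<IThemeListener>();
        lock (sync)
        {
            for (int i = listeners.Count - 1; i >= 0; i--)
            {
                IThemeListener target = listeners[i].Target as IThemeListener;
                if (target == null) listeners.RemoveAt(i); // already collected
                else alive.Add(target);
            }
        }

        // Call outside the lock so a listener may register others safely.
        foreach (IThemeListener listener in alive) listener.OnThemeChanged();
        return alive.Count;
    }
}

// Tiny example listener that just counts notifications.
public class CountingListener : IThemeListener
{
    public int Calls;

    public void OnThemeChanged()
    {
        Calls++;
    }
}
```

A listener that is still referenced elsewhere keeps receiving theme changes; one that has gone out of scope is simply dropped from the list the next time the event is raised after the garbage collector has reclaimed it.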

During this ordeal I was talking to Mr. Vince McDonald, the manager of dev support at Infragistics. It took me a very long time, many lengthy emails and an almost religious debate until I finally got him to admit it was indeed a design flaw. He informed me that "..we will not be able to apply any changes for it as part of a hotfix" as "..any such changes would have a very high chance to destabilize our current systems". He also promised that "..changes we may make can only safely be done as part of a volume release".
Well, this sounded reasonable to me so I decided to wait with this post. However, 3 volumes have been released since my original conversation with him and the problem has not been fixed.

Despite a complete lack of support from Infragistics, we were finally able to resolve the issue ourselves. We had to use a rather nasty trick to bypass their mechanisms.
Basically we reflected their objects and forcefully cleaned all the references.
The code to resolve the problem is posted below.
Important note - This fix was prepared for v7.3 and may work with 10.3 (thanks Jarrett). I don't know if it will work for any other version. I suspect it might, and I would appreciate it if you tried it on a different version and let me know whether it worked for you.

using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;
using Infragistics.Win;
using System.Reflection;
using Infragistics.Win.AppStyling;
using System.Diagnostics;
using Snoops.UI;
using System.Collections.Specialized;
using System.Collections;

namespace CleaningServices
{
    ///  THIS HACK WORKS ONLY WITH Infragistics V7.3
    ///  AND POSSIBLY 10.3 (Thanks Jarrett)
    /// <summary>
    ///  ReleaserControl is a special control designed to rid our code of the ugliness introduced by using the Infragistics UI Package.
    ///  Unfortunately someone in Infragistics made the "Design Decision" (according to their support personnel) to turn all their UI objects into unmanaged objects.
    ///  They achieved this by strong binding every control to static events held by static classes deep within the infrastructure.
    ///  As a result, if a UI class holds an Infragistics control within its Controls list, it will never get garbage collected.
    ///  Infragistics controls NEVER get garbage collected, unless Dispose is explicitly called on the control (the Dispose method releases the event binding).
    ///  WTF??? you say. Well, yes. Since every control holds a strong reference to Parent, and the control itself is held
    ///  by a static event handler, your controls will not get collected when they get out of scope, if they hold an Infragistics control.
    ///  This will naturally cause a memory leak. This in itself is not the worst of it. Since all controls hold a Windows Handle,
    ///  your app will crash with the lovely "Cannot Create Window Handle" exception somewhere around 10,000 objects.
    ///  So, how did we solve this problem? Well, we didn't REALLY solve it. We did a hack.
    ///  This seems to work for our current usage of the package, and I did not notice any unexpected behavior in the UI controls.
    ///  The limitation of this fix is that it was created for the Infragistics package v7.3 and might not work for any other version.
    ///  That's the way hacks go I'm afraid..
    ///  What does our hack do? Well, it forces unregistration of Infragistics controls from all the static places we could find them binding to.
    ///  Calling InfragisticsCleaner.ClearEventBinding(); causes all events for all controls to be unregistered, allowing them to get GCed as they should.
    ///  The price we pay is that some Windows events will not be handled this way (like changing themes), but we are willing to live with that.
    ///  There might be other implications.. we don't know.
    ///  For ease of use we added the ReleaserControl that extends UserControl. All it does is add an event handler
    ///  which calls the InfragisticsCleaner.ClearEventBinding(); method whenever a control is added.
    ///  Extend this class in your user control to allow your controls to get GCed again.
    ///  A once in a while thread calling the ClearEventBinding method would probably work just as well, but it might not get called
    ///  exactly when you need it...
    /// </summary>
    public class ReleaserControl : UserControl
    {
        public ReleaserControl() : base()
        {
            this.ControlAdded += new ControlEventHandler(ReleaserControl_ControlAdded);
        }

        //unregister the static event bindings whenever a child control is added.
        void ReleaserControl_ControlAdded(object sender, ControlEventArgs e)
        {
            InfragisticsCleaner.ClearEventBinding();
        }
    }

    public class InfragisticsCleaner
    {
        //the static delegate fields (and their events) that keep Infragistics controls alive.
        private static StaticPropertyHolder[] properyHolders = new StaticPropertyHolder[]
        {
            new StaticPropertyHolder(typeof(StyleManager), "styleChangedDelegate", "StyleChanged"),
            new StaticPropertyHolder(typeof(Office2007ColorTable), "colorSchemeChanged", "ColorSchemeChanged"),
            new StaticPropertyHolder(typeof(Office2007ColorSchemeChangedNotifier), "colorSchemeChanged", "ColorSchemeChanged"),
            new StaticPropertyHolder(typeof(RoleSelectionUI), "queryComponentRoleDelegate", "QueryComponentRole"),
            new StaticPropertyHolder(typeof(XPThemes), "themeChangedDelegate", "ThemeChanged")
        };

        public static void ClearEventBinding()
        {
            foreach (StaticPropertyHolder holder in properyHolders)
            {
                holder.ClearEvents();
            }
            //presumably the dictionaries added for v10.3 are cleared from here as well.
            AcessibleTextManagerCleaner.CleanDictionaries();
        }
    }

    class AcessibleTextManagerCleaner
    {
        ListDictionary[] dictionaries;
        static AcessibleTextManagerCleaner cleaner;

        //initialize the static cleaner instance.
        static AcessibleTextManagerCleaner()
        {
            String[] propertNames = new String[] { "SubclassList", "ControlList", "EditorList" };
            cleaner = new AcessibleTextManagerCleaner(GetAccesibleTextManagerType(), propertNames);
        }

        public static void CleanDictionaries()
        {
            foreach (ListDictionary dictionary in cleaner.dictionaries)
            {
                //a dictionary may not have been created yet.
                if (dictionary != null)
                {
                    dictionary.Clear();
                }
            }
        }

        private AcessibleTextManagerCleaner(Type accessibleTextManagerType, String[] dictionaryPropertyNames)
        {
            //retrieve the singleton instance.
            PropertyInfo instanceField = accessibleTextManagerType.GetProperty("Instance", BindingFlags.NonPublic | BindingFlags.Static);
            object instance = instanceField.GetValue(null, new Object[0]);
            //get the properties
            dictionaries = new ListDictionary[dictionaryPropertyNames.Length];
            for (int i = 0; i < dictionaryPropertyNames.Length; i++)
            {
                dictionaries[i] = ReflectListDictionary(dictionaryPropertyNames[i], instance, accessibleTextManagerType);
            }
        }

        //get the dictionaries to clear.
        private static ListDictionary ReflectListDictionary(String propertyName, Object instance, Type type)
        {
            PropertyInfo instanceField = type.GetProperty(propertyName, BindingFlags.NonPublic | BindingFlags.Instance);
            ListDictionary list = (ListDictionary)instanceField.GetValue(instance, new Object[0]);
            return list;
        }

        private static Type GetAccesibleTextManagerType()
        {
            //we need the infragistics assembly that holds our type (its a private type so we cant access it directly)
            Assembly infragisticsAssm = Assembly.GetAssembly(typeof(EditorWithMask));
            Type[] types = infragisticsAssm.GetTypes();
            Type accessibleTextManagerType = null;
            //search through the assembly until we get what we want.
            foreach (Type t in types)
            {
                if (t.Name.IndexOf("AccessibleTextManager") != -1)
                {
                    accessibleTextManagerType = t;
                }
            }
            return accessibleTextManagerType;
        }
    }

    class StaticPropertyHolder
    {
        EventInfo eventInfo;
        FieldInfo field;

        public StaticPropertyHolder(Type type, String delegateFieldName, String eventName)
        {
            this.field = type.GetField(delegateFieldName, BindingFlags.Static | BindingFlags.NonPublic);
            eventInfo = type.GetEvent(eventName);
        }

        public void ClearEvents()
        {
            //the field or event may not exist in other versions of the package.
            if (field == null || eventInfo == null)
            {
                return;
            }
            Delegate eventDelegate = field.GetValue(null) as Delegate;
            if (eventDelegate != null)
            {
                Delegate[] invocationList = eventDelegate.GetInvocationList();
                foreach (Delegate del in invocationList)
                {
                    //static event, so there is no target instance.
                    eventInfo.RemoveEventHandler(null, del);
                }
            }
        }
    }
}