# Evolution of Solutions - and Perceived Performance

I entered a small competition yesterday and wanted to write a short post describing my progress and findings. The competition was rather simple, write a method that returns an array containing the numbers from 1 to 1000 in a random order.

My first solution was the naive LINQ solution.

1 2 3 4 5 6 7 8 | public static int[] NaiveLinq(int max) { var random = new Random(); var query = from number in Enumerable.Range(1, max) orderby random.Next() select number; return query.ToArray(); } |

I reckoned that several people would post this solution, so I felt like doing something slightly more fancy - and since parallelization is so hot these days, why not do it with PLINQ instead, so it would actually be faster on multi-core systems. The change is really simple.

1 2 3 4 5 6 7 8 | public static int[] PLinq(int max) { var random = new Random(); var query = from number in ParallelEnumerable.Range(1, max) orderby random.Next() select number; return query.ToArray(); } |

I benchmarked it (on my quad-core machine), and saw that the PLINQ solution was indeed faster - but only for rather big solutions, the overhead was simply too big in small instances of the problem (array size < 3000). This is not too well for a competition that is supposed to make arrays of size 1000. However, reluctant to ditch my parallel idea, I made a hybrid solution, which would use regular LINQ for small instances and PLINQ for big instances, based on my benchmark:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 | public static int[] GenerateUsingHybridLinq(int arrayLength) { var random = new Random(); // Use PLINQ if we're above the "very scientific" limit of 3000. if (arrayLength >= 3000) { return (from i in ParallelEnumerable.Range(1, arrayLength) orderby random.Next() select i).ToArray(); } return (from i in Enumerable.Range(1, arrayLength) orderby random.Next() select i).ToArray(); } |

I posted this solution to the competition and decided I had spent enough time on it. However, it bugged me because I knew that the `Random`

class in the .NET framework is not guaranteed to be thread-safe. It also bugged be that I was actually solving a shuffling problem by sorting. Shuffling can be solved in O(n) while sorting is O(n*log(n)). In addition I thought that the LINQ solutions were kind of dull.

So I decided to take a stab at solving it without resolving to sorting. Shuffling is a well known problem, so I implemented the Fisher-Yates algorithm, often used for shuffling cards. It is actually a rather elegant algorithm.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | public static int[] Generate() { Random random = new Random(); var numbers = new int[1000]; // Add sorted numbers to shuffle for (var i = 0; i < 1000; i++) numbers[i] = i + 1; var last = numbers.Length; while (last > 1) { // Select a random entry in the array to swap var swap = random.Next(last); // Decrease relevant end of array last = last - 1; // Swap numbers using XOR swap, we don't need no stinkin' temp variables if (last != swap) { numbers[last] ^= numbers[swap]; numbers[swap] ^= numbers[last]; numbers[last] ^= numbers[swap]; } } return numbers; } |

I adjusted it a bit and finally found a place to make an ode to the awesomeness that is XOR swap, swapping two values without using a temporary variable. Even though the algorithm was faster asymptotically, I was curious how it would venture against the sort-based LINQ solutions performance-wise. Here is the result:

Note that both HLINQ and PLINQ use all 4 cores on my quad-core machine. I realize there‘s an overhead using LINQ, but I’m still impressed how much faster this simple little algorithm is.

I submitted my new solution, with a note to replace my hybrid solution, just before midnight, but unfortunately I think it might have been too late, it didn‘t seem to make the cut for the competition. At least my hybrid solution made it into the final round - and non-thread-safe Random surely won’t be a problem for instances of size 1000 - since it will use regular LINQ for that.