develooper Front page | perl.fwp | Postings from April 2002

Fisher-Yates shuffle

From:
Sean M. Burke
Date:
April 4, 2002 05:58
Subject:
Fisher-Yates shuffle
Message ID:
5.1.0.14.1.20020403222542.00a97940@mail.spinn.net
I was looking at the implementation of the FY shuffle that I see in 
Perlfaq4, namely:

     sub fy1 {
         my $deck = shift;  # $deck is a reference to an array
         my $i = @$deck;
         while (--$i) {
             my $j = int rand ($i+1);
             @$deck[$i,$j] = @$deck[$j,$i];
         }
     }

And I thought "guh, that looks a bit off" -- notably, it dies when given an 
empty array.
So I banged out the following, which I'm pretty sure does the same work:

sub fy2 {
    my($deck,$i,$j,$t) = $_[0];
    $i = @$deck || return;
    while(1) {
      $j = int rand($i-- || return);
      $t = $deck->[$i];
      $deck->[$i] = $deck->[$j];
      $deck->[$j] = $t;
    }
}
# Thanks to Silmaril for pointing out that the temp var is faster

And besides fixing the bug with empty arrays, it runs twenty-odd percent 
faster.
That ratio seems to hold true whether the deck is numbers, strings, or objects.

I was quite surprised -- it seems too good to be true.  Am I doing anything 
obviously wrong?


This is the benchmarker code:

use strict;
use Benchmark;
my @x; for(1 .. 1000) { push @x, rand(1_000_000) }

print "Size of deck: ", scalar(@x), "\n";
timethese(10000, {
      'way1' => sub { fy1(\@x) },
      'way2' => sub { fy2(\@x) },
});
splice @x, 100;
print "Size of deck: ", scalar(@x), "\n";
timethese(100000, {
      'way1' => sub { fy1(\@x) },
      'way2' => sub { fy2(\@x) },
});



% perl shuffler3c.pl
Size of deck: 1000
Benchmark: timing 10000 iterations of way1, way2...
       way1: 72 wallclock secs (72.06 usr +  0.00 sys = 72.06 CPU) @ 
138.77/s (n=10000)
       way2: 51 wallclock secs (52.13 usr +  0.00 sys = 52.13 CPU) @ 
191.83/s (n=10000)
Size of deck: 100
Benchmark: timing 100000 iterations of way1, way2...
       way1: 71 wallclock secs (70.85 usr +  0.00 sys = 70.85 CPU) @ 
1411.43/s (n=100000)
       way2: 53 wallclock secs (52.29 usr +  0.00 sys = 52.29 CPU) @ 
1912.41/s (n=100000)

And neither implementation seems to do anything crazy, like going into an 
infinite loop when given a one-element list, nor always returning ('b', 
'a') when given ('a', 'b').


--
Sean M. Burke    http://www.spinn.net/~sburke/




nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About