Randomsort
I often need to shuffle the lines of a file into random order. I’m not aware of any standard bash or GNU command that does this, so I wrote this very short script. There may be a shorter, faster, or more efficient way, but I thought I’d post it in case it helps a Linux newbie trying to accomplish the same task. Put it in a file (e.g., “randomsort”), make it executable, and either pipe whatever you want to randomize into it (cat file_to_be_randomized | randomsort) or pass the file as an argument (randomsort file_to_be_randomized), and voilà, you’re done.
#!/usr/bin/perl
my @array = <>;
while (@array) {
    my $element = int(rand(@array));
    print $array[$element];
    delete $array[$element];
}
Feel free to comment if you’ve got an easier solution.
Steve Laniel Jan 28
Note also that because of the way Perl works, you can just do “randomsort file_to_be_randomized”, and the effect is the same as piping.
Martijn Vermaat Jan 28
Sidenote: your script forgets to eat input on Planet Debian. I guess the <> characters get lost in the RSS pipeline.
Steve Laniel Jan 28
The <> characters get lost in your RSS feed, too. You probably need to pass through a filter first that will convert all the appropriate metacharacters to their escaped versions.
Steve Laniel Jan 28
Hmm. So my last comment (and presumably also Martijn’s) was supposed to contain the Perl left-angle-bracket-right-angle-bracket operator. It disappeared, even though I escaped it into ampersand-lt and ampersand-gt (as I presume Martijn did).
Adam Rosi-Kessel Jan 28
Thanks for the tips. I think I’ve fixed the entry and the comments. Apparently my writeback plugin filters out the escaped brackets, but not unescaped brackets (which it turns into escaped brackets). I’ll fix it at some point; for now, the entry and writebacks are corrected.
Michael Janssen Jan 28
Use bogosort!
golfer Jan 28
How about “print sort { rand() > 0.5 } <>;”? Not sure Perl guarantees the block is called only once per pair, though.
The same in Ruby: “print ARGF.readlines.sort_by { rand() }”.
Adam Rosi-Kessel Jan 28
perl -e "print sort { rand() > 0.5 } <>" file_to_be_sorted
That seems to work, but only “semi-randomizes” the file: lines move around a little bit, but a line that was near the end of the input stays near the end of the output.
Chung-chieh Shan Jan 28
Note that delete doesn’t shorten the array unless the removed element is at the end of the array, so your code spends more time printing (undef) than is necessary. Here is a one-liner:
perl -e '@a=<>;print splice@a,rand@a,1while@a'
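For anyone who finds the golfed form hard to read, here is the same splice idea spelled out as a minimal sketch. The sub name splice_shuffle and the sample lines are illustrative assumptions, not part of the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper: return the given items in random order by
# repeatedly pulling one randomly chosen element out of a working copy.
sub splice_shuffle {
    my @pool = @_;    # copy, so the caller's list is left untouched
    my @out;
    while (@pool) {
        # Unlike delete, splice removes the element AND shortens the
        # array, so the loop runs exactly once per item.
        push @out, splice(@pool, int(rand(@pool)), 1);
    }
    return @out;
}

# Sample data standing in for: my @lines = <>;
print splice_shuffle(map { "line $_\n" } 1 .. 5);
```

Each splice from the middle of an array has to close the gap it leaves, so this approach is probably best kept for inputs of modest size.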
Chung-chieh Shan Jan 28
… with <> between = and ;
Chung-chieh Shan Jan 28
… with a left bracket and a right bracket between “with” and “between”
Adam Rosi-Kessel Jan 28
(I fixed the brackets again)… Don’t know if it truly counts as a ‘one-liner’ with a semicolon in it, but your example is definitely the most efficient and compact I’ve seen so far.
Matt Knecht Jan 28
$ perldoc -q shuffle
…
sub fisher_yates_shuffle {
    my $deck = shift; # $deck is a reference to an array
    my $i = @$deck;
    while ($i--) {
        my $j = int rand ($i+1);
        @$deck[$i,$j] = @$deck[$j,$i];
    }
}
…
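The FAQ routine shuffles in place and expects an array reference, which is easy to trip over. A minimal usage sketch follows; it repeats the sub so the script is self-contained, and the sample lines are an assumption standing in for real input.

```perl
use strict;
use warnings;

# Same algorithm as the perldoc excerpt: shuffle the array in place.
sub fisher_yates_shuffle {
    my $deck = shift;                # reference to an array
    my $i = @$deck;
    while ($i--) {
        my $j = int rand($i + 1);    # random index in 0..$i
        @$deck[$i, $j] = @$deck[$j, $i];
    }
}

my @lines = map { "line $_\n" } 1 .. 5;    # stand-in for: my @lines = <>;
fisher_yates_shuffle(\@lines);             # pass a reference, not the list
print @lines;
```

Because the swap happens inside the caller's array, nothing is returned; print the array itself afterwards.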
Leland Johnson Jan 28
Also, Tie::File treats a file as an array. I’m pretty sure its caching functions don’t really help in this case, but it would still work well for randomizing humongous files.
perldoc Tie::File # in Perl 5.8+
http://perldoc.perl.org/Tie/File.html
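As a rough sketch of how that might look, the following ties a file to an array and runs a Fisher-Yates pass over it, so the whole file never has to be held in memory. The sub name shuffle_file_in_place is hypothetical, and the file is modified in place, so try it on a copy first.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper: shuffle the lines of the file at $path in place.
sub shuffle_file_in_place {
    my ($path) = @_;
    tie my @lines, 'Tie::File', $path
        or die "Cannot tie $path: $!";
    # Fisher-Yates over the tied array; every swap is written to the file.
    for (my $i = $#lines; $i > 0; $i--) {
        my $j = int rand($i + 1);
        @lines[$i, $j] = @lines[$j, $i];
    }
    untie @lines;
}

# Usage: perl this_script.pl file_to_be_randomized
shuffle_file_in_place($ARGV[0]) if @ARGV;
```

Swapping lines of different lengths makes Tie::File move the data that follows them, so this trades speed for low memory use and may be slow on very large files.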
Ari Pollak Jan 28
There’s actually a program to do this in Debian already; it’s in the randomize-lines package.
salva Jan 28
Here you will find a discussion about unsorting in Perl: http://perlmonks.org/index.pl?node_id=450883
Shiar Jan 28
golfer/adam: that would only move values down. You want to return -1 as well.
The following should do:
print sort { int(rand 2)*2 - 1 } <>;
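With a random comparator, the result depends on how sort happens to be implemented, so it may be worth noting an option that avoids sort entirely: the shuffle function in the core List::Util module. A minimal sketch, with sample lines standing in for real input:

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

my @lines = map { "line $_\n" } 1 .. 5;    # stand-in for: my @lines = <>;
my @shuffled = shuffle(@lines);            # returns a new list; @lines is unchanged
print @shuffled;
```

As a filter, the same call fits on one line: perl -MList::Util=shuffle -e 'print shuffle <>' file_to_be_randomized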
Adam Rosi-Kessel Jan 28
Wow, thanks for all the tips. I never would have guessed this would be such a hot topic! If I’ve learned one thing, it’s that my blog comment script doesn’t handle brackets in code very well.