Randomsort

I often need to randomly sort a file. I’m not aware of any standard bash or GNU command that does this, so I wrote this very short script. There may be a shorter, faster, or more efficient way to do it, but I thought I’d post it in case it helps a Linux newbie trying to accomplish the same task. Put the script in a file (e.g., “randomsort”), make it executable, and then either pipe the input into it (cat file_to_be_randomized | randomsort) or pass the file as an argument (randomsort file_to_be_randomized), and voilà, you’re done.

    #!/usr/bin/perl
    my @array = <>;
    while (@array) {
        my $element = int(rand(@array));
        print $array[$element];
        delete $array[$element];
    }
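For comparison, here is a minimal sketch of the same idea using the `shuffle` function from the core List::Util module. The helper name `randomize_lines` and the sample data are assumptions made up for illustration; they are not part of the script above.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# randomize_lines is a hypothetical helper name used only for this sketch.
# It returns its arguments in a random order.
sub randomize_lines {
    return shuffle(@_);
}

# Made-up sample input; each element stands in for one line of a file.
my @sample = ("one\n", "two\n", "three\n");
print randomize_lines(@sample);
```

To use it as a filter like the script above, the last two lines could be replaced with `print randomize_lines(<>);`, which reads from standard input or from any files named on the command line.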

Feel free to comment if you’ve got an easier solution.

19 comments

  1. Steve Laniel Jan 28

    Note also that because of the way Perl works, you can just do “randomsort file_to_be_randomized”, and the effect is the same as piping.

  2. Martijn Vermaat Jan 28

    Sidenote: your script forgets to eat input on Planet Debian. I guess the <> characters get lost in the RSS pipeline.

  3. Steve Laniel Jan 28

    The <> characters get lost in your RSS feed, too. You probably need to pass through a filter first that will convert all the appropriate metacharacters to their escaped versions.

  4. Steve Laniel Jan 28

    Hmm. So my last comment (and presumably also Martijn’s) was supposed to contain the Perl left-angle-bracket-right-angle-bracket operator. It disappeared, even though I escaped it into ampersand-lt and ampersand-gt (as I presume Martijn did).

  5. Adam Rosi-Kessel Jan 28

    Thanks for the tips. I think I’ve fixed the entry and the comments. Apparently my writeback plugin filters out escaped brackets but not unescaped ones (which it turns into escaped brackets). I’ll fix the plugin at some point; for now, the entry and writebacks are corrected.

  6. Michael Janssen Jan 28

    Use bogosort!

  7. golfer Jan 28

    How about “print sort { rand() > 0.5 } <>;” Not sure Perl guarantees the block is called only once per pair though.

    The same in Ruby: “print ARGF.readlines.sort_by { rand() }”.

  8. Adam Rosi-Kessel Jan 28

    perl -e "print sort { rand() > 0.5 } <>" file_to_be_sorted

    That seems to work, but it only “semi-randomizes” the file: lines move around a little, but a line that was near the end of the input stays near the end of the output.

  9. Chung-chieh Shan Jan 28

    Note that delete doesn’t shorten the array unless the removed element is at the end of the array, so your code spends more time printing (undef) than is necessary. Here is a one-liner:
    perl -e '@a=<>;print splice@a,rand@a,1while@a'

  10. Chung-chieh Shan Jan 28

    … with <> between = and ;

  11. Chung-chieh Shan Jan 28

    … with a left bracket and a right bracket between “with” and “between”

  12. Adam Rosi-Kessel Jan 28

    (I fixed the brackets again)… Don’t know if it truly counts as a ‘one-liner’ with a semicolon in it, but your example is definitely the most efficient and compact I’ve seen so far.

  13. Matt Knecht Jan 28

    $ perldoc -q shuffle

    sub fisher_yates_shuffle {
        my $deck = shift; # $deck is a reference to an array
        my $i = @$deck;
        while ($i--) {
            my $j = int rand ($i+1);
            @$deck[$i,$j] = @$deck[$j,$i];
        }
    }

  14. Leland Johnson Jan 28

    Also, Tie::File treats a file as an array. I’m pretty sure its caching functions don’t really help in this case, but it would still work well for randomizing humongous files.

    perldoc Tie::File # in Perl 5.8+

    http://perldoc.perl.org/Tie/File.html

  15. Ari Pollak Jan 28

    There’s actually a program to do this in Debian already; it’s in the randomize-lines package.

  16. salva Jan 28

    Here you will find a discussion about unsorting in Perl: http://perlmonks.org/index.pl?node_id=450883

  17. Shiar Jan 28

    golfer/adam: that would only move values down. You want to return -1 as well.

    The following should do:

    print sort { int(rand 2)*2 - 1 } <>;

  18. Shiar Jan 28

    print sort { int(rand 2)*2 - 1 } <>;

  19. Adam Rosi-Kessel Jan 28

    Wow, thanks for all the tips. I never would have guessed this would be such a hot topic! If I’ve learned one thing, it’s that my blog comment script doesn’t handle brackets in code very well.
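To make the point in comment 9 concrete, here is a minimal sketch of how `delete` and `splice` differ on arrays, followed by an expanded, more readable form of the splice-based one-liner. The variable names and the `splice_shuffle` helper are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# delete leaves a hole: the array keeps its length unless the
# deleted element was the last one.
my @holes = qw(a b c d);
delete $holes[1];
my $length_after_delete = scalar @holes;    # still 4; $holes[1] is now undef

# splice removes the element and closes the gap.
my @compact = qw(a b c d);
splice(@compact, 1, 1);
my $length_after_splice = scalar @compact;  # 3

# Expanded form of the one-liner: repeatedly pull one randomly
# chosen element out of the list until nothing is left.
# splice_shuffle is a hypothetical name for this sketch.
sub splice_shuffle {
    my @lines = @_;
    my @shuffled;
    while (@lines) {
        # rand(@lines) yields a number in [0, number of elements);
        # int() truncates it to a valid index.
        push @shuffled, splice(@lines, int(rand(@lines)), 1);
    }
    return @shuffled;
}
```

Because `splice` shrinks the array on every pass, the loop runs exactly once per input line, whereas the `delete` version can pick already-emptied slots and print nothing for them.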
