Autoincrementing letters in Perl

I do not understand autoincrementing letters in Perl.

This example seems perfectly understandable:

$a = 'bz'; ++$a;
ca #output

b gets incremented to c. There is nothing left for z to go to, so it goes back to a (or at least this is how I see the process).

But then I come across statements like this:

$a = 'Zz'; ++$a;
AAa #output

and:

$a = '9z'; ++$a;
10 #output

Why doesn't incrementing Zz return Aa? And why doesn't incrementing 9z return 0z?

Thanks!

Answers


To quote perlop:

If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^[a-zA-Z]*[0-9]*\z/, the increment is done as a string, preserving each character within its range, with carry.

The ranges are 0-9, A-Z, and a-z. When a new character is needed, it is taken from the range of the first character. Each range is independent; characters never leave the range they started in.

9z does not match the pattern, so it gets a numeric increment. (It probably ought to give an "Argument isn't numeric" warning, but it doesn't in Perl 5.10.1.) Digits are allowed only after all the letters (if any), never before them.

Note that an all-digit string does match the pattern, and does receive a string increment (if it's never been used in a numeric context). However, the result of a string increment on such a string is identical to a numeric increment, except that it has infinite precision and leading zeros (if any) are preserved. (So you can only tell the difference when the number of digits exceeds what an IV or NV can store, or it has leading zeros.)
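A small sketch of these rules, assuming a reasonably recent Perl 5. The variable names are only for illustration, and the `no warnings` block is there because newer perls do warn about '9z':

```perl
use strict;
use warnings;

# Strings matching /^[a-zA-Z]*[0-9]*\z/ get the string increment.
my %after;
for my $start (qw(bz Zz zz Az a9 009)) {
    my $s = $start;          # fresh copy, never used in numeric context
    $s++;
    $after{$start} = $s;
}

# '9z' does not match (a digit comes before a letter), so it is treated as 9.
my $n = '9z';
{
    no warnings 'numeric';   # silence the "isn't numeric" warning newer perls give
    $n++;
}

printf "%-4s -> %s\n", $_, $after{$_} for qw(bz Zz zz Az a9 009);
print "9z   -> $n\n";
```

Note that '009' comes back as '010': the leading zero survives, which a numeric increment would not preserve.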

I don't see why you think Zz should become Aa (unless you're thinking of modular arithmetic, but this isn't). It becomes AAa through this process:

  1. Incrementing z wraps around to a. Increment the previous character.
  2. Incrementing Z wraps around to A. There is no previous character, so add the first one from this range, which is another A.
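The two steps above can be sketched by hand. `magic_succ` is a hypothetical helper written for this answer, not something Perl provides, and it assumes a non-empty input that already matches the pattern:

```perl
use strict;
use warnings;

# Hypothetical re-implementation of the carry rule, for illustration only.
# Assumes $str is non-empty and matches /^[a-zA-Z]*[0-9]*\z/.
sub magic_succ {
    my ($str) = @_;
    my @chars = split //, $str;
    my %wrap  = (z => 'a', Z => 'A', 9 => '0');   # end of range => start of range

    for (my $i = $#chars; $i >= 0; $i--) {
        my $c = $chars[$i];
        if (!exists $wrap{$c}) {
            $chars[$i] = chr(ord($c) + 1);        # room left in the range: no carry
            return join('', @chars);
        }
        $chars[$i] = $wrap{$c};                   # wrap around, carry to the left
    }

    # The carry fell off the left end: add a character from the range of the
    # first character (digits get a '1', since '99' must become '100').
    my %extra = (a => 'a', A => 'A', 0 => '1');
    return $extra{ $chars[0] } . join('', @chars);
}

print magic_succ('bz'), "\n";   # ca
print magic_succ('Zz'), "\n";   # AAa
```

For inputs that match the pattern, this should agree with what `++` does to a string that has never been used as a number.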

The range operator (..), when given two strings (and the left-hand one matches the pattern), uses the string increment to produce a list (this is explained near the end of the perlop section on range operators). The list starts with the left-hand operand, which is then incremented until either:

  1. The value equals the right-hand operand, or
  2. The length of the value exceeds the length of the right-hand operand.

It returns a list of all the values. (If case 2 terminated the list, the final value is not included in it.)
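A short illustration of both stopping conditions (the array names are made up for this example):

```perl
use strict;
use warnings;

my @hit  = ('x' .. 'ab');   # case 1: stops once the value equals 'ab'
my @miss = ('y' .. 'b');    # case 2: after 'z' comes 'aa', which is longer than 'b'

print "@hit\n";    # x y z aa ab
print "@miss\n";   # y z
```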


  1. Because (ignoring case for the moment; case is merely preserved, nothing interesting happens with it), 'AA' is the successor to 'Z', so how could it also be the successor to 'ZZ'? The successor to 'ZZ' is 'AAA'.

  2. Because as far as ++ and all other numeric operators are concerned, "9z" is just a silly way of writing 9, and the successor to 9 is 10. The special string behavior of auto-increment is clearly specified to only occur on strings of letters, or strings of letters followed by numbers (and not mixed in any other way).


The answer is to not do that. The automagic incrementing of ++ with non-numbers is full of nasty pitfalls. It is suitable only for quick hacks.

You are better off writing your own iterator for this sort of thing:

#!/usr/bin/perl

use strict;
use warnings;

{ package StringIter;

    sub new {
        my $class = shift;
        my %self  = @_;
        $self{set}   = ["a" .. "z"] unless exists $self{set};
        $self{value} = -1           unless exists $self{value};
        $self{size}  = @{$self{set}};

        return bless \%self, $class;
    }

    sub increment {
        my $self = shift;
        $self->{value}++;
    }

    sub current {
        my $self = shift;
        my $n    = $self->{value};
        my $size = $self->{size};
        my $s    = "";

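        # Peel off base-$size digits, least significant first.
        while ($n >= $size) {
            my $offset  = $n % $size;
            $s          = $self->{set}[$offset] . $s;
            $n          = int($n / $size);   # integer division; plain / leaves a fraction
        }
        $s = $self->{set}[$n] . $s;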

        return $s;
    }

    sub next {
        my $self = shift;
        $self->increment;
        return $self->current;
    }
}

{
    my $iter = StringIter->new;

    for (1 .. 100) {
        print $iter->next, "\n";
    }
}

{
    my $iter = StringIter->new(set => [0, 1]);

    for (1 .. 7) {
        print $iter->next, "\n";
    }
}

You're asking why increment doesn't wrap around.

If it did it wouldn't really be an increment. To increment means you have a totally ordered set and an element in it and produce the next higher element, so it can never take you back to a lower element. In this case the total ordering is the standard alphabetical ordering of strings (which is only defined on the English alphabet), extended to cope with arbitrary ASCII strings in a way that seems natural for certain common types of identifier strings.

Wrapping would also defeat its purpose: usually you want to use it to generate arbitrarily many different identifiers of some sort.

I agree with Chas Owens's verdict: applying this operation to arbitrary strings is a bad idea; that's not the sort of use it was intended for.

I disagree with his remedy: just pick a simple starting value on which increment behaves sanely, and you'll be fine.
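For instance, a sketch of generating identifiers from a simple starting value (the names `$id`, `@ids`, and `$tag` are only for illustration):

```perl
use strict;
use warnings;

# Start from plain lowercase letters and let ++ do the carrying.
my $id = 'a';
my @ids;
push @ids, $id++ for 1 .. 28;   # post-increment returns the old value
print "@ids[24 .. 27]\n";       # y z aa ab

# Letters followed by digits also behave predictably.
my $tag = 'id08';
$tag++ for 1 .. 2;
print "$tag\n";                 # id10
```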


I don't see why incrementing Zz would return Aa; why do you think it should? As for 9z, it looks like Perl treats it as the number 9 rather than as some kind of base-36 weirdness.


=> For alphanumeric strings that start with a letter, like 'bz' or 'Zz', work from the right. The last character is 'z'. As you say, there is nowhere for 'z' to go, so it wraps around to 'a', and a carry is passed to the next character on the left. So 'b' increments to 'c'. In the second case, 'Z' also wraps around, to 'A', but there is no character to its left to take the carry. In such cases an extra character from the same range is added at the front, which gives 'AAa'.

=> For alphanumeric strings that start with a digit, like '9z', Perl does not apply the string increment. It treats the string as the number it starts with (in this case 9) and increments that number. So 9 becomes 10.

Please correct me if I am wrong.

