How would you implement the pilloried function in the Daily WTF?

The Daily WTF for 2008-11-28 pillories the following code:

static char *nice_num(long n)
{
    int neg = 0, d = 3;
    char *buffer = prtbuf;
    int bufsize = 20;

    if (n < 0)
    {
        neg = 1;
        n = -n;
    }
    buffer += bufsize;
    *--buffer = '\0';

    do
    {
        *--buffer = '0' + (n % 10);
        n /= 10;
        if (--d == 0)
        {
            d = 3;
            *--buffer = ',';
        }
    }
    while (n);

    if (*buffer == ',') ++buffer;
    if (neg) *--buffer = '-';
    return buffer;
}

How would you write it?

Answers


If you're a seasoned C programmer, you'll realize this code isn't actually that bad. It's relatively straightforward (for C), and it's blazingly fast. It has three problems:

  1. It fails on the edge case of LONG_MIN (-2,147,483,648), since negating this number produces itself in twos-complement
    • It assumes 32-bit integers - for 64-bit longs, a 20-byte buffer is not big enough
    • It's not thread-safe - it uses a global static buffer, so multiple threads calling it at the same time will result in a race condition

Problem #1 is easily solved with a special case. To address #2, I'd separate the code into two functions, one for 32-bit integers and one for 64-bit integers. #3 is a little harder - we have to change the interface to make completely thread-safe.

Here is my solution, based on this code but modified to address these problems:

static int nice_num(char *buffer, size_t len, int32_t n)
{
  int neg = 0, d = 3;
  char buf[16];
  size_t bufsize = sizeof(buf);
  char *pbuf = buf + bufsize;

  if(n < 0)
  {
    if(n == INT32_MIN)
    {
      strncpy(buffer, "-2,147,483,648", len);
      return len <= 14;
    }

    neg = 1;
    n = -n;
  }

  *--pbuf = '\0';

  do
  {
    *--pbuf = '0' + (n % 10);
    n /= 10;
    if(--d == 0)
    {
      d = 3;
      *--pbuf = ',';
    }
  }
  while(n > 0);

  if(*pbuf == ',') ++pbuf;
  if(neg) *--pbuf = '-';

  strncpy(buffer, pbuf, len);
  return len <= strlen(pbuf);
}

Explanation: it creates a local buffer on the stack and then fills that in in the same method as the initial code. Then, it copies it into a parameter passed into the function, making sure not to overflow the buffer. It also has a special case for INT32_MIN. The return value is 0 if the original buffer was large enough, or 1 if the buffer was too small and the resulting string was truncated.


Hmm... I guess I shouldn't admit this, but my int to string routine for an embedded system work in pretty much exactly the same way (but without putting in the commas).

It's not particularly straightforward, but I wouldn't call it a WTF if you're working on a system that you can't use snprintf() on.

The guy who wrote the above probably noted that the printf() family of routines can't do comma grouping, so he came up with his own.

Footnote: there are some libraries where the printf() style formatting does support grouping, but they are not standard. And I know that the posted code doesn't support other locales that group using '.'. But that's hardly a WTF, just a bug possibly.


That's probably pretty close to the way I would write it actually. The only thing I can immediately see that is wrong with the solution is that is doesn't work for LONG_MIN on machines where LONG_MIN is -(LONG_MAX + 1), which is most machines nowadays. I might use localeconv to get the thousands separator instead of assuming comma, and I might more carefully calculate the buffer size, but the algorithm and implementation seem pretty straight-forward to me, not really much of a WTF for C (there are much better solutions for C++).


Lisp:

(defun pretty-number (x) (format t "~:D" x))

I'm suprised how easily I could do this. I'm not even past the first chapter in my Lisp book. xD (Or should I say, ~:D)


size_t
signed_as_text_grouped_on_powers_of_1000(char *s, ssize_t max, int n)
{
    if (max <= 0)
        return 0;

    size_t r=0;
    bool more_groups = n/1000 != 0;
    if (more_groups)
    {
       r = signed_as_text_grouped_on_powers_of_1000(s, max, n/1000);
       r += snprintf(s+r, max-r, ",");
       n = abs(n%1000);
       r += snprintf(s+r, max-r, "%03d",n);
    } else
       r += snprintf(s+r, max-r, "% 3d", n);

    return r;
}

Unfortunately, this is about 10x slower than the original.


In pure C:

#include <stdio.h>
#include <limits.h>

static char *prettyNumber(long num, int base, char separator)
{
#define bufferSize      (sizeof(long) * CHAR_BIT)
        static char buffer[bufferSize + 1];
        unsigned int pos = 0;

        /* We're walking backwards because numbers are right to left. */
        char *p = buffer + bufferSize;
        *p = '\0';

        int negative = num < 0;

        do
        {
                char digit = num % base;
                digit += '0';

                *(--p) = digit;
                ++pos;

                num /= base;

                /* This the last of a digit group? */
                if(pos % 3 == 0)
                {
/* TODO Make this a user setting. */
#ifndef IM_AMERICAN
#       define IM_AMERICAN_BOOL 0
#else
#       define IM_AMERICAN_BOOL 1
#endif
                        /* Handle special thousands case. */
                        if(!IM_AMERICAN_BOOL && pos == 3 && num < base)
                        {
                                /* DO NOTHING */
                        }
                        else
                        {
                                *(--p) = separator;
                        }
                }
        } while(num);

        if(negative)
                *(--p) = '-';

        return p;
#undef bufferSize
}

int main(int argc, char **argv)
{
        while(argc > 1)
        {
                long num = 0;

                if(sscanf(argv[1], "%ld", &num) != 1)
                        continue;

                printf("%ld = %s\n", num, prettyNumber(num, 10, ' '));

                --argc;
                ++argv;
        };

        return 0;
}

Normally I'd return an alloc'd buffer, which would need to be free'd by the user. This addition is trivial.


I got bored and made this naive implementation in Perl. Works.

sub pretify {
    my $num = $_[0];
    my $numstring = sprintf( "%f", $num );

    # Split into whole/decimal
    my ( $whole, $decimal ) = ( $numstring =~ /(^\d*)(.\d+)?/ );
    my @chunks;
    my $output = '';

    # Pad whole into multiples of 3
    $whole = q{ } x ( 3 - ( length $whole ) % 3 ) . $whole;

    # Create an array of all 3 parts.
    @chunks = $whole =~ /(.{3})/g;

    # Reassemble with commas
    $output = join ',', @chunks;
    if ($decimal) {
        $output .= $decimal;
    }

    # Strip Padding ( and spurious commas )
    $output =~ s/^[ ,]+//;

    # Strip excess tailing zeros
    $output =~ s/0+$//;

    # Ending with . is ugly
    $output =~ s/\.$//;
    return $output;
}

print "\n", pretify 100000000000000000000000000.0000;
print "\n", pretify 10_202_030.45;
print "\n", pretify 10_101;
print "\n", pretify 0;
print "\n", pretify 0.1;
print "\n", pretify 0.0001;
print "\n";

Need Your Help

Is there any way to check if an iterator is valid?

c++ performance stl iterator

For two threads manipulating a container map for example, what the correct way to test whether an iterator still valid (for performance reason) ?

Add custom translation Strings to liferay theme

java properties internationalization themes liferay

I want to add some Translations to my Theme to call it with #language("key") in my template.vm files.