Is there something like a zip() function in PostgreSQL that combines two arrays?

I have two array values of the same length in PostgreSQL:

{a,b,c} and {d,e,f}

and I'd like to combine them into

{{a,d},{b,e},{c,f}}

Is there a way to do that?

Answers


Postgres 9.3 or older
Simple zip()

Consider the following demo for Postgres 9.3 or earlier:

SELECT ARRAY[a,b] AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
         ,unnest('{d,e,f}'::text[]) AS b
    ) x;

Result:

  ab
-------
 {a,d}
 {b,e}
 {c,f}

Note that both arrays must have the same number of elements to unnest in parallel, or you get a cross join instead.

You can wrap this into a function, if you want to:

CREATE OR REPLACE FUNCTION zip(anyarray, anyarray)
  RETURNS SETOF anyarray LANGUAGE SQL AS
$func$
SELECT ARRAY[a,b] FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;
$func$;

Call:

SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);

Same result.

zip() to multi-dimensional array:

Now, if you want to aggregate that new set of arrays into one 2-dimenstional array, it gets more complicated.

SELECT ARRAY (SELECT ...)

or:

SELECT array_agg(ARRAY[a,b]) AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
         ,unnest('{d,e,f}'::text[]) AS b
    ) x

or:

SELECT array_agg(ARRAY[ARRAY[a,b]]) AS ab
FROM  ...

will all result in the same error message (tested with pg 9.1.5):

ERROR: could not find array type for data type text[]

But there is a way around this, as we worked out under this closely related question. Create a custom aggregate function:

CREATE AGGREGATE array_agg_mult (anyarray) (
    SFUNC    = array_cat
   ,STYPE    = anyarray
   ,INITCOND = '{}'
);

And use it like this:

SELECT array_agg_mult(ARRAY[ARRAY[a,b]]) AS ab
FROM  (
   SELECT unnest('{a,b,c}'::text[]) AS a
         ,unnest('{d,e,f}'::text[]) AS b
    ) x

Result:

{{a,d},{b,e},{c,f}}

Note the additional ARRAY[] layer! Without it and just:

SELECT array_agg_mult(ARRAY[a,b]) AS ab
FROM ...

You get:

{a,d,b,e,c,f}

Which may be useful for other purposes.

Roll another function:

CREATE OR REPLACE FUNCTION zip2(anyarray, anyarray)
  RETURNS SETOF anyarray LANGUAGE SQL AS
$func$
SELECT array_agg_mult(ARRAY[ARRAY[a,b]])
FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;
$func$;

Call:

SELECT zip2('{a,b,c}'::text[],'{d,e,f}'::text[]); -- or any other array type

Result:

{{a,d},{b,e},{c,f}}
Postgres 9.4+

Use the ROWS FROM construct or the updated unnest() which takes multiple arrays to unnest in parallel. Each can have a different length. You get (per documentation):

[...] the number of result rows in this case is that of the largest function result, with smaller results padded with null values to match.

Use this cleaner and simpler variant:

SELECT ARRAY[a,b] AS ab
FROM   unnest('{a,b,c}'::text[] 
            , '{d,e,f}'::text[]) x(a,b);
Postgres 9.5+

ships array_agg(array expression):

Function                Argument Type(s)   Return Type
array_agg(expression)   any array type     same as argument data type  

Description
input arrays concatenated into array of one higher dimension
(inputs must all have same dimensionality, and cannot be empty or NULL)

This is a drop-in replacement for my custom aggregate function array_agg_mult()implemented in C which is considerably faster. Use it.


Here's another approach that's safe for arrays of differing lengths, using the array multi-aggregation mentioned by Erwin:

CREATE OR REPLACE FUNCTION zip(array1 anyarray, array2 anyarray) RETURNS text[]
AS $$
SELECT array_agg_mult(ARRAY[ARRAY[array1[i],array2[i]]])
FROM generate_subscripts(
  CASE WHEN array_length(array1,1) >= array_length(array2,1) THEN array1 ELSE array2 END,
  1
) AS subscripts(i)
$$ LANGUAGE sql;

regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);
         zip         
---------------------
 {{a,d},{b,e},{c,f}}
(1 row)


regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f,g}'::text[]);
             zip              
------------------------------
 {{a,d},{b,e},{c,f},{NULL,g}}
(1 row)

regress=> SELECT zip('{a,b,c,z}'::text[],'{d,e,f}'::text[]);
             zip              
------------------------------
 {{a,d},{b,e},{c,f},{z,NULL}}
(1 row)

If you want to chop off the excess rather than null-padding, just change the >= length test to <= instead.

This function does not handle the rather bizarre PostgreSQL feature that arrays may have a stating element other than 1, but in practice nobody actually uses that feature. Eg with a zero-indexed 3-element array:

regress=> SELECT zip('{a,b,c}'::text[], array_fill('z'::text, ARRAY[3], ARRAY[0]));
          zip           
------------------------
 {{a,z},{b,z},{c,NULL}}
(1 row)

wheras Erwin's code does work with such arrays, and even with multi-dimensional arrays (by flattening them) but does not work with arrays of differing length.

Arrays are a bit special in PostgreSQL, they're a little too flexible with multi-dimensional arrays, configurable origin index, etc.

In 9.4 you'll be able to write:

SELECT array_agg_mult(ARRAY[ARRAY[a,b])
FROM unnest(array1) WITH ORDINALITY as (o,a)
NATURAL FULL OUTER JOIN
unnest(array2) WITH ORDINALITY as (o,b);

which will be a lot nicer, especially if an optimisation to scan the functions together rather than doing a sort and join goes in.


Need Your Help

What causes "Neither PUB key nor PRIV key:: nested asn1 error" when building a public key in ruby?

ruby-on-rails ssl certificate rsa apple-push-notifications

When building a public key using the OpenSSL::PKey::RSA module by passing it a .pem file, what is the cause for a response:

How does C# generics affect collections with primitives

c# .net generics reification

As I understand it, C#/.Net generics support some degree of reification. So, if I have the following code: