What indexes would I create for a query like this?
Here's a query I've got:
SELECT "balance_transactions"."fee" FROM "balance_transactions" JOIN charges ON balance_transactions.source = charges.balance_id WHERE "balance_transactions"."account_id" = 123 AND (balance_transactions.type = 'charge' AND charges.refunded = false AND charges.invoice IS NOT NULL) AND ("balance_transactions"."created" BETWEEN '2013-12-29' AND '2014-01-28');
What's hurting my brain is figuring out what I need to index to optimize a query like this.
Would I create individual indexes on each column in that WHERE statement? Or a single index that encompasses multiple columns? Or something else?
I happen to be using Postgres.
Here's the output of EXPLAIN ANALYZE:
Nested Loop (cost=40.44..9294.43 rows=744 width=4) (actual time=0.791..32.687 rows=1131 loops=1) -> Bitmap Heap Scan on balance_transactions (cost=40.02..2266.08 rows=1023 width=22) (actual time=0.737..1.006 rows=1201 loops=1) Recheck Cond: ((account_id = 79) AND ((type)::text = 'charge'::text) AND (created >= '2013-12-29 00:00:00'::timestamp without time zone) AND (created <= '2014-01-28 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on index_balance_transactions_account_type_created (cost=0.00..39.77 rows=1023 width=0) (actual time=0.716..0.716 rows=1201 loops=1) Index Cond: ((account_id = 79) AND ((type)::text = 'charge'::text) AND (created >= '2013-12-29 00:00:00'::timestamp without time zone) AND (created <= '2014-01-28 00:00:00'::timestamp without time zone)) -> Index Scan using index_stripe_charges_on_stripe_id on charges (cost=0.42..6.86 rows=1 width=18) (actual time=0.025..0.025 rows=1 loops=1201) Index Cond: ((balance_id)::text = (balance_transactions.source)::text) Filter: ((NOT refunded) AND (invoice IS NOT NULL)) Rows Removed by Filter: 0 Total runtime: 32.924 ms (10 rows)
I would suggest the following two indexes:
charges(balance_id, refunded, invoice) balance_transactions(account_id, type, created, source, fee)
The first is a covering index for charges. The index should be used for the query with no need for the data tables.
The second is a covering index for balance_transactions. The first three will satisfy the where clause. The four column will be used for the join and the final is needed for the select.