What accounts for the added time in selecting *?

I've noticed on a few queries on tables with a lot of columns that when I do:

SELECT column1, column2 FROM ...

it can be almost 2x as fast as when I do:

SELECT * FROM ... # ~ 15 columns

What accounts for the additional time it takes to return the * query? I initially thought that if both queries are retrieving the same rows, the difference in performance would be trivial. What accounts for this difference?

Answers


Although the number of rows is the same, the additional columns need to be transferred to the client, consuming network bandwidth. The additional time is needed for the same reason why downloading a file seven times as large takes longer. There may also be a need to read blocks from disk that might not otherwise be required, and in some cases even an inability to complete the query by reading only from an index. That's why you are encouraged to always specify your select list explicitly.


The first thing to do is look at "explain plan" to see if one query results in a different query plan than the other:

http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html


Given...

  1. SELECT column1, column2 FROM yourtable
  2. SELECT * FROM yourtable

MySQL performance can be altered by more than one factor, for example

  • sending the data back to the client,
  • freeing the items,
  • locking the table(s),
  • checking the query cache,
  • storing results into the query cache,
  • optimizing,
  • opening and closing the tables

... and a few more. You have to take all of those factors into consideration when trying to determine which way is faster.

In most cases, I go to query only the columns I want, therefore reducing unnecessary transfer load between application and database server. But... I have to admit, if it is only a few columns or if I want to test something, I sometimes take the SELECT * - shortcut, too.


Need Your Help

Gradient Descent ANN - What is MATLAB doing that I'm not?

python matlab machine-learning neural-network gradient-descent

I'm trying to recreate a simple MLP artificial neural network in Python using gradient descent backpropagation. My goal is to try and recreate the accuracies that MATLAB's ANN is producing, but I'm...

insert into a mysql database timestamp

python mysql python-2.7 insert

I have a part in my python script that I need to insert some data into a table on a mysql database example below: