How to decide when to use an index on a table column
When should I use an index on a table?
- From how many rows onward does an index make sense?
- If I have a table with a constant number of rows, where only some columns are edited (none of which appear in the 'where' clause), does an index make sense even if the table has only about 15 rows? EDIT: In such a case, can a non-indexed select/read be more effective than an indexed read?
EDIT: Right now I am working with Firebird 2.5, but most of the time I use SQL Server 2005/2008.
In general, my indexing strategy would be something like this (I'm using SQL Server exclusively for now - adapt to your own database system as needed):
pick a good clustering key - not a GUID, not a VARCHAR(250) or something - a good clustering key is narrow, unique, stable, and ever-increasing - something like an INT IDENTITY is perfect. Make this your clustered primary key -> that gives you your first index on the table
for any column that is used as a foreign key into another table - add an index. It can be either a single-column index or a compound index - whatever works best for your case. It's important that the foreign key column be the first column in that index (if you're using a compound index) - otherwise, the benefits for JOINs or for checking referential integrity won't be available to your system
And that's it for now.
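In T-SQL, that starting point could look something like this (a sketch with hypothetical `Orders`/`Customers` tables - adapt names and types to your schema):

```sql
-- Hypothetical Orders table illustrating the two starting indexes:
-- a narrow, unique, ever-increasing clustered primary key, plus a
-- nonclustered index on the foreign key column.
CREATE TABLE dbo.Orders
(
    OrderID    INT IDENTITY(1,1) NOT NULL,
    CustomerID INT               NOT NULL,
    OrderDate  DATETIME          NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (OrderID),
    CONSTRAINT FK_Orders_Customers FOREIGN KEY (CustomerID)
        REFERENCES dbo.Customers (CustomerID)
);

-- The foreign key column leads the index, so JOINs to Customers
-- and referential-integrity checks can make use of it.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID);
```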
Then: run your system - observe, and measure - establish a baseline. Is the app fast enough? If yes -> you're done - go home and enjoy your spare time.
If not: start collecting data and indications as to why the app isn't fast enough. Look at things like the DMVs in SQL Server that tell you about the worst-performing queries, or the missing-index DMVs. Analyze those. See what you could improve. Add one index at a time, and again: observe, measure, compare to your baseline.
If you have improvement -> leave that index in place and this measurement is your new baseline. Rinse and repeat until you (and your users) are happy with the app's performance (and then go home and enjoy your time off).
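The worst-performing-queries check mentioned above can be done with a DMV query along these lines (a sketch over `sys.dm_exec_query_stats`; it only sees statements still in the plan cache):

```sql
-- Top 10 cached statements by total CPU time.
SELECT TOP (10)
    qs.total_worker_time / qs.execution_count AS avg_cpu_time,
    qs.execution_count,
    qs.total_logical_reads,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```

Statements near the top of this list are the candidates worth analyzing for a new index.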
Over-indexing in SQL Server can be worse than having no indexes at all. Don't start out with too many indexes! Establish only the good clustered PK and the foreign-key nonclustered indexes - that's all - then observe, measure, optimize & repeat that cycle.
That’s a very complex discussion; there are several things you have to keep in mind. Mainly, you shouldn’t decide on an index based on the number of rows in your table, but based on the queries you run against it. An index will only help SELECT queries, and at the same time it will slightly decrease the performance of inserts, deletes and updates, because besides altering the row in the table, the engine also has to alter the index.
You seem to be new to the matter, so what I suggest is that you take a look at your execution plans and try to eliminate all “scan” operations, because they read pretty much the whole table or even the whole index. You should always aim for seeks, but balance that against the number of indexes you have on your table.
If you are using SQL Server, you can run a trace with SQL Server Profiler to help you.
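For a quick look without Profiler, SQL Server can also report per-query I/O and timing directly in the session (the table name below is just a placeholder for your own query):

```sql
-- Turn on per-query I/O and timing output, then check the
-- "logical reads" and "Scan count" figures in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT OrderID, OrderDate      -- your query under investigation
FROM dbo.Orders                -- hypothetical table
WHERE CustomerID = 42;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

A large "logical reads" number on a small result set is a typical sign of a scan that a seek could replace.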
Can a non-indexed select/read be more effective in such a case than an indexed read?
Yes, but if this case occurs, the engine will be smart enough not to use the index.
An index is good for picking a fraction of the rows from a table. Querying by a primary key value is the best utilization of an index. The worst scenario is accessing all rows of a table via an index, because the engine has to read the index pages and the referenced data pages. Another example: an in-memory sort of a result set might be faster than sorting the result set via an index on the sorted column. And never forget: while an index may improve query performance, it decreases write performance.
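The "fraction of the rows" point can be illustrated like this (a sketch against a hypothetical `Orders` table; which plan the optimizer actually picks depends on statistics and data distribution):

```sql
-- Hypothetical index on a date column.
CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
    ON dbo.Orders (OrderDate);

-- Likely an index seek: selects a small fraction of the rows.
SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE OrderDate >= '2011-01-01' AND OrderDate < '2011-01-02';

-- Likely a full scan: nearly every row qualifies, so going through
-- the index (index pages + data pages) would cost more than
-- reading the table directly.
SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE OrderDate >= '2000-01-01';
```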
Some have mentioned taking a baseline and using some kind of tracing utility to measure performance. If you are fine with the established performance, move on. If not, analyze the execution plan and the physical data model (the available indexes), recalculate statistics, and see if that helps the optimizer choose a better execution plan. Make sure that the DBMS can (is allowed to) utilize the available RAM. Try to minimize disk I/O, and so on.
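Recalculating statistics in SQL Server is a one-liner per table, or one call for the whole database (table name hypothetical):

```sql
-- Refresh statistics for one table, reading every row for accuracy...
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;

-- ...or refresh out-of-date statistics across the whole database.
EXEC sp_updatestats;
```

Fresh statistics give the optimizer accurate row-count estimates, which is often all it needs to switch to a better plan.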
In the case of Firebird 2.5, the newly added Firebird Trace API is a godsend. You are now finally able to get a near real-time trace of what's executed against the database, with performance counters (execution plan, execution time, I/O stats ...). And a third-party product called FB TraceManager by Upscene Productions makes the Trace API a joy to use.
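A trace session is driven by a configuration fragment roughly like the following - treat this as an assumption and verify the exact parameter names against the trace documentation shipped with your Firebird 2.5 installation:

```
# Hypothetical trace configuration: log finished statements with
# their plans and performance counters for one database.
<database mydb.fdb>
  enabled true
  log_statement_finish true
  print_plan true
  print_perf true
  time_threshold 100
</database>
```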
Regarding the second part of your question: if a table has just 15 rows, it is very likely that no matter how many indexes you have, the table will always be scanned, because it is so small.
I use this query to get a clue about which of my tables need an index:

```sql
-- Missing Indexes for current database by Index Advantage (Query 57) (Missing Indexes)
SELECT DISTINCT
    CONVERT(decimal(18,2), user_seeks * avg_total_user_cost * (avg_user_impact * 0.01)) AS [index_advantage],
    migs.last_user_seek,
    mid.[statement] AS [Database.Schema.Table],
    mid.equality_columns,
    mid.inequality_columns,
    mid.included_columns,
    migs.unique_compiles,
    migs.user_seeks,
    migs.avg_total_user_cost,
    migs.avg_user_impact,
    OBJECT_NAME(mid.[object_id]) AS [Table Name],
    p.rows AS [Table Rows]
FROM sys.dm_db_missing_index_group_stats AS migs WITH (NOLOCK)
INNER JOIN sys.dm_db_missing_index_groups AS mig WITH (NOLOCK)
    ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS mid WITH (NOLOCK)
    ON mig.index_handle = mid.index_handle
INNER JOIN sys.partitions AS p WITH (NOLOCK)
    ON p.[object_id] = mid.[object_id]
WHERE mid.database_id = DB_ID()
ORDER BY index_advantage DESC
OPTION (RECOMPILE);
```
Notice that this will only point you in the right direction; you still need to take into account what has been answered above.