How can I get Sphinx to index in real time?

I have a Rails 3.2.6 app where I'm using Sphinx 0.9.9 and Thinking Sphinx 2.0.12.

I need Sphinx to update its index in real time. For example, when a user creates a new post, it should show up in search results immediately, and when they delete a post, it should disappear from results the instant it is deleted.

I followed the docs about delta indexing.

Based on the following advice from those docs, I have a cron job that runs bundle exec rake ts:index RAILS_ENV=production every twenty minutes:

"Turning on delta indexing does not remove the need for regularly running a full re-index, as otherwise the delta index itself will grow to become just as large as the core indexes, and this removes the advantage of keeping it separate. It also slows down your requests to your server that make changes to the model records."

However, new entries only appear in search results after that cron job has run, not immediately after they are saved.
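For reference, the main+delta behaviour that the quoted docs describe can be sketched as a toy model. This is only an illustration in plain Ruby, not Thinking Sphinx's actual implementation: the class ToyDeltaIndex and all of its methods are hypothetical names, and real indexes live on disk rather than in hashes. It shows what is expected to happen when deltas work: a saved record becomes searchable through the delta index straight away, and the delta index keeps growing until a full re-index resets it.

```ruby
# Toy model of the main+delta scheme (hypothetical names, not Thinking Sphinx code).
class ToyDeltaIndex
  attr_reader :core, :delta

  def initialize
    @rows  = {} # id => { text:, delta: }, stands in for the entries table
    @core  = {} # id => text, rebuilt only by full_reindex
    @delta = {} # id => text, rebuilt on every save
  end

  # Saving flags the row as changed and rebuilds only the small delta index.
  def save(id, text)
    @rows[id] = { text: text, delta: true }
    rebuild_delta
  end

  # Plays the role of a full re-index: reset the flags, rebuild core, empty delta.
  def full_reindex
    @rows.each_value { |row| row[:delta] = false }
    @core = @rows.transform_values { |row| row[:text] }
    rebuild_delta
  end

  # A search consults both indexes; delta copies override stale core copies.
  def search(term)
    @core.merge(@delta).select { |_id, text| text.include?(term) }.keys.sort
  end

  private

  def rebuild_delta
    @delta = @rows.select { |_id, row| row[:delta] }
                  .transform_values { |row| row[:text] }
  end
end

index = ToyDeltaIndex.new
index.save(1, "first post")
index.full_reindex
index.save(2, "second post")
index.search("post") # => [1, 2]
```

In this model the second post is found without waiting for full_reindex. If a real setup only shows new records after the scheduled full index, that suggests the per-save delta step is not completing, rather than the cron interval being too long.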

Here's my define_index...

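define_index do
  # Full-text fields
  indexes title
  indexes entry

  # Attributes for filtering and sorting
  has user_id
  has created_at
  has updated_at

  # Enable the default delta index (uses the boolean delta column on entries)
  set_property :delta => true
end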

Here's my production.sphinx.conf...

indexer
{
}

searchd
{
  listen = 127.0.0.1:9312
  log = /opt/deployed_rails_apps/my_app/releases/20120713022228/log/searchd.log
  query_log = /opt/deployed_rails_apps/my_app/releases/20120713022228/log/searchd.query.log
  pid_file = /opt/deployed_rails_apps/my_app/releases/20120713022228/log/searchd.production.pid
}

source entry_core_0
{
  type = mysql
  sql_host = localhost
  sql_user = abc
  sql_pass = abc
  sql_db = my_app_production
  sql_query_pre = UPDATE `entries` SET `delta` = 0 WHERE `delta` = 1
  sql_query_pre = SET NAMES utf8
  sql_query_pre = SET TIME_ZONE = '+0:00'
  sql_query = SELECT SQL_NO_CACHE `entries`.`id` * CAST(1 AS SIGNED) + 0 AS `id` , `entries`.`title` AS `title`, `entries`.`entry` AS `entry`, `entries`.`id` AS `sphinx_internal_id`, 0 AS `sphinx_deleted`, 3940594292 AS `class_crc`, `entries`.`user_id` AS `user_id`, UNIX_TIMESTAMP(`entries`.`created_at`) AS `created_at`, UNIX_TIMESTAMP(`entries`.`updated_at`) AS `updated_at` FROM `entries`  WHERE (`entries`.`id` >= $start AND `entries`.`id` <= $end AND `entries`.`delta` = 0) GROUP BY `entries`.`id` ORDER BY NULL
  sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM `entries` WHERE `entries`.`delta` = 0
  sql_attr_uint = sphinx_internal_id
  sql_attr_uint = sphinx_deleted
  sql_attr_uint = class_crc
  sql_attr_uint = user_id
  sql_attr_timestamp = created_at
  sql_attr_timestamp = updated_at
  sql_query_info = SELECT * FROM `entries` WHERE `id` = (($id - 0) / 1)
}

index entry_core
{
  source = entry_core_0
  path = /opt/deployed_rails_apps/my_app/releases/20120713022228/db/sphinx/production/entry_core
  charset_type = utf-8
}

source entry_delta_0 : entry_core_0
{
  type = mysql
  sql_user = abc
  sql_pass = abc
  sql_db = my_app_production
  sql_query_pre = 
  sql_query_pre = SET NAMES utf8
  sql_query_pre = SET TIME_ZONE = '+0:00'
  sql_query = SELECT SQL_NO_CACHE `entries`.`id` * CAST(1 AS SIGNED) + 0 AS `id` , `entries`.`title` AS `title`, `entries`.`entry` AS `entry`, `entries`.`id` AS `sphinx_internal_id`, 0 AS `sphinx_deleted`, 3940594292 AS `class_crc`, `entries`.`user_id` AS `user_id`, UNIX_TIMESTAMP(`entries`.`created_at`) AS `created_at`, UNIX_TIMESTAMP(`entries`.`updated_at`) AS `updated_at` FROM `entries`  WHERE (`entries`.`id` >= $start AND `entries`.`id` <= $end AND `entries`.`delta` = 1) GROUP BY `entries`.`id` ORDER BY NULL
  sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM `entries` WHERE `entries`.`delta` = 1
  sql_attr_uint = sphinx_internal_id
  sql_attr_uint = sphinx_deleted
  sql_attr_uint = class_crc
  sql_attr_uint = user_id
  sql_attr_timestamp = created_at
  sql_attr_timestamp = updated_at
  sql_query_info = SELECT * FROM `entries` WHERE `id` = (($id - 0) / 1)
}

index entry_delta : entry_core
{
  source = entry_delta_0
  path = /opt/deployed_rails_apps/my_app/releases/20120713022228/db/sphinx/production/entry_delta
}

index entry
{
  type = distributed
  local = entry_delta
  local = entry_core
}

Any ideas what I might be doing wrong?

Answers


I know this is old, but you should consider upgrading Sphinx and switching to real-time (RT) indexes instead of the main+delta scheme. RT indexes are not available in 0.9.9; they were introduced in Sphinx 1.10-beta.

See the RT indexes section of the Sphinx documentation.
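As a rough sketch of what the RT model looks like at the Sphinx level, a real-time index is declared with type = rt and is filled through SphinxQL statements rather than by the indexer. The fragment below assumes Sphinx 1.10-beta or later. The index name entry_rt, the path, and the SphinxQL port 9306 are illustrative assumptions and are not taken from the configuration in the question.

```
# Sketch only: index name, path and port are assumptions.
index entry_rt
{
  type = rt
  path = /path/to/db/sphinx/production/entry_rt
  rt_field = title
  rt_field = entry
  rt_attr_uint = user_id
  rt_attr_timestamp = created_at
  rt_attr_timestamp = updated_at
}

searchd
{
  # Existing settings stay as they are; this adds a SphinxQL (MySQL protocol)
  # listener, which is how rows are inserted into and deleted from an RT index.
  listen = 127.0.0.1:9306:mysql41
}
```

With something like this in place, the application would write a row with a SphinxQL REPLACE INTO entry_rt statement when a post is saved and remove it with DELETE FROM entry_rt WHERE id = ... when it is deleted, so no scheduled indexer run is needed for changes to become searchable.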

