Where do .raw fields come from when using Logstash with Elasticsearch output?
When using Logstash and Elasticsearch together, fields with .raw are appended for analyzed fields, so that when querying Elasticsearch with tools like Kibana, it's possible to use the field's value as-is without per-word splitting and what not.
I built a new installation of the ELK stack with the latest greatest versions of everything, and noticed my .raw fields are no longer being created as they were on older versions of the stack. There are a lot of folks posting solutions of creating templates on Elasticsearch, but I haven't been able to find much information as to why this fixes things. In an effort to better understand the broader problem, I ask this specific question:
Where do the .raw fields come from?
I had assumed that Logstash was populating Elasticsearch with strings as-analyzed and strings as-raw when it inserted documents, but considering the fact that the fix lies in Elasticsearch templates, I question whether or not my assumption is correct.
You're correct in your assumption that the .raw fields are the result of a dynamic template for string fields contained in the default index template that Logstash creates IF manage_template: true (which it is by default).
The default template that Logstash creates (as of 2.1) can be seen here. As you can see on line 26, all string fields (except the message one) have a not_analyzed .raw sub-field created.
However, the template hasn't changed in the latest Logstash versions as can be seen in the template.json change history, so either something else must be wrong with your install or you've changed your Logstash config to use your own index template (without .raw fields) instead.
If you run curl -XGET localhost:9200/_template/logstash* you should see the template that Logstash has created.