Wordpress search failed on special characters due to improper decode

I am implementing the Wordpress search functionality. When I search for text "Division's" (which is a text in one of the posts), It returns "No results found"

Now to investigate further, I checked the core file: wp-includes/query.php => function parse_search()

And found that the $term is received encoded as : Division\xe2\x80\x99s

Now this term is not decoded properly. And the final SQL statement formed is : (((test_posts.post_title LIKE '%Division\xe2\x80\x99s%') OR (test_posts.post_content LIKE '%Division\xe2\x80\x99s%')))

So, I want to decode the special characters to succesfully search terms with special characters too.

Decoding methods like :

  • $string = urldecode($string);
  • $string = html_entity_decode($string);
  • $string = rawurldecode ($string);
  • $string = base64_decode($string);
  • $string = utf8_decode($string);

Did not work. Is there any plugin/hook/method that can help?

Example Provided:

Simple searchform.php file here:

if (!defined('ABSPATH')) exit(0); 

global $wp_query;

$search_query = get_search_query();
$error = get_query_var('error'); ?>

<form role="search" method="get" class="search-form form-inline" action="<?php echo esc_url(home_url('/')); ?>">
    <input id="mod-search-searchword" type="search" size="30" class="inputbox search-query search-field" placeholder="search products, content" value="<?php echo !empty($search_query) && empty($error) ? $search_query : ''; ?>" name="s" title="Search for:" />
    <input type="submit" class="button btn btn-primary" value="Search" />
</form>

Now if I type in characters like () they get urlencoded, and that same urlencoded string not populates into the text field with the percentages, etc.

If I do this:

$search_query = !empty($search_query) ? trim(sanitize_text_field(urldecode($search_query))) : '';

There is still a problem, but no longer a problem with the text input not having correct string, the problem becomes that there are no search results now.

How to fix this issue with Wordpress Search?

wp-config.php contains the following:

define('DB_CHARSET', 'utf8');
define('DB_COLLATE', '');

header.php contains the following:

<!DOCTYPE html>
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=3.0, user-scalable=yes"/>
        <meta name="HandheldFriendly" content="true" />
        <meta name="apple-mobile-web-app-capable" content="YES" />
        <link rel="shortcut icon" href="<?php echo get_stylesheet_directory_uri(); ?>/favicon.ico" type="image/vnd.microsoft.icon" />
        <title><?php wp_title(' - ', true, 'right'); ?></title>
        <?php wp_head(); ?>
    </head>

I have the following in my functions.php file:

function livchem_searchfilter($query) {

    global $search_query;

    if ($query->is_search && !is_admin()) {

        // check if length of query > 3 but < 200
        $search_query = trim(get_search_query());
        $search_length = strlen($search_query);

        if ($search_length < 3 || $search_length > 200)
        {
            $query->set('error', 'Search term must be a minimum of 3 characters and a maximum of 200 characters.');
            return $query;
        }
        else
        {
            $query->set('post_type', array('post', 'page', 'product'));
            $query->set('posts_per_page', 20);
        }
    }

    return $query;
}

add_filter('pre_get_posts','livchem_searchfilter');

So, I do have UTF-8 encoding as my charset afaik. What is the problem, why is my search for: copper(i)/(ii) returning ?s=copper%2528i%2529%252F%2528ii%2529 in the URL? And I should have 2 results found for this, but I get 0 results found. Why?

And if I change the url to this: ?s=copper(i)/(ii) I see my 2 results. But why can't I get my results, and/or the url to be like this? I could honestly care less on what the url structure is, but I do want my 2 results to be found when I type in: copper(i)/(ii) into the search form, but currently it is not finding any results.

Answers


Ok, so you do have to decode the search query and here's how I have it working and works like a charm now! This now returns the search results, but keeps the url encoded so no problems anywhere here.

function livchem_search_filter($s) {
    return urldecode($s);
}

add_filter('get_search_query', 'livchem_search_filter');
add_filter('the_search_query', 'livchem_search_filter');

function livchem_query_vars_search_filter($query)
{
    if ($query->is_search && !is_admin()) {
        $query->query_vars['s'] = urldecode($query->query_vars['s']);
    }

    return $query;
}
add_action('parse_query', 'livchem_query_vars_search_filter');

As a plus this also works well for path related searches now, so if I added the following to my .htaccess:

RewriteCond %{QUERY_STRING} s=(.*)
RewriteRule ^$ /search/%1? [R,L]

Searches would be structured like so: /search/searchterm

And the query with special characters also now works. what a pain in the neck this was to get working properly, for something that is part of the CMS.


Need Your Help

C#: New line and tab characters in strings

c#

StringBuilder sb = new StringBuilder();

Jboss 7 war deployment failed

deployment jboss war

Jboss 7 war deployment failed and i got the below error message in the log.