Improve GeSHi syntax highlighting for T-SQL

I'm using WP-GeSHi in WordPress, and largely I'm very happy with it. There are, however, a few minor scenarios where the color highlighting is too aggressive when a keyword is:

  1. a variable name (denoted by a leading @)
  2. part of another word (e.g. IN in INSERTED)
  3. the combination (part of a variable name, e.g. JOIN and IN in @JOINBING)
  4. inside square brackets (e.g. [status])

Certain keywords are case sensitive, and others are not. The screenshot below sums up the various cases where this goes wrong:

Now, the code in GeSHi.php is pretty verbose, and I am by no means a PHP expert. I'm not afraid to get my hands a little dirty here, but I'm hoping someone else has made corrections to this code and can provide some pointers. I already implemented a workaround to prevent @@ROWCOUNT from being highlighted incorrectly, but this was easy, since @@ROWCOUNT is defined - I just shuffled the arrays around so that it was found before ROWCOUNT.

What I'd like is for GeSHi to completely ignore keywords that aren't whole words (whether they are prefixed by @ or immediately surrounded by other letters/numbers). JOIN should be grey, but @JOIN and JOINS should not. I'd also like it to ignore keywords that are inside square brackets (after all, this is how we tell Management Studio to not color highlight it, and it's also how we tell the SQL engine to ignore reserved words, keywords, and invalid identifiers).

Answers


You can do this by adding a PARSER_CONTROL control to the end of the array:

'PARSER_CONTROL' => array(
    'KEYWORDS' => array(
        1 => array( // "1" maps to the main keywords near the start of the array
            'DISALLOWED_BEFORE' => '(?<![\(\w])', // lookbehind: what may not precede the keyword
            'DISALLOWED_AFTER' => '(?![\(\w])'    // lookahead: what may not follow the keyword
        ),
        5 => array( // "5" maps to the shorter keywords like "IN" that are further down
            'DISALLOWED_BEFORE' => '(?<![\(\w])',
            'DISALLOWED_AFTER' => '(?![\(\w])'
        ),
    )
)
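To see why guards of this kind stop partial-word matches, here is a minimal standalone sketch. It does not call GeSHi at all: `keywordPattern` is a hypothetical helper, and the two character classes are assumptions chosen to cover the cases in the question (`@` and `[` before the keyword, `]` after it), not GeSHi's defaults.

```php
<?php
// Hypothetical helper: wraps a keyword in "disallowed before/after" guards,
// similar in spirit to how a highlighter can restrict keyword matches.
function keywordPattern(string $keyword, string $before, string $after): string
{
    return '/' . $before . preg_quote($keyword, '/') . $after . '/i';
}

// Assumed character classes (illustrative, not taken from GeSHi):
// no word character, "@" or "[" directly before the keyword;
// no word character or "]" directly after it.
$before = '(?<![\w@\[])';
$after  = '(?![\w\]])';

$pattern = keywordPattern('JOIN', $before, $after);

$samples = ['LEFT JOIN t', '@JOIN', 'JOINS', '[JOIN]', '@JOINBING'];
foreach ($samples as $sql) {
    // Report whether the keyword would be picked up in each sample.
    printf("%-12s %s\n", $sql, preg_match($pattern, $sql) ? 'highlight' : 'ignore');
}
```

Only the first sample should be reported as a match; the others are rejected by the lookbehind (`@`, `[`) or the lookahead (a trailing letter or `]`). Whether the same classes behave identically inside GeSHi depends on how it assembles its final regex, so treat this as a way to test a pattern before putting it in the language file.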

Edit

I've modified your gist to move some of the keywords you added to SYMBOLS back to KEYWORDS (in their own group, with your custom style). I also updated the PARSER_CONTROL array to match the new keyword array indexes and to include the default regex that GeSHi generates. Here is the link:

https://gist.github.com/jamend/07e60bf0b9acdfdeee7a


In my opinion, what you are doing would take a lot of time, so I suggest installing a different plugin:

It has better features and supports more languages, and handles them better, so it should avoid all of these problems.

EDIT:

I tried the same code with the latest version and got the following result:

EDIT:

If you don't want to use another plugin, here is how to change the code:

First, open \wp-content\plugins\wp-geshi-highlight\geshi\geshi\tsql.php in your text editor.

Then locate the 'KEYWORDS' array (or search for it).

Add a new group 6 at the end of it (after group 5) and put your custom keywords in it. For example:

5 => array(
    'ALL', 'AND', 'ANY', 'BETWEEN', 'CROSS', 'EXISTS', 'IN', 'JOIN', 'LIKE', 'NOT', 'NULL',
    'OR', 'OUTER', 'SOME',
),

6 => array(                          // This line has been added by me
    'status'                         // This line has been added by me
)                                    // This line has been added by me

Note: I have shown only array element 5 (already present) and array element 6 (which I added).

Then, to make the new group case-sensitive, add the following entry to the end of the 'CASE_SENSITIVE' array:

6 => true

The 'CASE_SENSITIVE' array should look like this:

'CASE_SENSITIVE' => array(
    GESHI_COMMENTS => false,
    1 => false,
    2 => false,
    3 => false,
    4 => false,
    5 => false,
    6 => true                         // This line has been added by me
),

Now add styling for the custom keywords by adding the marked line to the 'KEYWORDS' element of the 'STYLES' array. The start of the 'STYLES' array should look like this:

'STYLES' => array(
    'KEYWORDS' => array(
        1 => 'color: #0000FF;',
        2 => 'color: #FF00FF;',
        3 => 'color: #AF0000;',
        4 => 'color: #AF0000;',
        5 => 'color: #808080;',
        6 => 'color: #0000FF;'          // This line has been added by me
    ),
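Because a new keyword group touches three separate arrays, it is easy to miss one. The sketch below is a hypothetical standalone check, not part of GeSHi: `missingGroupSettings` and the trimmed-down sample data are assumptions for illustration, with only the array key names taken from the steps above.

```php
<?php
// Hypothetical consistency check for a GeSHi-style language array:
// lists keyword groups that lack a CASE_SENSITIVE or STYLES entry.
function missingGroupSettings(array $languageData): array
{
    $problems = [];
    foreach (array_keys($languageData['KEYWORDS']) as $group) {
        if (!array_key_exists($group, $languageData['CASE_SENSITIVE'])) {
            $problems[] = "group $group: no CASE_SENSITIVE entry";
        }
        if (!array_key_exists($group, $languageData['STYLES']['KEYWORDS'])) {
            $problems[] = "group $group: no STYLES entry";
        }
    }
    return $problems;
}

// Trimmed-down sample data, not the full contents of tsql.php.
$languageData = [
    'KEYWORDS' => [
        5 => ['ALL', 'AND', 'IN', 'JOIN'],
        6 => ['status'],
    ],
    'CASE_SENSITIVE' => [5 => false, 6 => true],
    // The style for group 6 is left out on purpose to trigger a report.
    'STYLES' => ['KEYWORDS' => [5 => 'color: #808080;']],
];

print_r(missingGroupSettings($languageData));
```

With this sample data the function reports that group 6 has no STYLES entry; once the `6 => 'color: #0000FF;'` line is added, it returns an empty array.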

The steps above should solve most of your problems. For the case where the plugin highlights partial words, the only solution I have found is to update the plugin to the latest version, which fixes that problem.

