Regular Expression to get comments in VB.Net source code

I have a syntax highlighting function in VB.NET. I use regular expressions to match "!IF", for instance, and then color it blue. This worked perfectly until I tried to figure out how to handle comments.

In the language I'm writing this for, a comment is either a whole line that starts with a single quote ' or everything after two consecutive single quotes '' anywhere in the line:

'this line is a comment
!if StackOverflow = "AWESOME" ''this is also a comment

Now, I know how to check whether a line starts with a single quote (^'), but I need the match to extend all the way to the end of the line so I can color the entire comment green and not just the quotes.

You shouldn't need the code but here is a snippet just in case it helps.

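    ' Apply each pattern in turn; PassNumber selects the color for that pattern.
    For Each pass In frmColors.lbRegExps.Items
        ' IgnoreCase instead of LCase(): lowercasing the pattern would also change escapes such as \S or \W.
        RegExp = System.Text.RegularExpressions.Regex.Matches(rtbMain.Text, CStr(pass), System.Text.RegularExpressions.RegexOptions.IgnoreCase)
        For Each RegExpMatch In RegExp
            ' Select the matched span and color it.
            rtbMain.Select(RegExpMatch.Index, RegExpMatch.Length)
            rtbMain.SelectionColor = ColorTranslator.FromHtml(frmColors.lbHexColors.Items(PassNumber))
        Next
        PassNumber += 1
    Next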

Answers


Something along the lines of:

^(\'[^\r\n]+)$|(''[^\r\n]+)$

should give you the commented line (or the commented part of the line) in group 1 or group 2, depending on which alternative matched.

Actually, you do not even need the groups:

^\'[^\r\n]+$|''[^\r\n]+$

If it finds something, it is a comment.

"(^'|'').*$"

mentioned by Boaz would work if applied only line by line (which may be your case). For multi-line detection, you must be sure to avoid the 'Dotall' mode (RegexOptions.Singleline in .NET), where '.' also matches line-break characters. Otherwise that pattern would match both your lines entirely, as a single match.

That is why I generally prefer [^\r\n] to '.': it avoids any dependency on the mode of the pattern. Even in 'Dotall' mode, it still works and avoids trying any match on the next line.
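
To make the difference concrete, here is a minimal console sketch. The module name and the sample string (the two lines from the question joined with a line feed) are only for illustration, and both options are switched on deliberately to provoke the problem:

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module DotallDemo
    Sub Main()
        ' The two sample lines from the question, joined with a line feed (illustrative input).
        Dim source As String = "'this line is a comment" & vbLf &
                               "!if StackOverflow = ""AWESOME"" ''this is also a comment"

        ' Multiline lets ^ and $ match at line boundaries;
        ' Singleline is the .NET name for 'Dotall'.
        Dim options As RegexOptions = RegexOptions.Multiline Or RegexOptions.Singleline

        ' With '.', the greedy .* runs across the line break:
        ' a single match that spans both lines.
        For Each m As Match In Regex.Matches(source, "(^'|'').*$", options)
            Console.WriteLine("dot:   [" & m.Value.Replace(vbLf, "\n") & "]")
        Next

        ' With [^\r\n], each comment is matched on its own, whatever the mode.
        For Each m As Match In Regex.Matches(source, "^'[^\r\n]+$|''[^\r\n]+$", options)
            Console.WriteLine("class: [" & m.Value & "]")
        Next
    End Sub
End Module
```

One caveat worth checking against your own input: in .NET, $ in multiline mode matches only before \n, so if the text uses \r\n line endings, the [^\r\n]+$ form may need \r? in front of the $.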


While the above would work you can simplify it:

"(^'|'').*$"

As VonC mentions, this only works if you feed the regex one line at a time. To match against a multi-line string, turn on multiline mode and use:

"(^'|'').*?$"

The ? makes the * quantifier lazy (non-greedy), so with multiline mode on, the match stops at the first end of line instead of running on to the end of the text.
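
Since the patterns in the question come from a list and are passed to Regex.Matches without options, one way to try this is to switch multiline mode on inside the pattern with the inline (?m) flag. A minimal sketch, with an illustrative module name and the question's two lines as sample input:

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module MultilineDemo
    Sub Main()
        ' The two sample lines from the question, joined with a line feed (illustrative input).
        Dim source As String = "'this line is a comment" & vbLf &
                               "!if StackOverflow = ""AWESOME"" ''this is also a comment"

        ' (?m) is the inline equivalent of RegexOptions.Multiline,
        ' so ^ and $ match at the start and end of every line.
        Dim pattern As String = "(?m)(^'|'').*?$"

        ' Each match starts at the quote(s) and ends at the end of that line;
        ' Index and Length are what rtbMain.Select would need.
        For Each m As Match In Regex.Matches(source, pattern)
            Console.WriteLine(m.Index & ", " & m.Length & ": " & m.Value)
        Next
    End Sub
End Module
```

Without (?m) or RegexOptions.Multiline, ^ matches only at the very start of the text, so a comment line further down would not be picked up by the ^' alternative.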


For VB.NET source code itself, which also has REM comments, you can use the regex pattern: REM((\t| ).*$|$)|^\'[^\r\n]+$|''[^\r\n]+$

See also: https://code.msdn.microsoft.com/How-to-find-code-comments-9d1f7a29/

