Regex to delete all words containing numbers from a sentence

I have tried my best to remove every word that contains a number from a sentence, but without success. I even tried the following regex:

$regex = '/(\\s+\\w{1,2}(?=\\W+))|(\\s+[a-zA-Z0-9_-]+\\d+)/';
$x=preg_replace($regex,"",$x);

I'm trying to accomplish the following:

Original text, with words that contain numbers and special characters such as - and _:

This is S3F8G m7j34m h98H40D-3D39 90843-432423 LSDF3 4X4it very good 343c3.

Final text needs to be as follows:

This is very good.

Answers


Well, I wrote this in JavaScript:

var str = 'This is S3F8G m7j34m h98H40D-3D39 90843-432423 LSDF3 4X4it very good 343c3.';
var result = str.match(/(^[\D]+\s|\s[\D]+\s|\s[\D]+$|^[\D]+$)+/g).join('');

But you can try this in PHP:

<?php
$str = 'This is S3F8G m7j34m h98H40D-3D39 90843-432423 LSDF3 4X4it very good 343c3.';
preg_match_all('/(^[\D]+\s|\s[\D]+\s|\s[\D]+$|^[\D]+$)+/', $str, $result);
$result = implode('',$result[0]);
echo $result;
?>

<?php

$x="This is S3F8G m7j34m h98H40D-3D39 90843-432423 LSDF3 4X4it very good 343c3.";
$re="/ ?\b[^ ]*[0-9][^ ]*\b/i";
print preg_replace($re, "", $x) . "\n";

returns:

This is very good.

One proviso: since this regex strips a leading space rather than a trailing one, if the first word contains a digit, the result will begin with a space. Thus:

<?php

$x="9abc foo bar.";
$re="/ ?\b[^ ]*[0-9][^ ]*\b/i";
print preg_replace($re, "", $x) . "\n";

returns:

 foo bar.
^
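
One way around that proviso is to trim the leftover space in a second step. The following is a minimal sketch in JavaScript, not part of the answer above: the function name stripWordsWithDigits is made up for illustration, and the same two-step idea should carry over to PHP with preg_replace followed by ltrim.

```javascript
// Hypothetical helper; the name is an assumption, not from the thread.
function stripWordsWithDigits(text) {
  return text
    // Same pattern as the answer: an optional leading space, then a
    // space-delimited token that contains at least one digit.
    .replace(/ ?\b[^ ]*[0-9][^ ]*\b/g, '')
    // Drop the space left behind when the first word was removed.
    .replace(/^ +/, '');
}

console.log(stripWordsWithDigits('9abc foo bar.')); // "foo bar."
```

Trimming afterwards keeps the original pattern untouched, at the cost of a second pass over the string.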

And here's a (hopefully) quirk-free way:

$str = '-12x This is S3F8G m7j34m h98H40D-3D39 90843-432423 LSDF3 4X4it very good 343c3. foo bar';
echo preg_replace('/\s+[\w-]*\d[\w-]*|[\w-]*\d[\w-]*\s*/', '', $str);

Output:

This is very good. foo bar

If you want any other special characters to count as part of a word, add them to the character classes.
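
To illustrate what adding characters to the classes changes, here is a rough JavaScript sketch. The helper buildPattern and its extraChars parameter are invented for this example; the pattern itself is the one from the answer above, rebuilt with a wider character class.

```javascript
// Hypothetical helper: `extraChars` lists additional characters that
// should count as part of a word (an assumption for this sketch).
function buildPattern(extraChars = '') {
  // Escape the characters that are special inside a character class.
  const escaped = extraChars.replace(/[\\\]^-]/g, '\\$&');
  const cls = `[\\w${escaped}-]`;
  // Same two alternatives as the answer: a word with a digit preceded by
  // whitespace, or a word with a digit followed by optional whitespace.
  return new RegExp(`\\s+${cls}*\\d${cls}*|${cls}*\\d${cls}*\\s*`, 'g');
}

const sample = 'call 555.1234 now';

// With the default class, "." is not part of a word, so it is left behind.
console.log(sample.replace(buildPattern(), ''));    // "call.now"

// With "." added to the class, "555.1234" is removed as a single word.
console.log(sample.replace(buildPattern('.'), '')); // "call now"
```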

