Greetings,

I've been working on a Validation class for .NET using regular expressions. It's fairly complete: phone numbers (US and international), various numeric types, US/Canadian ZIP codes, US states, US SSNs, and so on mostly work. (An error-prone, un-updated version is available on www.planet-source-code.com in the .NET section; I'll update it tonight with all my corrections. It was premature to post last night.)

But I can't figure this out. I want to have a username/password validator with some rules: it must be n characters long and must have n CAPS, n numerics, and n special characters (n can be 0 or more). But I can't get it working (too complex). Even if I search for each separately, I still can't even do something as simple as 2 CAPS in a string, no matter where they occur. Anyone have ideas on that issue?

^(?=.*[A-Z]){2,}

is what I'm using but it's not working. Perl syntax is welcome. I just want to see something that works and I'll deal with it from there.

Thanks,
Shawn
Posted on 2003-03-27 21:58:49 by _Shawn
Originally posted by _Shawn
Must be n characters long, must have n CAPS, n numerics, and n special characters (n can be 0 or more)


So empty passwords are allowed? ;)
Posted on 2003-03-28 01:11:06 by bazik

^(?=.*[A-Z]){2,}

What is "="? I have not seen it in POSIX. Or maybe I'm falling behind. Anyhow, wouldn't it match a string with a repeated substring ending in a capital letter, like "ab1Aab1A", instead of "ab1Acd2Z"?

Well, what I said ignores "=". If "=" is the key to generate the desired result, just ignore my comment. :)


If you are free to ditch regex, what about xlat and then count the number of characters falling into the required character class(es)?
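A minimal sketch of that non-regex idea in C#: walk the string once and count how many characters fall into each class. The class and method names (CriteriaCounter, MeetsCriteria) are made up for this illustration, and treating "special" as any character that is neither a letter, a digit, nor whitespace is an assumption, not something specified in the thread.

```csharp
using System;

static class CriteriaCounter
{
    // Hypothetical helper: counts characters per class in a single pass,
    // with no regular expressions involved.
    public static bool MeetsCriteria(string value, int minCaps, int minDigits, int minSpecial)
    {
        int caps = 0, digits = 0, special = 0;
        foreach (char c in value)
        {
            if (c >= 'A' && c <= 'Z') caps++;
            else if (c >= '0' && c <= '9') digits++;
            // Assumption: "special" means not a letter, digit, or whitespace.
            else if (!char.IsLetterOrDigit(c) && !char.IsWhiteSpace(c)) special++;
        }
        return caps >= minCaps && digits >= minDigits && special >= minSpecial;
    }

    static void Main()
    {
        Console.WriteLine(MeetsCriteria("ab1Acd2Z", 2, 2, 0)); // True
        Console.WriteLine(MeetsCriteria("ab1Acd2z", 2, 2, 0)); // False (only one capital)
    }
}
```

Counting this way also makes it easy to report which rule failed, which a single combined regex cannot do.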
Posted on 2003-03-28 03:02:11 by Starless
trying to match for so many things in one reg ex is a bad idea, break it down:



// caps
"[A-Z]{2,3}" // at least 2, at most 3 caps (a)
"[0-9]{2,3}" // ... (b)
..
...
"[A-Z]{2,3}|[0-9]{2,3}" // (a) or (b)


...

anyway, the point is, search for only one at a time.. regex isn't exactly suited to "a at least n times anywhere and b at least y times anywhere..."

at least in my opinion.
Posted on 2003-03-28 04:37:44 by abc123
I imagine there's a way to do it using conditional subpatterns but like abc123 said, you're better off doing separate matches.

I still can't even do something as simple as 2 CAPS in a string


Match n or more CAPS:
([^A-Z]*[A-Z]){n,}

Match n or more digits:
([^\d]*\d){n,}
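Here is a rough C# sketch of how those two patterns might be built and checked for a given n. The helper name AtLeast and the leading "^" anchor are additions for this illustration (the anchor only keeps the engine from retrying at every start position); the class body is assumed to be a simple range such as "A-Z" or "0-9" that needs no escaping.

```csharp
using System;
using System.Text.RegularExpressions;

static class CountPatterns
{
    // Hypothetical helper: builds "^(?:[^X]*[X]){n,}" where X is the class body,
    // i.e. "skip non-members, then take one member", repeated at least n times.
    public static bool AtLeast(string input, string classBody, int n)
    {
        string pattern = "^(?:[^" + classBody + "]*[" + classBody + "]){" + n + ",}";
        return Regex.IsMatch(input, pattern);
    }

    static void Main()
    {
        Console.WriteLine(AtLeast("ab1Acd2Z", "A-Z", 2)); // True
        Console.WriteLine(AtLeast("ab1Acd2z", "A-Z", 2)); // False
        Console.WriteLine(AtLeast("a1b2c3", "0-9", 3));   // True
    }
}
```

Because each repetition consumes exactly one member of the class, the repetition count is the number of matching characters, wherever they appear in the string.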
Posted on 2003-03-28 10:17:41 by iblis
Thanks for the replies... here's how I ended up solving the problem... I'm still trying to figure out how to get this reduced to one expression but for now, this will do...

The "=" is part of the lookahead syntax in .NET, I think. Here is what the MSDN docs say:
(?= ) Zero-width positive lookahead assertion. Continues match only if the subexpression matches at this position on the right. For example, \w+(?=\d) matches a word followed by a digit, without matching the digit. This construct does not backtrack.
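Because a lookahead consumes no characters, several of them can be stacked at the start of one pattern, each checking a different rule against the same string. The following is only a sketch of that idea, not the code that was eventually used in this thread: the name IsAcceptable and the particular set of special characters are assumptions chosen for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class SingleExpression
{
    // Assumed set of "special" characters; '-' is last so it is literal in a class.
    const string Special = "!@#$%^&+=;:.,_~-";

    // Hypothetical helper: one lookahead per rule, then a length check.
    public static bool IsAcceptable(string value, int minCaps, int minDigits,
                                    int minSpecial, int minLength, int maxLength)
    {
        string pattern =
            "^" +
            "(?=(?:[^A-Z]*[A-Z]){" + minCaps + "})" +
            "(?=(?:[^0-9]*[0-9]){" + minDigits + "})" +
            "(?=(?:[^" + Special + "]*[" + Special + "]){" + minSpecial + "})" +
            ".{" + minLength + "," + maxLength + @"}\z";
        return Regex.IsMatch(value, pattern);
    }

    static void Main()
    {
        Console.WriteLine(IsAcceptable("Secret#Pass9", 2, 1, 1, 8, 20)); // True
        Console.WriteLine(IsAcceptable("secret#pass9", 2, 1, 1, 8, 20)); // False (no capitals)
        Console.WriteLine(IsAcceptable("SP#9", 2, 1, 1, 8, 20));         // False (too short)
    }
}
```

Each lookahead starts again from position 0, so the order of the capitals, digits, and special characters in the input does not matter.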


// Validate that the supplied string meets certain criteria, for example,
// that a UserName or a Password meets the required conditions.
//
public static bool IsCriteriaAcceptable(
    string value,
    int minCaps,
    int minSpecialChars,
    int minNumbers,
    int minLength,
    int maxLength
) {

    // First check and ensure the minimum length is met
    //
    if (!IsMinimumLength(ref value, minLength)) {
        return false;
    }

    // Make sure the text isn't more than the maximum length specified
    //
    if (value.Length > maxLength) {
        return false;
    }

    // Check to see if the minimum number of caps is supplied
    //
    if (!Regex.IsMatch(value, @"([A-Z].*){" + minCaps.ToString() + @"}")) {
        return false;
    }

    // Check to see if the minimum number of digits is supplied
    //
    if (!Regex.IsMatch(value, @"([0-9].*){" + minNumbers.ToString() + @"}")) {
        return false;
    }

    // Check to see if the minimum number of special characters is supplied
    //
    if (!Regex.IsMatch(value, @"([_~!@#\$%^&\-_+=;:\.,].*){" + minSpecialChars.ToString() + @"}")) {
        return false;
    }

    return true;
}



Thanks,
_Shawn
Posted on 2003-03-29 01:34:14 by _Shawn
...([A-Y].*)...


Why A-Y?

The .* might be dangerous. If the RegExp is in "greedy" mode (which I think is the default), then .* will match as many consecutive characters as possible. So matching ([A-Z].*) on a string like "HELLO" will match the whole string and the quantifier condition {n} will fail if n > 1.

Just a heads up. I don't know anything about the RegExp lib you're using so it might not apply.
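For anyone who wants to see what the .NET engine does here, the sketch below prints what group 1 captured on each repetition. In .NET, quantifiers such as .* are greedy unless followed by "?", but the engine backtracks, so a greedy .* gives characters back when the next repetition needs them. The helper name Captures is made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class GreedyDemo
{
    // Hypothetical helper: returns the text captured by group 1 on each
    // repetition, joined with '|', or "(no match)".
    public static string Captures(string input, string pattern)
    {
        Match m = Regex.Match(input, pattern);
        if (!m.Success) return "(no match)";
        CaptureCollection caps = m.Groups[1].Captures;
        string[] parts = new string[caps.Count];
        for (int i = 0; i < parts.Length; i++)
            parts[i] = caps[i].Value;
        return string.Join("|", parts);
    }

    static void Main()
    {
        Console.WriteLine(Captures("HELLO", @"([A-Z].*){2}"));  // HELL|O  (greedy, after backtracking)
        Console.WriteLine(Captures("HELLO", @"([A-Z].*?){2}")); // H|E     (lazy)
        Console.WriteLine(Captures("Hello", @"([A-Z].*){2}"));  // (no match)
    }
}
```

Both forms succeed on "HELLO"; greedy versus lazy changes which text is captured, not whether two capitals are found.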
Posted on 2003-03-29 09:46:30 by iblis
Iblis,

That was a typo, I meant ([A-Z].*). Anyway, I've tested it very well and it's not in greedy mode by default... whatever the case, the true syntax (if you take away the string concatenations) is

([A-Z].*){2} which means (at least 2 repetitions -- or two matches -- anywhere in the search string). Actually, it stops counting after two since the criteria has matched.


If you have the .NET Framework installed, let me know and I'll send you the final project. It works well.

Thanks,
_Shawn
Posted on 2003-03-29 14:28:12 by _Shawn
I don't have .NET. It frightens me. ;)
Posted on 2003-03-29 15:03:02 by iblis

Iblis,
([A-Z].*){2} which means (at least 2 repetitions -- or two matches -- anywhere in the search string). Actually, it stops counting after two since the criteria has matched.



.... doesn't that mean a capital followed by any amount of any other characters, twice ?
Posted on 2003-03-29 16:03:51 by abc123
the Regex:
([A-Z].*){2}
will match exactly 2 times, which means more than 2 repetitions will yield false
to match 2 or more repetitions:
([A-Z].*){2,}
Posted on 2003-03-29 16:22:09 by hosam_shahin
abc123,

Whether or not characters appear after a CAPITAL isn't the matter. Thus

"anAB" will work
"Andb39B" will work
"AA" will work
"iiiiiiiiiiiiiXiiiiiiiiiiiXiiiiiiiiiiiii" will work
"xxxxxxxxxxYxxxxxxxxxYxxxxxxxxxY" will also work
"Xiiiiiiiii" will not work (because there is only one CAP)


Etc.

The ".*" really means 0 or more other characters can appear after the CAP, and because of the group (and the lack of "^" before the group) the CAP can appear anywhere, but another CAP must follow, either immediately or later in the string, for the match to succeed. But because of the {2} it stops caring after the second match (if any).
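The examples above can be checked with a small C# program. This is a sketch assuming the pattern is ([A-Z].*){2} as described; the method names are invented for the illustration, and the second method adds a leading "^" only to show what the anchor changes.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // Unanchored: the first capital may appear anywhere in the string.
    public static bool HasTwoCaps(string s)
    {
        return Regex.IsMatch(s, @"([A-Z].*){2}");
    }

    // Anchored with ^: the string must now START with a capital.
    public static bool StartsWithCapAndHasTwo(string s)
    {
        return Regex.IsMatch(s, @"^([A-Z].*){2}");
    }

    static void Main()
    {
        string[] samples = {
            "anAB", "Andb39B", "AA",
            "xxxxxxxxxxYxxxxxxxxxYxxxxxxxxxY", "Xiiiiiiiii"
        };
        foreach (string s in samples)
            Console.WriteLine(s + " -> " + HasTwoCaps(s));

        Console.WriteLine(StartsWithCapAndHasTwo("anAB"));    // False (starts with lowercase)
        Console.WriteLine(StartsWithCapAndHasTwo("Andb39B")); // True
    }
}
```

With the unanchored pattern, every sample except "Xiiiiiiiii" is accepted, including the one with three capitals.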


Thanks,
_Shawn
Posted on 2003-03-29 16:22:25 by _Shawn
Hosam,

I get the correct results with {2} for both VB.NET and C#, even if I have more than 2 CAPS and even if they are not adjacent to each other. I'm just specifying that there are two; I don't care if there are more. Of course, {2,} will also work. Actually, I know that the syntax rules would call for {2,} but for some reason, the .NET RegEx parser accepts both. If I place a "$" at the end or a "^" at the beginning, I must use {2,} or it won't match.


Thanks,
_Shawn
Posted on 2003-03-29 16:23:32 by _Shawn

abc123,
Whether or not characters appear after a CAPITAL isn't the matter. Thus


i believe the only reason "AbbbCdddEddd" is matching is because it's greedy; the {2}, like hosam_shahin said, will match exactly two

you should make it non-greedy:
([A-Z].*?){2}

then it will work appropriately, and if you want to match two or more:
([A-Z].*?){2,}
Posted on 2003-03-29 16:30:58 by abc123
Yes, it works fine, I just tried it.
Posted on 2003-03-29 16:34:23 by hosam_shahin
Hosam,

I should probably use {2,} anyway in case it's a defect in the parser and gets corrected in the future.


Thanks,
_Shawn
Posted on 2003-03-29 16:40:22 by _Shawn
You'll notice though that _Shawn said it's in ungreedy mode so the ? is unnecessary.
Posted on 2003-03-29 16:43:30 by iblis
Attached is a snapshot of my little test program. There are about 25 different validations and variations... for example, the Number validator actually has 5 different RegExes to validate different types of numeric formats...

The library is actually designed to test the Form values and QueryString values of an ASP.NET page, rather than just data in general.

The QS: field validator is actually called IsSafeQS and will return false if there is a script or any JavaScript in the field (to help prevent cross-site scripting attacks). It matches certain characters, such as <script> </script> { } ' ;

I am also working on a SQL injection attack validator that will detect if SQL is embedded in a QS value or form value.


Thanks,
_Shawn
Posted on 2003-03-29 16:43:53 by _Shawn

You'll notice though that _Shawn said it's in ungreedy mode so the ? is unnecessary.


I don't think it can be if it matches "AxxxBxxxCxxx" as he suggested...
Posted on 2003-03-29 16:50:24 by abc123
([A-Z].*) = find any capital letter followed by 0 or more arbitrary characters and store result in $1 (ungreedy mode)
{2,} = find two (or more) consecutive instances of previous group.

It will first match:

AxxBxxCxx

$0 = AxxB
$1 = B

....


2 instances matched.


Just because it is ungreedy doesn't mean it will always be ungreedy.
Posted on 2003-03-29 17:08:19 by iblis