
Regular expression to limit types of chars appearing in positions along a string


j.smith1981


I have been dabbling with regular expressions again with PHP.

 

I am working through the examples on:

http://www.regular-expressions.info/charclass.html

 

They're good tutorials, but I am finding it hard to move beyond them.

 

I am having problems implementing an idea of mine and wanted your opinions on how to get around a little problem. What I want is for the user to be able to register a username that must start with a-z or 0-9 but can include . and _ after the first character.

 

So usernames like '09_myusername' would be valid but '_myusername' or '.myusername' would be invalid, does that make any sense?

 

How would I go about doing this, does anyone have any ideas they could walk me through?

 

Massive thanks to anyone that can help me in advance,

Jeremy.


Sorry, just to recap with one I have created myself to show range classes in regular expressions; I wanted to confirm I am thinking about this correctly.

 

I am doing this one here:

<?php
preg_match('/[a-z]/', 'this could mean anything between a-z', $matches);

print_r($matches);

 

It of course only comes up with a lowercase t, so this would be true if it was put into an if condition. But it doesn't need to check past the first occurrence of a letter in the string, right? If it must, then I would have to specify that in the pattern, right?

 

If the string was to contain anything from 0 to 9, then it would come back as not allowed, right?

 

Just wanted to get my head around this a bit more, just going off the tips from that set of tutorials in my above post.
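
One detail worth checking by hand here: an unanchored character class only needs to find one matching character anywhere in the string, so digits elsewhere do not make the match fail by themselves. The following is a minimal sketch of that behaviour; the variable names and the anchored pattern '/^[a-z]+$/' are illustrative assumptions, not something from the tutorial.

```php
<?php
// Unanchored: succeeds as soon as any lowercase letter is found.
$found = preg_match('/[a-z]/', 'this could mean anything between a-z', $matches);
// $matches[0] holds the first character that matched.

// Digits elsewhere in the string do not stop the match;
// the engine simply moves on until it reaches a letter.
$withDigits = preg_match('/[a-z]/', '123abc', $digitMatches);

// To reject digits, the whole string has to be covered by the pattern:
// ^ and $ anchor it, + requires one or more letters in between.
$onlyLetters = preg_match('/^[a-z]+$/', '123abc');

var_dump($found, $matches[0], $withDigits, $digitMatches[0], $onlyLetters);
```

So preg_match() stops at the first success, and rejecting unwanted characters needs anchors plus a quantifier.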


That's correct.

 

You can accomplish what you're after with:

 

/^[a-z0-9][a-z0-9_-]*$/

 

The first character class limits a single character to a-z and 0-9; the second is almost the same, but adds the extra two characters. Be aware that the hyphen must be the first or last character in the class unless escaped with a backslash ("\-"), otherwise it is read as a range. The asterisk on the end means that class must be matched '0 or more times'. The leading "^" and ending "$" mean the start and end of the string.

 

If you want the second character to be required, then you could change the asterisk to a plus, which means '1 or more times'. Or you could specify a minimum and maximum on the end using "{min,max}" (no space after the comma):

 

/^[a-z0-9][a-z0-9_-]{2,9}$/

 

That would require a username between 3 and 10 characters, starting with a-z / 0-9.
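
As a rough sketch of how the pattern above might be used, the check can be wrapped in a small function. The function name isValidUsername and the test usernames are made up for illustration. The second pattern is also an assumption: the original question asked for . and _ rather than _ and -, and a dot inside a character class is a literal dot, so that variant is shown alongside.

```php
<?php
// Hypothetical helper (not from the thread): 3 to 10 characters,
// first one a-z or 0-9, the rest a-z, 0-9, underscore or hyphen.
function isValidUsername(string $name): bool
{
    return preg_match('/^[a-z0-9][a-z0-9_-]{2,9}$/', $name) === 1;
}

foreach (['09_myuser', '_myuser', '.myuser', 'ab'] as $candidate) {
    echo $candidate, ': ', isValidUsername($candidate) ? 'valid' : 'invalid', PHP_EOL;
}

// Assumed variant closer to the original question: allow . and _
// after the first character. Inside [...] the dot needs no escaping.
$dotPattern = '/^[a-z0-9][a-z0-9._]*$/';
$dotAllowed = preg_match($dotPattern, 'my.user_name');
$dotFirst = preg_match($dotPattern, '.myusername');
```

Note that in PCRE a plain "$" also matches just before a trailing newline, so for stricter validation the D modifier could be added after the closing delimiter.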


MrAdam beat me to helping you with your problem, but if you're struggling I have a couple of things you may want to look at. The two resources that have helped me out immensely with regular expressions are the book Mastering Regular Expressions and an application called RegExBuddy. The name is a bit ridiculous, but the free trial and $40 price tag are well worth it.


Oh thank you so much for your help.

 

Looking at your example and reading the explanation has helped a lot, but as usual I will have to type it out to really understand it. I tend to think of a way of changing an example to make it do something I can anticipate, even if I won't necessarily use it, just to explain it further to myself.

 

I mean that (not being funny) it really massively helps me understand something. One of the things I have just done doesn't make any sense in spoken-language terms, but it has helped me out a bit with character classes, like so:

 

<?php
preg_match('/gr[ael]y/', 'The shirt was grey');
preg_match('/gr[ael]y/', 'The shirt was gray');
preg_match('/gr[ael]y/', 'The shirt was grly');

 

This helped me make sense of a very simple class; adapting theory to real situations is what I am finding really hard.
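
To see what those three calls return, the results can be printed. This is only a sketch built on the same pattern; the fourth string ('groy') is an added assumption to show a case that does not match, since o is not in the class [ael].

```php
<?php
// The class [ael] accepts exactly one character: a, e or l.
$subjects = [
    'The shirt was grey',
    'The shirt was gray',
    'The shirt was grly',
    'The shirt was groy', // added here for contrast: o is not in [ael]
];

$results = [];
foreach ($subjects as $subject) {
    // preg_match() returns 1 on a match and 0 when nothing matches.
    $results[$subject] = preg_match('/gr[ael]y/', $subject);
    echo $subject, ' => ', $results[$subject], PHP_EOL;
}
```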

 

As for the links in the 2nd response, I really appreciate it when someone goes out of their way to help me. I am going for as much info as I can possibly get on regexes; I was almost giving up on them about a month ago, and the stuff I didn't really understand is now making sense.

 

Another example I wanted to try out of my own interest was matching a literal URL, like so:

 

<?php

preg_match('/http:\/\/mysite\.co\.uk/', 'http://mysite.co.uk');

 

The last one taught me that each forward slash is a meta char yes?

 

So I escape each one. One of the first mistakes I made was forgetting that the full stop is a metachar too, so I escape that as well so it becomes a literal (is that the correct term for it?).

 

I will look at that link and have a look at the book's contents on the web; it really won't be too costly if I go on Amazon Marketplace (where I tend to buy all my books from).

 

Thanks ever so much you 2, much appreciated!

 

Jeremy.


/ isn't a metachar. It doesn't have special significance for the PHP PCRE (preg_xxx) functions, except that you are using it as your pattern delimiter. That's why it must be escaped. If you were to change your pattern delimiter to something else, like ~, then you wouldn't have to escape it.
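
A short sketch of the difference, reusing the URL from the earlier post (the variable names and the 'mysiteXcoXuk' string are illustrative assumptions). It also shows why the dots still need escaping whichever delimiter is used: an unescaped dot matches any character.

```php
<?php
$url = 'http://mysite.co.uk';

// With / as the delimiter, every / inside the pattern must be escaped.
$withSlashDelimiter = preg_match('/http:\/\/mysite\.co\.uk/', $url);

// With ~ as the delimiter, the slashes can be written as-is.
$withTildeDelimiter = preg_match('~http://mysite\.co\.uk~', $url);

// The dot is a real metachar: unescaped, it matches any character,
// so this pattern also accepts a string that is not the intended URL.
$unescapedDot = preg_match('~http://mysite.co.uk~', 'http://mysiteXcoXuk');
$escapedDot = preg_match('~http://mysite\.co\.uk~', 'http://mysiteXcoXuk');

var_dump($withSlashDelimiter, $withTildeDelimiter, $unescapedDot, $escapedDot);
```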


I have ordered that book and am reading some of it on a site I am subscribed to; I thought I would read through it to get a head start, and it seemed a good first book to purchase.

 

Just a question about that answer I wanted to go over in my own words to make sure I understand what this means:

 

/^[a-z0-9][a-z0-9_-]*$/

 

This means start with the caret metachar yes?

 

This then means the first char must be either a-z or 0-9, but one thing I don't mind having is upper or lowercase chars in whichever position the user wants, so MyUsername or myUsername123 would work.

 

This would then be something like:

 

/^[a-z0-9]i[a-z0-9_-]*$/

 

Would this work, where i is the case-insensitive metachar? On another note, this would mean the first char could be either A-Z or a-z, or 0-9. Looking at the pattern again, I have also noticed why - worked when I tried it, which again is valid, so I can see exactly what I did before, which is good.

 

How would I allow for case insensitivity in the rest of the regular expression without just limiting it to the first char, would I just put this in at the end possibly?

 

I don't need to go any further, but I understand it and will write what it does in my own code's comments for my own reference.


The case-insensitive flag is a 'modifier'. All modifiers are applied globally to an expression, and as such are placed after the right-side delimiter, to separate them from the rest of the expression:

 

/^[a-z0-9][a-z0-9_-]*$/i
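
A quick way to check this, sketched with made-up test strings and variable names: the trailing i makes both classes accept upper and lower case, whereas an i placed between the two classes is just a literal letter i that must appear as the second character.

```php
<?php
$caseInsensitive = '/^[a-z0-9][a-z0-9_-]*$/i';

// With the i modifier, mixed-case usernames are accepted.
$mixedCase = preg_match($caseInsensitive, 'MyUsername');
$mixedWithDigits = preg_match($caseInsensitive, 'myUsername123');

// Without the modifier, the uppercase M is rejected.
$withoutModifier = preg_match('/^[a-z0-9][a-z0-9_-]*$/', 'MyUsername');

// The attempt from the question: here i is an ordinary character,
// so the second character of the username has to be a literal "i".
$literalI = '/^[a-z0-9]i[a-z0-9_-]*$/';
$secondCharNotI = preg_match($literalI, 'myusername');
$secondCharIsI = preg_match($literalI, 'mike');

var_dump($mixedCase, $mixedWithDigits, $withoutModifier, $secondCharNotI, $secondCharIsI);
```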

 

 

"This means start with the caret metachar yes?"

 

The caret character means the start of the string. If used, there can be nothing in the string before the match for the expression. If you also put a dollar at the end of the expression, then there can be nothing in the string after the match either. If you didn't use the caret, "_myUsername" would still be considered a match. That's because the expression would only be matching the "myUsername" part, skipping the leading underscore.
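
The effect of each anchor can be seen by capturing what was actually matched. This is a minimal sketch; the variable names and the 'my username' test string are assumptions added for illustration.

```php
<?php
// No caret: the engine skips the underscore and matches from "m" onwards.
$noCaret = preg_match('/[a-z0-9][a-z0-9_-]*$/i', '_myUsername', $noCaretMatch);

// With the caret, the match must begin at the very first character,
// and "_" is not in the first class, so the whole match fails.
$withCaret = preg_match('/^[a-z0-9][a-z0-9_-]*$/i', '_myUsername');

// No dollar: the match may stop early, leaving " username" unchecked.
$noDollar = preg_match('/^[a-z0-9][a-z0-9_-]*/i', 'my username', $noDollarMatch);

// Both anchors: the space is not allowed, so nothing matches.
$bothAnchors = preg_match('/^[a-z0-9][a-z0-9_-]*$/i', 'my username');

var_dump($noCaretMatch[0], $withCaret, $noDollarMatch[0], $bothAnchors);
```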


I think I can declare this thread closed now. I have just been going through the first set of examples in that book, Mastering Regular Expressions, and have to say I have probably learnt more today than I have in the 2 months I have been studying regular expressions.

 

I am actually stepping out and doing it entirely on my Linux server, mimicking the examples in the book by creating the searchable files and then seeing how they compare.

 

It all makes sense now. One thing is the ^ and $: seeing those in a different explanation really helped a great deal, so thank you!

 

I have a few ideas now of what I want to experiment with, and that book's a very good read. I will continue on, and if I have any trouble I will make another post highlighting the problem and what I have done.

 

Thank you so very much!

Jeremy.
