It suddenly dawned on me that there's one regular expression I use almost every day:
(?<=TOKEN1).*?(?=TOKEN2)
It pulls out all the text that sits between two other pieces of text, TOKEN1 and TOKEN2, without including the tokens themselves. Very useful for parsing data from websites.
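To make that concrete, here is a minimal sketch in Python (not Macro Scheduler; the pattern itself is unchanged). The helper name `between`, the sample HTML and the two tokens are all made up for illustration. It assumes the tokens are literal strings, which is why they are escaped before being dropped into the pattern.

```python
import re

def between(text, token1, token2):
    """Return each shortest run of text found between token1 and token2."""
    # Build (?<=TOKEN1).*?(?=TOKEN2), escaping the tokens so they match literally
    pattern = "(?<=" + re.escape(token1) + ").*?(?=" + re.escape(token2) + ")"
    # DOTALL lets '.' cross line breaks, which values in web pages often do
    return re.findall(pattern, text, flags=re.DOTALL)

# Hypothetical page fragment, for illustration only
html = '<span class="price">12.50</span> and <span class="price">8.99</span>'
prices = between(html, '<span class="price">', "</span>")
print(prices)  # ['12.50', '8.99']
```

The lazy `.*?` matters here: with a greedy `.*` the first match would run on to the last closing token instead of stopping at the nearest one.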
I should blog this with a website example.
What's your favourite regex?
My favourite RegEx
- Marcus Tettmar
- Site Admin
- Posts: 7395
- Joined: Thu Sep 19, 2002 3:00 pm
- Location: Dorset, UK
Marcus Tettmar
http://mjtnet.com/blog/ | http://twitter.com/marcustettmar
Did you know we are now offering affordable monthly subscriptions for Macro Scheduler Standard?
- Dorian (MJT support)
- Automation Wizard
- Posts: 1380
- Joined: Sun Nov 03, 2002 3:19 am
I'd never used regex until I saw this post, and it prompted me to do a little research.
It's very powerful. I used it to strip out all the URLs from a text file. Here it is, in case it helps anyone.
// Sample text containing bare www addresses
Let>text=rfuhroiurnfifnroi www.fish.com fuh3ifuh34ifurf www.chips.com
// The [Hyperlink] pattern seems to look for http://, not www., so add the http:// first
StringReplace>text,www,http://www,text
// Find the URLs; the matches are returned in matches_1 to matches_%num%
RegEx>[Hyperlink],text,1,matches,num,0
// Write each match to a file, one per line
Let>k=0
Repeat>k
Let>k=k+1
WriteLn>%USERDOCUMENTS_DIR%\url output.txt,result,matches_%k%
Until>k,num
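For comparison, the same idea can be sketched in Python with an ordinary regular expression in place of the [Hyperlink] pattern. The pattern below is an assumption made for this example, not what [Hyperlink] does internally: it accepts either a scheme or a leading www. and then runs to the next whitespace, so no http:// has to be added beforehand. The function name `extract_urls` is hypothetical.

```python
import re

# Assumed, simplified URL pattern: "http://", "https://" or a leading "www.",
# followed by everything up to the next whitespace. Not a full URL grammar.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")

def extract_urls(text):
    """Return every URL-like run found in text, in order of appearance."""
    return URL_PATTERN.findall(text)

sample = "rfuhroiurnfifnroi www.fish.com fuh3ifuh34ifurf http://www.chips.com"
urls = extract_urls(sample)
print(urls)  # ['www.fish.com', 'http://www.chips.com']
```

Trailing punctuation such as a full stop directly after a URL would be picked up as part of the match, so real text may need a little trimming afterwards.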
Marcus,
I use something similar to yours although I never quite understood the meaning of the zero-width positive lookbehind (?<=regex) or the zero-width positive lookahead (?=regex) modifiers.
What issues do you avoid by using those modifiers in this regex?
Jim
PS - There is a good regex reference here:
http://www.regular-expressions.info/refadv.html
- Marcus Tettmar
- Site Admin
- Posts: 7395
- Joined: Thu Sep 19, 2002 3:00 pm
- Location: Dorset, UK
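They exclude the tokens from the match, so that we get only the value between them. Both are zero-width: the lookbehind checks that TOKEN1 comes immediately before the match and the lookahead checks that TOKEN2 comes immediately after it, but neither becomes part of the matched text. Without them the match would include the tokens, and you would have to strip them off afterwards.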
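A quick way to see the difference is to run the same search three ways. This is a sketch in Python rather than Macro Scheduler, and the sample text and the `name=[` / `]` tokens are invented for the example.

```python
import re

text = "name=[Alice]; name=[Bob]"

# Plain tokens: each match includes the tokens themselves
with_tokens = re.findall(r"name=\[.*?\]", text)

# Lookbehind and lookahead: the tokens must be there but are not in the match
only_values = re.findall(r"(?<=name=\[).*?(?=\])", text)

# Alternative: keep the tokens in the pattern and capture just the middle
captured = re.findall(r"name=\[(.*?)\]", text)

print(with_tokens)  # ['name=[Alice]', 'name=[Bob]']
print(only_values)  # ['Alice', 'Bob']
print(captured)     # ['Alice', 'Bob']
```

The capture group gives the same values, but only where the tool hands back group 1 rather than the whole match. The lookaround version returns the value as the match itself, which is convenient when all you get back is an array of whole matches.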