Webrecorder Newbie!

Hints, tips and tricks for newbies

Moderators: Dorian (MJT support), JRL

Office Smith
Newbie
Posts: 15
Joined: Fri Sep 21, 2012 5:17 pm
Location: Phoenix, AZ

Webrecorder Newbie!

Post by Office Smith » Sat Sep 22, 2012 1:12 am

I've been trying to do a little web scraping to build a list of small businesses from the local small business association's website.

I've been unable to figure out why this code is not pulling down some text.

I'm using Windows 7 and IE 9. I've set up http://*.asba.com as a trusted site, confirmed that Protected Mode is off for the Trusted sites zone, and confirmed that scripting is enabled for that zone.

I've confirmed that the WebRecorder DLL is properly installed in the imports directory, and the script does open Internet Explorer. However, it always errors out when I attempt to extract the tag text. From the code listed below, I assume it should pull the inner text, but instead I get the following error:

Error In Line: 22 - Error Calling DLL: Possible wrong number or type of parameters? Message: The interface is unknown

I know I must be doing something wrong, but I can't figure it out. Any help would be appreciated.

-OS

Code:

//Set IGNORESPACES to 1 to force script interpreter to ignore spaces.
//If using IGNORESPACES quote strings in {" ... "}
//Let>IGNORESPACES=1
// Generated by MacroScript WebRecorder 3.02
// Recorded on Friday, September 21, 2012, at 12:53 PM
//This toggles off tabbed browsing and stores the old preferences.
RegistryReadKey>HKEY_CURRENT_USER,Software\Microsoft\Internet Explorer\TabbedBrowsing,Enabled,oldTB
RegistryWriteKey>HKEY_CURRENT_USER,Software\Microsoft\Internet Explorer\TabbedBrowsing,Enabled,0
LibLoad>%SCRIPT_DIR%\IEAuto.dll,hIE
// Generated by MacroScript WebRecorder 3.02
// Recorded on Friday, September 21, 2012, at 03:53 PM
//Move the mouse cursor out of harm's way to avoid causing mouseover events to interrupt
MouseMove>0,0
Let>delay=1
//Enable automatic downloads to the Downloads dir, modify path if required
IE_OnDownload>1,C:\Users\John\Downloads,ie_res
//Set timeout for ClickTag and FormFill to 30 seconds, increase if pages load more slowly
IE_SetTimeout>30,ie_res
IE_Create>0,IE[0]
IE_Navigate>%IE[0]%,www.asba.com/members/?id=11973861,ie_res
IE_WaitDocumentComplete>%IE[0]%,ie_res
let>TD23_SIZE=99999
//IE_GetTagLength>%IE[0]%,,TD,23,0,TD23_SIZE
IE_ExtractTag>%IE[0]%,,TD,23,0,TD23,ie_res
//Extracted text is in variable: TD23
messagemodal>TD23
//At end of script - set tabbed browsing state back (or not depending on prior setting)
RegistryWriteKey>HKEY_CURRENT_USER,Software\Microsoft\Internet Explorer\TabbedBrowsing,Enabled,oldTB
LibFree>hIE

Marcus Tettmar
Site Admin
Posts: 7395
Joined: Thu Sep 19, 2002 3:00 pm
Location: Dorset, UK

Post by Marcus Tettmar » Sat Sep 22, 2012 8:10 pm

When I run this I get the following in the message box:

Code:

Last updated: 8/22/2012 
Steve Abril, Guard Pro Protection Systems
 Member Company

Marcus Tettmar
http://mjtnet.com/blog/ | http://twitter.com/marcustettmar


Head scratcher

Post by Office Smith » Mon Sep 24, 2012 4:05 pm

Ok,

So then it appears that the code is functioning properly, but that I've got a rogue setting somewhere that is causing the problem.

I'll try checking that out this morning.

-OS

Internet Explorer Zone Issue!

Post by Office Smith » Mon Sep 24, 2012 4:28 pm

So, I did a little testing, and it appears that even though I've set up the website in the "trusted" zone, IE is still applying the Internet zone settings.

I determined this by turning Protected Mode off for the Internet zone, at which point all the code suddenly works for me.

However, it was always my understanding that the trusted zone security settings (where Protected Mode is off) should apply to sites assigned to that zone.

I tried http://www.asba.com instead of *.asba.com but no dice.

Does anyone understand Protected Mode and zones well enough to help me figure out how to use the WebRecorder functions without turning Protected Mode off for the Internet zone?

The dirt on Protected mode

Post by Office Smith » Mon Sep 24, 2012 6:23 pm

Based on the link below, and specifically the text quoted beneath it, my understanding is that if IE opens in Protected Mode and you then navigate to an unprotected site, IE actually reopens in a new process.

This would explain the error: IE opens in Protected Mode and then navigates to an unprotected site, which causes it to close the initial window and open a new window with the site selected. That would presumably leave the script holding a reference to the window that was closed.


http://msdn.microsoft.com/en-us/library ... #wpm_lnpmp

Launching and Navigating a Protected Mode Process

If your application uses CreateProcess to launch IE, it should call IELaunchURL on Windows Vista. This will ensure that your application gets the right return values and that IE launches in Protected mode for URLs whose zone has Protected Mode on. If you need to determine whether a specific URL will open in a low (Protected Mode) or a medium integrity IE process before launching IE, call IEIsProtectedModeURL. Note that a high integrity process with administrator privileges will launch a high integrity IE process with Protected Mode off. If you want to launch Protected Mode from your high integrity process, then first create a medium integrity process, which will launch your high integrity process and IE.

If your application launches Internet Explorer using CoCreateInstance and you need to continue controlling navigations after IE is launched, you can use IWebBrowser2 to navigate Internet Explorer programmatically. You can continue controlling navigations after IE is launched only if your application has the same integrity level as the IE process launched. After your application navigates to URL in a different integrity IE process, you can not perform additional navigations. You should make the IE frame visible after navigation.

The following example shows how you would do this in C++.

hr = CoCreateInstance(CLSID_InternetExplorer, NULL, CLSCTX_LOCAL_SERVER,
                      IID_IWebBrowser2, (LPVOID*)&pIWebBrowser2);
hr = pIWebBrowser2->Navigate(bstrUrl, &vEmpty, &vEmpty, &vEmpty, &vEmpty);
hr = pIWebBrowser2->put_Visible(VARIANT_TRUE);

The following example shows the JScript version.

var ie = new ActiveXObject("InternetExplorer.Application");
ie.Navigate("http://www.msn.com");
ie.visible = true;

Post by Marcus Tettmar » Tue Sep 25, 2012 8:32 am

All I know about Protected Mode is that it needs to be switched off altogether to allow websites to be scripted (i.e. for WebRecorder to work). Sometimes this security option also needs setting:

Allow scripting of Microsoft web browser control - Enable

Some people have this as Disabled and therefore WebRecorder can't interact with the page.
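
One way to compare the two zones before running a WebRecorder script is to read the per-user zone settings from the registry. The following is only a rough sketch using the RegistryReadKey and MessageModal commands already seen in this thread. It rests on several assumptions not stated above: that zone 2 is Trusted sites and zone 3 is Internet, that value 2500 holds the Protected Mode setting, that value 1206 holds the "Allow scripting of Microsoft web browser control" setting, and that for both values 0 means enabled and 3 means disabled. The variable names are made up for illustration. If a value is missing under HKEY_CURRENT_USER, the effective setting may be coming from machine-wide defaults or group policy instead.

```
//Sketch only: read per-user IE zone settings so the two zones can be compared.
//Assumptions: zone 2 = Trusted sites, zone 3 = Internet
//Assumptions: 2500 = Protected Mode, 1206 = Allow scripting of web browser control
//Assumptions: for both values 0 = enabled and 3 = disabled
RegistryReadKey>HKEY_CURRENT_USER,Software\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\3,2500,inetPM
RegistryReadKey>HKEY_CURRENT_USER,Software\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\2,2500,trustPM
RegistryReadKey>HKEY_CURRENT_USER,Software\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\3,1206,inetScript
RegistryReadKey>HKEY_CURRENT_USER,Software\Microsoft\Windows\CurrentVersion\Internet Settings\Zones\2,1206,trustScript
//Show the raw values for each zone
MessageModal>Internet zone: Protected Mode=%inetPM% Scripting=%inetScript%
MessageModal>Trusted zone: Protected Mode=%trustPM% Scripting=%trustScript%
```

If the Internet zone reports Protected Mode as enabled while the Trusted zone reports it as disabled, that would match the behaviour described earlier in the thread, where the script only worked once Protected Mode was turned off for the Internet zone.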