1

Sunday, October 9th 2005, 12:38am

Reading data from an asp file with DCOP?

Hi all,
recently I had to download a lot of data from the Internet, served by an asp file.
The problem is that, to download the data, I had to click a "Page Down" button 220 times and cut & paste the relevant data each time.
The data are scattered over 220 pages, reachable only from inside the browser (I use Konqueror, but I think the same happens with Mozilla or Firefox and perhaps IE).
I posted this problem in another forum, but I am posting here now because I think the only solution is to use DCOP, and this is the right place for it.
Active Server Pages (asp) is a proprietary server-side scripting technology from MS, running on a web server (generally IIS, Internet Information Services). The browser only triggers the script, which runs on the server: every time I click the "Page Down" button, the script prepares a new html page (after querying its internal db for the newly requested data).
I have no access to the remote server. I hoped to solve the problem with wget, which offers a kind of browser simulation (for the downloading side only), but wget does not know how to deal with asp files; it merely downloads the 1st page (of 220) as an html file.
I already have the data (after almost 2 hours of the "by hand" approach), but I would prefer to spend my time better (for example, programming a script that does the job for me...).
The approach I am trying is to use DCOP (Desktop COmmunication Protocol). In fact I use a KDE browser (Konqueror), partly for this reason, and it should be enough for a script to send simulated keys (Tab, Enter, Alt + other keys) to the application to shift the focus to the Page Down button and then click it by sending '\n'. After that the script must open Save As with Alt+'a' and finally save the html page sent by the server, using incremented filenames like 001.html, 002.html...220.html.
This is the most difficult part, because there is not much documentation on DCOP and there are very few practical examples of it. I will have to study and run some tests, using the DCOP-Python bindings (I don't know C++).

Can you help me?
Bye.

anda_skoa

Professional

Posts: 1,273

Location: Graz, Austria

Occupation: Software Developer

2

Sunday, October 9th 2005, 4:14pm

I think that doing this directly in a program would work better, i.e. using a Python HTTP module and requesting the initial page and then sending the request the button would send.
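
To illustrate the idea, here is a minimal sketch of the outer loop, untested against the site in question. It asks for each page in turn and saves it under an incremented file name such as 001.html, as described in the first post. The `fetch_page` callable is a hypothetical stand-in for whatever request the "Page Down" button really sends; the name `save_pages` is also made up for this example.

```python
from pathlib import Path
from typing import Callable


def save_pages(fetch_page: Callable[[int], bytes], count: int, out_dir: str) -> list[Path]:
    """Fetch pages 1..count and save them as 001.html, 002.html, ..."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for number in range(1, count + 1):
        # fetch_page is an assumption: it must return the raw HTML of page `number`.
        body = fetch_page(number)
        path = target / f"{number:03d}.html"
        path.write_bytes(body)
        written.append(path)
    return written
```

With a real `fetch_page` in place, `save_pages(fetch_page, 220, "pages")` would cover the 220 pages mentioned above.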

Cheers,
_
Qt/KDE Developer
Debian User

3

Monday, October 10th 2005, 5:12pm

Thank you for your reply, but your suggestion is not applicable (IMHO!).
The fact is that I don't see any change in the displayed URL string when I click the button.
So I don't know what request to send. It would take a cracker to get such data...
Apart from that, I don't want to write a specific program for each case.
For example, the approach I suggested could be used to run automatic searches on web search engines. By emulating the user you can avoid the security traps set for search spiders (blocking you if you are not querying from a browser, or if you are making too many queries, which indicates a program rather than a human being, etc...).
For Python there are 2 modules for DCOP (dcop and pcop). The latter has some interesting functions:
- app_list(): the DCOP registered applications (like dcop from the shell)
- obj_list(): the list of objects for a DCOP application
- method_list(): the list of methods for a DCOP object.
With these functions I want to write a script that lists the DCOP trees. And you don't need to run an application for it to appear in the lists, because you can register it with the function register_as().
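
A tree listing of this kind could be sketched as follows. Note the swap: instead of the pcop functions named above, whose exact signatures are not confirmed here, this sketch calls the `dcop` shell client, which prints applications, objects and functions one per line. The helper names `run_dcop` and `dcop_tree` are made up for the example, and stripping a trailing " (default)" marker from object names is an assumption about the client's output format.

```python
import subprocess


def run_dcop(*args):
    """Run the `dcop` shell client and return its non-empty output lines."""
    result = subprocess.run(["dcop", *args], capture_output=True, text=True, check=True)
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def dcop_tree(run=run_dcop):
    """Return {application: {object: [function signatures]}}."""
    tree = {}
    for app in run():
        # Assumption: the default object may be printed as "name (default)".
        objects = [obj.removesuffix(" (default)") for obj in run(app)]
        tree[app] = {obj: run(app, obj) for obj in objects}
    return tree
```

The `run` parameter exists only so that the walk can be tried out with a fake runner on a machine without a DCOP server.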
The problem I am encountering is the lack of documentation and examples.
Bye.

anda_skoa

4

Monday, October 10th 2005, 11:18pm

Quoted

Originally posted by Bernardo Simonini
Thank you for your reply, but your suggestion is not applicable (IMHO!).
The fact is that I don't see any change in the displayed URL string when I click the button.
So I don't know what request to send. It would take a cracker to get such data...


A button usually triggers an HTTP GET or HTTP POST request. The browser usually sends it, but of course any other program that understands HTTP can send it too.

Scripting languages like Perl, Python and Ruby usually have HTTP modules for this.
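
As a hedged illustration with Python's standard library: if the button belongs to a form that uses POST, the submitted values travel in the request body rather than in the URL, which would explain why the address bar does not change. The sketch below reads the first form of a page and builds (without sending) the request a click on one of its submit buttons would cause. The names `FormFields` and `button_request` are invented for this example, and checkboxes, radio buttons and select boxes are deliberately ignored.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class FormFields(HTMLParser):
    """Collect action, method and named inputs of the first <form>."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "GET"
        self.fields = {}    # values sent with every submit
        self.buttons = {}   # submit buttons; only the clicked one is sent
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attrs.get("action") or ""
            self.method = (attrs.get("method") or "GET").upper()
        elif tag == "input" and self._in_form and attrs.get("name"):
            value = attrs.get("value") or ""
            if (attrs.get("type") or "text").lower() == "submit":
                self.buttons[attrs["name"]] = value
            else:
                self.fields[attrs["name"]] = value

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def button_request(page_url, html, button):
    """Build, but do not send, the request a click on `button` would cause."""
    form = FormFields()
    form.feed(html)
    data = dict(form.fields)
    data[button] = form.buttons[button]
    target = urljoin(page_url, form.action)  # an empty action means "same URL"
    query = urlencode(data)
    if form.method == "POST":
        return Request(target, data=query.encode("ascii"), method="POST")
    # A GET form replaces any query string already present in the action.
    return Request(f"{target.split('?', 1)[0]}?{query}")
```

The returned object can be passed to `urllib.request.urlopen()` to fetch the next page. A real ASP page may also depend on cookies or hidden session fields, so this is a starting point, not a guaranteed solution.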

Quoted


For Python there are 2 modules for DCOP (dcop and pcop). The latter has some interesting functions:
- app_list(): the DCOP registered applications (like dcop from the shell)
- obj_list(): the list of objects for a DCOP application
- method_list(): the list of methods for a DCOP object.
With these functions I want to write a script that lists the DCOP trees. And you don't need to run an application for it to appear in the lists, because you can register it with the function register_as().
The problem I am encountering is the lack of documentation and examples.


I am not sure how the Python Qt community is organized, but they definitely have some kind of medium that is better suited to this than this KDE user forum :)

Very likely it is easier to write a PyKDE program that embeds a KHTML part and injects some JavaScript that activates the button.

Cheers,

5

Tuesday, October 11th 2005, 12:35pm

Quoted

A button usually triggers an HTTP GET or HTTP POST request. The browser usually sends it, but of course any other program that understands HTTP can send it too.

In theory I know that, but I am not an expert like you, and I am not able to handle the "submit" type of button I've seen in the html file received from the server.
Besides, I don't want to learn it, because it is specific to html handling or, more generally, to web browsers.
Your other suggestion goes along the same line of thought:

Quoted

Very likely it is easier to write a PyKDE program that embeds a KHTML part and injects some JavaScript that activates the button.


When I worked under Windows, there were plenty of "macro" languages. I used AutoIt, but there are also Macro Express, Macro Scheduler or even VB macros under Office.
This approach is "general", and I hoped to find it in DCOP, but if you say that it is impossible to send keystrokes to a KDE application, I must believe you, as you are one of the developers of KDE...
Olaf M. Zanger (a colleague of yours?) even speaks of "macro recording"...
Bye.

anda_skoa

6

Tuesday, October 11th 2005, 5:45pm

Quoted

Originally posted by Bernardo Simonini
In theory I know that, but I am not an expert like you, and I am not able to handle the "submit" type of button I've seen in the html file received from the server.

I am no expert on that topic (HTTP) either, I just know that some people do it that way :)

Quoted


This approach is "general", and I hoped to find it in DCOP, but if you say that it is impossible to send keystrokes to a KDE application, I must believe you, as you are one of the developers of KDE...

DCOP is for communicating with applications and using whatever interface they make available through DCOP.

"Clicking" arbitrary buttons is more a matter of simulating a user; this might be possible through other means.

That said, it might be possible to trigger a submit through Konqueror's DCOP interface; it depends on what functionality it exports that way.

Additionally, as I said in my last posting, embedding a KHTML part in your own application (the language doesn't matter as long as there are bindings for it) gives you even more control, as you can directly access the DOM tree and, AFAIK, even inject JavaScript.
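
For the JavaScript route, the injected script itself can be quite small. The sketch below only builds the JavaScript text that clicks the first element with a given name; how that text is handed to KHTML (through an embedded part or through a DCOP call) depends on the bindings and is left out, because no specific API is confirmed in this thread. The function name `click_button_js` and the button name used in the usage line are hypothetical.

```python
import json


def click_button_js(button_name):
    """Return JavaScript that clicks the first element with this name."""
    # json.dumps yields a quoted, escaped string literal that JavaScript accepts.
    quoted = json.dumps(button_name)
    return (
        "(function () {"
        f" var hits = document.getElementsByName({quoted});"
        " if (hits.length === 0) { return false; }"
        " hits[0].click();"
        " return true;"
        " })();"
    )
```

For example, `click_button_js("next")` would give a script for a submit button declared with `name="next"`; the real name has to be read from the page source.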

Cheers,