Roedy Green <see_website@mindprod.com.invalid> wrote in
news:b4uad7tkpp8lkjk07ko7512ae3es8slbsp@4ax.com:
<snip>
One trick I ran into was that you could not just go straight to the page.
You had to go to the home page and navigate your way there, picking up
cookies as you went.
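A minimal sketch of the cookie-carrying part, assuming a single site and
ignoring cookie attributes such as Domain, Path and Expires. The class name
SimpleCookieJar and its methods are hypothetical, not taken from any library:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: a deliberately naive cookie jar for one site.
final class SimpleCookieJar {
    private final Map<String, String> cookies = new LinkedHashMap<>();

    /** Record the name=value pair from each Set-Cookie header value. */
    void absorb(List<String> setCookieValues) {
        if (setCookieValues == null) {
            return; // the response set no cookies
        }
        for (String raw : setCookieValues) {
            // Attributes (Path, Expires, ...) follow the first ';' and are ignored.
            String pair = raw.split(";", 2)[0].trim();
            int eq = pair.indexOf('=');
            if (eq > 0) {
                // A later cookie with the same name replaces the earlier value.
                cookies.put(pair.substring(0, eq).trim(),
                            pair.substring(eq + 1).trim());
            }
        }
    }

    /** Build the value of the Cookie request header for the next request. */
    String header() {
        StringJoiner joined = new StringJoiner("; ");
        cookies.forEach((name, value) -> joined.add(name + "=" + value));
        return joined.toString();
    }
}
```

With HttpURLConnection this would presumably be used by calling
jar.absorb(conn.getHeaderFields().get("Set-Cookie")) after each response and
next.setRequestProperty("Cookie", jar.header()) before each request, starting
at the home page. java.net.CookieManager is the built-in alternative if you
want the attributes honoured as well.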
One web site that I screen-scrape forces me not only to pick up cookies along
the way, but also to pick up the values of variables that are buried in
JavaScript. Sometimes these variables are passed as a parameter in a POST,
and sometimes as part of a document window.
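For the JavaScript variables, a regular expression is often enough when the
value is a plain quoted string assigned in the page source. This is only a
sketch under that assumption. The names ScriptVars, extract and formField are
made up for illustration, and it will not cope with values that are computed
at run time:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for pulling quoted values out of inline scripts.
final class ScriptVars {

    /** Find the first `name = "value"` or `name = 'value'` in the page. */
    static Optional<String> extract(String html, String name) {
        // ([\"']) captures the opening quote, \\1 requires the same closing quote.
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(name) + "\\s*=\\s*([\"'])(.*?)\\1");
        Matcher m = p.matcher(html);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    /** Encode one name/value pair for an x-www-form-urlencoded POST body. */
    static String formField(String name, String value) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

So if the page contains var sessionKey = "abc 123"; then
ScriptVars.extract(page, "sessionKey") yields abc 123, and formField turns
that into a piece of the POST body. The variable name sessionKey is invented
here; the real names have to come from watching the browser traffic.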
I use WebScarab to view how a browser handles the web page, and then I figure
out how to emulate this behavior in Java. The web page keeps evolving; the
authors seem to be finding increasingly arcane ways to make this hard. I
don't know that the authors are intentionally making this hard. It could be
that they are using tools that automate the process and that are evolving as
well.
Sometimes there are multiple redirects, with each redirect providing another
critical piece of data (cookie, POST parameter, document window variable) in
the chain.
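One way to handle such a chain is to turn off automatic redirect following
and walk the hops yourself, so that every intermediate response can be
inspected. The sketch below shows only the loop. The transport is passed in
as a function, and RedirectChain and Hop are hypothetical names. A real
transport could be built on HttpURLConnection with
setInstanceFollowRedirects(false):

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: follow redirects by hand, keeping every response.
final class RedirectChain {

    /** One response: status code, Location header (or null), and body. */
    static final class Hop {
        final int status;
        final String location;
        final String body;

        Hop(int status, String location, String body) {
            this.status = status;
            this.location = location;
            this.body = body;
        }
    }

    /** Fetch start and each redirect target, up to maxResponses responses. */
    static List<Hop> follow(URI start, Function<URI, Hop> transport,
                            int maxResponses) {
        List<Hop> hops = new ArrayList<>();
        URI current = start;
        for (int i = 0; i < maxResponses; i++) {
            Hop hop = transport.apply(current);
            hops.add(hop); // keep it so cookies or variables can be harvested
            boolean redirect = hop.status >= 300 && hop.status < 400
                    && hop.location != null;
            if (!redirect) {
                return hops; // reached the final page
            }
            // Location may be relative, so resolve it against the current URL.
            current = current.resolve(hop.location);
        }
        throw new IllegalStateException(
                "still redirecting after " + maxResponses + " responses");
    }
}
```

Each Hop in the returned list can then be searched for whatever piece of data
that step contributes before the final request is built.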
Good Luck!
I have seen the same thing. There is an interface in Yahoo Small
(specifically, adding a new e-mail alias).