[personal profile] kerravonsen posting in [community profile] perl
For those of you who were interested in my fanfic-fetching Perl script, I've just released version 0.16 of WWW-FetchStory (http://search.cpan.org/~rubykat/WWW-FetchStory-0.16/). Well, that will be the URL once CPAN finishes processing it.

The big news: it no longer depends on wget! It uses the LWP Perl module instead. This means that MS-Windows users should be able to use the script (fingers crossed).
I have retained the option to use wget, because some sites work with wget but not with LWP. (*)

There are a bunch of other improvements, and another new fetcher (Project Gutenberg), but the LWP stuff is the important bit.

(*) I have spent HOURS trying to get LWP + Cookies to work with LiveJournal, but no joy, and I have given up. LWP and Cookies work with other sites (I tried it on Ashwinder) but not with LJ. (throws hands in air) If anyone can figure out why the cookies sometimes work and sometimes don't, that would be great. I have pored over debugging output, I have made observations with Wireshark... The only difference seems to be that wget sends all the right cookies and LWP sends only some of them.

Date: 2011-09-10 01:27 am (UTC)
From: [personal profile] foxfirefey
Hmm. Does it work with DW but not LJ? I have a guess but I'm not sure, and if it's true it'd probably be the same for both platforms.

Date: 2011-09-10 02:27 am (UTC)
From: [personal profile] afuna
Hmm, could it have something to do with adult content?

OH! Also, have you viewed the LJ journal with the browser you exported the cookies from? There's a per-journal cookie that needs to be set (unless that's what you meant by per-session cookie).

Date: 2011-09-10 07:10 pm (UTC)
From: [personal profile] afuna
Yes, I have, and yes it is.

Ugh, yeah I'm stumped :-(

Date: 2011-09-13 05:06 am (UTC)
From: [personal profile] pne
I don't know why wget splits the URL into "Host" and the rest. I don't know if that is significant or not.

That makes me wonder whether LWP is connecting through a proxy: the only case I've seen where GET uses a full URL is when you're telling a proxy, rather than the final server, which URL to get. The direct case uses only the partial URL on the GET line and puts the hostname in a separate Host: header (at least with HTTP 1.1).

...wait, LWP is using HTTP 0.9? That seems wrong, too. I would have expected "HTTP/1.0" or (preferably) "HTTP/1.1" at the end of the "GET" line.

Is this debugging output from LWP or is it the "real thing" (captured from the network somehow, for example)? Perhaps it's just LWP trying to be helpful in its debugging output and not telling you the exact string it's sending?
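
To make the two request forms described above concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the Perl that WWW-FetchStory or LWP uses). The function name `request_head` and the path `/12345.html` are made up for the example. It only builds the text a client would typically send and does not touch the network. For comparison, an HTTP/0.9 request line would carry no version token at all.

```python
from urllib.parse import urlsplit


def request_head(url, via_proxy=False, version="HTTP/1.1"):
    """Return the request line and Host header a client would typically send.

    Hypothetical helper for illustration only; it sends nothing.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # A proxy needs the absolute URL to know which server to contact;
    # a direct connection sends only the path and names the server in Host.
    target = url if via_proxy else path
    return [f"GET {target} {version}", f"Host: {parts.netloc}"]


# The journal host comes from the thread; the path is an invented placeholder.
url = "http://tptigger.livejournal.com/12345.html"
for line in request_head(url):
    print(line)
for line in request_head(url, via_proxy=True):
    print(line)
```

If a capture from the wire shows the absolute-URL form, that would point towards a proxy being configured somewhere; if it only appears in debugging output, it may just be how the output is formatted.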

Date: 2011-09-11 02:47 pm (UTC)
From: [personal profile] afuna
Hmm, maybe -- it might want, say, "www.tptigger.livejournal.com" in order to match (note that that URL probably won't work).

Any way to test?

Date: 2011-09-12 09:32 am (UTC)
From: [personal profile] afuna
Sad, but I'm glad it worked out for you.

Date: 2011-09-13 05:07 am (UTC)
From: [personal profile] pne
Note that livejournal domain is stored as ".tptigger.livejournal.com"

That does seem off to me - as if the dot before "tptigger" is required in a URL for the cookie to match, and there is no dot before it in http://tptigger.livejournal.com .
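
A small sketch of why the leading dot could matter, again in Python purely for illustration. Both function names are invented here, and nothing in this thread establishes that LWP's cookie handling behaves like either of them. The first treats the stored domain literally; the second follows the RFC 6265 approach of dropping a leading dot before comparing (simplified: it ignores the special case for IP addresses).

```python
def strict_suffix_match(host, cookie_domain):
    """Literal reading: the host must end with the stored domain, dot included."""
    return host.lower().endswith(cookie_domain.lower())


def rfc6265_style_match(host, cookie_domain):
    """RFC 6265-style reading: drop one leading dot, then compare.

    Simplified sketch; the real rule also excludes IP address hosts.
    """
    host = host.lower()
    domain = cookie_domain.lower()
    if domain.startswith("."):
        domain = domain[1:]
    return host == domain or host.endswith("." + domain)


host = "tptigger.livejournal.com"
for stored in (".tptigger.livejournal.com", ".livejournal.com"):
    print(stored, strict_suffix_match(host, stored), rfc6265_style_match(host, stored))
```

Under the literal reading, the host fails to match its own dot-prefixed domain because the host string is one character shorter, while a cookie stored for ".livejournal.com" still matches. That would be consistent with only some cookies being sent, but it is only a guess about the cause.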

Date: 2011-09-10 03:29 pm (UTC)
From: [personal profile] kareila
The one time I scripted something similar to this, I prompted interactively for the LJ password and had the script do its own login. The resulting cookie worked with LWP for my own purposes (exporting locked posts from my own journal). However, that was around 2004 or so and LJ has probably changed quite a bit since then.

Perhaps they have done something that specifically blocks the LWP user agent. If that's the case, you could see if changing the user agent string helps. Good luck!

Date: 2012-03-20 04:45 pm (UTC)
From: [personal profile] brownbetty
I just spent 38 hours trying to figure out how to make perl give me a nice epub from lj, only to discover you have already done so.

Which is super, as my fluency with perl is about on the level "Oh shit, hashes contain *references* to arrays? You can do that?"
