[personal profile] kerravonsen posting in [community profile] perl
Those of you here who are fannish as well as geeky might be interested in this.

I have written my own fan-fiction downloader in Perl, which can be installed from CPAN as "WWW::FetchStory". There are probably Linux-isms in the code. (frown) For example, it uses the "wget" program to do the actual downloading.

But I would love other people to use the script! It has plugins (which I am calling "fetchers") for various fiction sites. Each fetcher knows how to download multi-chapter fics from its site, so you only have to give the table-of-contents URL for the fic and it will figure out the rest. Depending on the particular fetcher, it will get not just the title and author, but also the summary, the categories and the characters.
It also has an option to create an EPUB file rather than HTML files.
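
To make the fetcher idea concrete, here is a minimal sketch of how a plugin could claim a URL and pull chapter links out of a table-of-contents page. It is written in Python rather than Perl, and it is not the actual WWW::FetchStory code: the class names, the `allow` and `chapter_urls` methods, the site `archive.example.org` and the "/chapter/" link pattern are all assumptions made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


class ExampleArchiveFetcher:
    """Hypothetical fetcher for a made-up site, archive.example.org."""

    host = "archive.example.org"

    def allow(self, url):
        # A fetcher only claims URLs on its own site.
        return urlparse(url).hostname == self.host

    def chapter_urls(self, toc_url, toc_html):
        # Assumed site layout: chapter links contain "/chapter/".
        parser = LinkCollector()
        parser.feed(toc_html)
        urls = []
        for href in parser.links:
            full = urljoin(toc_url, href)  # resolve relative links
            if "/chapter/" in full and full not in urls:
                urls.append(full)
        return urls


# Hypothetical registry; a real plugin system would discover these.
FETCHERS = [ExampleArchiveFetcher()]


def pick_fetcher(url):
    """Return the first registered fetcher that accepts the URL, or None."""
    for fetcher in FETCHERS:
        if fetcher.allow(url):
            return fetcher
    return None
```

Because each site's page layout lives in its own class, a site redesign should only break (and only require fixing) that one fetcher.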

Currently, I have written fetchers for:

AO3: General fanfic archive.
Ashwinder: A Severus Snape/Hermione Granger HP fiction archive.
DigitalQuill: A Harry Potter fiction archive.
DracoAndGinny: A Draco Malfoy/Ginny Weasley HP fiction archive.
DreamWidth: Journalling site where some people post their fiction.
FanfictionNet: Huge fan fiction archive.
FictionAlley: A Harry Potter fiction archive.
HPAdultFanfiction: An adult Harry Potter fiction archive.
LiveJournal: Journalling site where some people post their fiction.
Owl: A Harry Potter fiction archive.
PetulantPoetess: A Harry Potter fiction archive.
PotionsAndSnitches: A Severus Snape + Harry Potter gen fiction archive.
PotterPlace: A Harry Potter fiction archive.
SSHGExchange: Severus Snape/Hermione Granger fiction exchange comm.
TardisBigBang3: Round 3 of the TARDIS BigBang challenge.
Teaspoon: A Teaspoon And An Open Mind; a Doctor Who fiction archive.
TwistingHellmouth: Twisting The Hellmouth; Buffy The Vampire Slayer crossovers.

But every now and then, those sites change their code and the fetcher for that site breaks. (frown)

Also, for a number of those archives, you must be logged in if you want to download "adult" rated fic. The solution I devised for that is rather clumsy (and Linux-centric): it looks for a "cookies.txt" file in your home directory, which you need to export from your browser after logging in to the site.
If someone has a better solution, I would love to hear from you.
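
For anyone curious what the cookies.txt approach amounts to, here is a minimal sketch in Python (not the module's Perl, and not its actual code). It assumes the exported file is in the Netscape cookies.txt format that browser export tools typically write; the function names are made up for illustration.

```python
import http.cookiejar
import urllib.request
from pathlib import Path


def load_browser_cookies(path=None):
    """Load a Netscape-format cookies.txt, defaulting to ~/cookies.txt."""
    cookie_file = Path(path) if path else Path.home() / "cookies.txt"
    jar = http.cookiejar.MozillaCookieJar(str(cookie_file))
    # Keep session cookies and expired entries too, so the login
    # cookie is not silently dropped while loading.
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def opener_with_cookies(jar):
    """Build a urllib opener that sends the loaded cookies with each request."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

One small point on the Linux-centric worry: in this sketch `Path.home()` also resolves the home directory on Windows, so the lookup itself need not be tied to Linux, although the manual export step remains.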

For the more geeky among you, the source is in my git repository at
I would LOVE people to contribute to it, whether that be fixing bugs, fixing documentation, improving fetchers, or writing new fetchers.