Kathryn A. (kerravonsen) wrote in perl, 2011-08-02 10:27 am

Fan Fiction Fetcher!

Those of you here who are fannish as well as geeky might be interested in this.

I have written my own fan-fiction downloader in Perl, which can be installed from CPAN as "WWW::FetchStory". There are probably Linux-isms in the code. (frown) For example, it uses the "wget" program to do the actual downloading.

But I would love other people to use the script! It has plugins (which I am calling "fetchers") for various different fiction sites, which know how to download multi-chapter fics from those sites, so you only have to give the table-of-contents URL for the fic and it will figure out the rest. Depending on the particular fetcher, it will get not just the title and author, but the summary, the categories and the characters.
It also has an option to create an EPUB file rather than HTML files.
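
To illustrate the plugin idea, here is a minimal sketch of how a set of "fetchers" could be matched against a story URL. It is written in Python for illustration only and is not the module's Perl code; the `Fetcher` class, the `choose_fetcher` function, and the URL patterns are all hypothetical names made up for this example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fetcher:
    """Hypothetical stand-in for one site-specific plugin."""
    name: str
    url_pattern: str  # regular expression searched for in the story URL

    def allows(self, url: str) -> bool:
        """Return True if this fetcher recognises the URL."""
        return re.search(self.url_pattern, url) is not None


# Illustrative registry only; the patterns are assumptions based on
# a few of the sites named in this post.
FETCHERS = [
    Fetcher("AO3", r"archiveofourown\.org"),
    Fetcher("Teaspoon", r"whofic\.com"),
    Fetcher("TwistingHellmouth", r"tthfanfic\.org"),
]


def choose_fetcher(url: str) -> Optional[Fetcher]:
    """Return the first fetcher whose pattern matches the URL, or None."""
    for fetcher in FETCHERS:
        if fetcher.allows(url):
            return fetcher
    return None
```

With a registry like this, the caller only supplies the table-of-contents URL, and the matching fetcher carries the site-specific knowledge of how to find the chapters and the metadata.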

Currently, I have written fetchers for:

AO3: (http://www.archiveofourown.org) General fanfic archive.
Ashwinder: (http://ashwinder.sycophanthex.com/) A Severus Snape/Hermione Granger HP fiction archive.
DigitalQuill: (http://www.digital-quill.org/) A Harry Potter fiction archive.
DracoAndGinny: (http://www.dracoandginny.com) A Draco Malfoy/Ginny Weasley HP fiction archive.
DreamWidth: (http://www.dreamwidth.org) Journalling site where some people post their fiction.
FanfictionNet: (http://www.fanfiction.net/) Huge fan fiction archive.
FictionAlley: (http://www.fictionalley.org/) A Harry Potter fiction archive.
HPAdultFanfiction: (http://hp.adultfanfiction.net) An adult Harry Potter fiction archive.
LiveJournal: (http://www.livejournal.com/) Journalling site where some people post their fiction.
Owl: (http://owl.tauri.org/) A Harry Potter fiction archive.
PetulantPoetess: (http://www.thepetulantpoetess.com/) A Harry Potter fiction archive.
PotionsAndSnitches: (http://www.potionsandsnitches.net) A Severus Snape + Harry Potter gen fiction archive.
PotterPlace: (http://www.potterplacearchives.com) A Harry Potter fiction archive.
SSHGExchange: (http://community.livejournal.com/sshg_exchange/) Severus Snape/Hermione Granger fiction exchange comm.
TardisBigBang3: (http://www.tardisbigbang.com/Round3/) Round 3 of the TARDIS BigBang challenge.
Teaspoon: (http://www.whofic.com) A Teaspoon And An Open Mind; a Doctor Who fiction archive.
TwistingHellmouth: (http://www.tthfanfic.org) Twisting The Hellmouth; Buffy The Vampire Slayer crossovers.

But every now and then, those sites change their code and the fetcher for that site breaks. (frown)

Also, for a number of those archives, you must be logged in if you want to download "adult" rated fic. The solution I devised for that is rather clumsy (and Linux-centric); it looks for a "cookies.txt" file in your home directory, which you need to have exported from your browser after you logged in to the site.
If someone has a better solution, I would love to hear from you.
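
For anyone curious what reading such a file involves, here is a minimal sketch, in Python rather than Perl and not taken from the module. It assumes the exported file is in the Netscape "cookies.txt" format and sits at `~/cookies.txt`; the function names `load_browser_cookies` and `build_opener_with_cookies` are hypothetical.

```python
import urllib.request
from http.cookiejar import MozillaCookieJar
from pathlib import Path
from typing import Optional


def load_browser_cookies(cookie_file: Optional[Path] = None) -> MozillaCookieJar:
    """Load a Netscape-format cookies.txt, defaulting to ~/cookies.txt."""
    if cookie_file is None:
        # Assumed location, following the description in the post.
        cookie_file = Path.home() / "cookies.txt"
    jar = MozillaCookieJar(str(cookie_file))
    # Keep session-only and expired entries as well, since login
    # cookies exported from a browser are often session cookies.
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def build_opener_with_cookies(jar: MozillaCookieJar) -> urllib.request.OpenerDirector:
    """Return a URL opener that sends the loaded cookies with each request."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

wget reads the same file format through its `--load-cookies` option, so one exported file can serve both approaches.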

For the more geeky among you, the source is in my git repository at https://github.com/rubykat/WWW-FetchStory
I would LOVE people to contribute to it, whether that be fixing bugs, fixing documentation, improving fetchers, or writing new fetchers.
