What's the best way to download all images / webms / mp4s from a Tumblr blog?
I'm looking to download all the posts / images / videos from some Tumblr blogs. The posts hyperlink gfycat / webm versions in the body, which Tumblripper / BulkImageDownloader / other Tumblr image downloaders don't catch. I think the problem is that the files are hyperlinked in the post body rather than actually hosted "on" Tumblr.
Anyone know of a good solution to download everything from a Tumblr blog? I've also tried wget and httrack but they don't seem to work.
I would prefer a program with a GUI over a command-line-based one, since I barely know how to work those. It took me too long to figure out wget, and I don't have time to learn another tool just to download Tumblr blogs.
Answer 0: (score: 0)
I understand that you are averse to command-line tools; however, I would personally use curl to write the page source to a file:
curl www.tumblr.com/something > outfile.html
Then you can parse the file in whatever language you are comfortable with. This answer has some excellent suggestions on how to do that with grep: https://unix.stackexchange.com/questions/181254/how-to-use-grep-and-cut-in-script-to-obtain-website-urls-from-an-html-file
such as this one:
$ curl -sL https://www.google.com | grep -Po '(?<=href=")[^"]*(?=")'
Which gives you:
/search?
https://www.google.co.in/imghp?hl=en&tab=wi
https://maps.google.co.in/maps?hl=en&tab=wl
https://play.google.com/?hl=en&tab=w8
https://www.youtube.com/?gl=IN&tab=w1
https://news.google.co.in/nwshp?hl=en&tab=wn
...
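Adapting that to your case, a minimal sketch might first save a page of the blog, then extract only the hyperlinked media files (including the off-site webm/mp4 links the rippers miss) into a list you can hand to wget. The blog address here is a placeholder, and the extension list is an assumption about which media types you want:

```shell
# Save the page source first (replace the placeholder URL with the blog):
#   curl -sL https://example.tumblr.com > page.html
# Pull every hyperlinked media file out of the saved page into a list.
# The extension list (jpg/png/gif/webm/mp4) is an assumption; extend it as needed.
grep -Po '(?<=href=")[^"]*\.(jpe?g|png|gif|webm|mp4)(?=")' page.html | sort -u > media-urls.txt
# Then download everything in the list, e.g.:
#   wget -i media-urls.txt
```

Note this only covers links on the pages you save; for a whole blog you would repeat the curl step for each archive page.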