4chan, and /b/ especially, is a great place on the web to just sit back and relax after a hard day’s work by browsing through an endless sea of brain-degrading material. It’s the perfect way to spend your free time (the time when mom isn’t making you do the dishes), am I right? Trouble is, 4chan prunes its threads very quickly, so if you miss your favorite thread, it’s gone for good (unless you create it again).
4chan downloader for threads & images
By combining my superior technical skills with the PHP Simple HTML DOM Parser, I was able to come up with a pretty neat PHP script that goes to 4chan, searches for the threads you specify, and downloads each thread as a whole, along with all of its posts and images, right to your PHP 5+ server.
All you need to do is specify which keywords to look for in the settings.php file and create a cron job that launches pull.php to visit 4chan every x minutes. So, for example, if you wanted the 4chan image downloader to grab You Laugh You Lose threads, you’d simply set the $wordToFind variable to “ylyl”. It couldn’t be simpler!
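To make the keyword idea concrete, here is a minimal sketch of that kind of matching, written in Python rather than the script's own PHP. The function name `matches_keyword` and the sample thread subjects are made up for illustration; the real script reads `$wordToFind` from settings.php and may well match threads differently.

```python
def matches_keyword(thread_text, word_to_find):
    """Return True if the keyword appears anywhere in the text, ignoring case."""
    return word_to_find.lower() in thread_text.lower()


# Hypothetical thread subjects, standing in for whatever the crawler scrapes.
threads = [
    "YLYL thread, dubs edition",
    "Rate my battlestation",
    "ylyl: you know the rules",
]

word_to_find = "ylyl"  # plays the role of $wordToFind in settings.php

# Keep only the threads whose text contains the keyword.
wanted = [t for t in threads if matches_keyword(t, word_to_find)]
print(wanted)
```

Lower-casing both sides is a deliberate choice here, so that “ylyl” also catches “YLYL”. Whether the actual script is case-insensitive is an assumption.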
I have made the script open source and available to everyone on GitHub, so if you understand code, feel free to make something cool based on my 4chan thread crawler script. There are a couple of things on my to-do list, and I have written more about them on the GitHub page.
You Laugh You Lose 4chan Archive
The You Laugh You Lose threads on 4chan are what inspired me to create the 4chan You Laugh You Lose archive, and the crawler has worked out quite nicely, pulling all the YLYL threads it can find directly to my server.
Whether it actually catches a thread is a bit of a gamble, because my host is an ass and only allows a cron job to run once an hour at most. Of course I could visit pull.php manually when I see a YLYL thread, but that kind of defeats the purpose of my script, as the point was to help me get a dose of YLYL when I have been away.
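To put a rough number on that gamble: if the crawler runs once every `interval_min` minutes and a thread stays on the board for `lifetime_min` minutes, then a thread that starts at a random moment gets caught with probability of about min(1, lifetime / interval). The Python sketch below works this out. The thread lifetimes are invented for illustration, not measured from 4chan, and `catch_probability` is a hypothetical helper, not part of the script.

```python
def catch_probability(lifetime_min, interval_min):
    """Chance that at least one crawl lands while the thread is still alive,
    assuming the thread starts at a random moment relative to the cron schedule."""
    return min(1.0, lifetime_min / interval_min)


# Hourly cron, as described above; the thread lifetimes are hypothetical.
for lifetime in (15, 30, 90):
    print(lifetime, catch_probability(lifetime, 60))
```

Under these assumptions, a thread that lives for 15 minutes is caught by an hourly crawl only about one time in four, while a crawl every 5 minutes would catch it every time.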
We all know how 4chan can be from time to time, and I haven’t figured out a way to filter out the “certain” images that people enjoy posting and getting banned for. That means my script could (in theory) get you in trouble for saving those pictures to your server.
That being said, I hear that Google has an algorithm for that in Gmail, so it’s certainly possible to come up with a filter of some sort. It needs more research on my part, as I’m not all that familiar with image recognition software just yet.