A metadata scraper for the https://myrunningman.com/ website.
myrunningmancom-scraper
Usage
- `./download-pages.sh` to download the HTML files (only needs to be run once)
- `./download-thumbs.sh` to collect thumbnails (optional)
- `./running-parser.php` to parse the HTML into the final `output.json` data file
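
The three steps above run in a fixed order, so they can be chained in a small wrapper. The following is only a sketch, not part of the repository: the function names `run_step` and `run_all` are hypothetical, and it assumes the scripts sit in the current directory under the names given above and are executable. Whether the download step really fetches pages only once is left to that script.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the repository): run the scraper
# steps in order and stop at the first step that fails.

# run_step SCRIPT REQUIRED
# Runs ./SCRIPT if it exists and is executable. A missing step is an
# error when REQUIRED is "yes" and is skipped otherwise.
run_step() {
    script=$1
    required=$2
    if [ -x "./$script" ]; then
        echo "running $script"
        "./$script"
    elif [ "$required" = "yes" ]; then
        echo "missing required step: $script" >&2
        return 1
    else
        echo "skipping optional step: $script"
    fi
}

# run_all: download pages, optionally collect thumbnails, then parse.
# The explicit "|| return 1" keeps the chain fail-fast even when the
# function is called inside an "if" condition, where "set -e" is ignored.
run_all() {
    run_step download-pages.sh yes || return 1
    run_step download-thumbs.sh no || return 1
    run_step running-parser.php yes || return 1
}
```

Calling `run_all` from the repository root would then run the whole pipeline. The thumbnail step is marked optional here to mirror the list above.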
Example output
{
  "1": {
    "title": "Times Square",
    "broadcast_date": "2010-07-11",
    "filming_date": "2010-06-21",
    "location": "Times Square (Yeongdeungpo-gu, Seoul)",
    "description": "A never-before-seen action variety show with an amazing cast. To start off the first episode[...]