code.ivysaur.me/myrunningmancom-scraper

mappu04 edf934c6a6 combine download-pages and download-thumbs; stash max_episode file

2023-06-15 19:49:04 +12:00

.gitignore

gitignore, update tagline

2022-08-08 19:44:03 +12:00

download-pages.sh

combine download-pages and download-thumbs; stash max_episode file

2023-06-15 19:49:04 +12:00

README.md

doc/readme: truncate sample

2022-08-08 19:44:48 +12:00

running-parser.php

initial commit

2022-08-08 19:36:19 +12:00

myrunningmancom-scraper

A metadata scraper for the https://myrunningman.com/ website.

Usage

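Run ./download-pages.sh to download the HTML files (only once).
Optionally, run ./download-thumbs.sh to collect thumbnails. Note that the latest commit message says download-thumbs was combined into download-pages, and no download-thumbs.sh appears in the file listing.
Run ./running-parser.php to parse the HTML into the final output.json data file.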
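
The "download only once" behaviour of the first step can be sketched as a loop that skips any page already saved on disk. This is a minimal illustration, not the contents of download-pages.sh: the URL pattern, the `pages` directory, and the helper names `fetch_once` and `download_range` are all assumptions made up for the example.

```shell
#!/bin/sh
# Sketch only: the URL pattern, directory name, and helper names below are
# assumptions for illustration, not taken from download-pages.sh.

BASE_URL="${BASE_URL:-https://myrunningman.com/ep}"   # assumed URL pattern
PAGES_DIR="${PAGES_DIR:-pages}"                       # assumed cache directory

# fetch_once URL DEST: download URL to DEST unless DEST already exists and is
# non-empty. The download goes to a temporary file first, so an interrupted
# transfer never leaves a partial file that looks finished.
fetch_once() {
    url=$1
    dest=$2
    if [ -s "$dest" ]; then
        return 0
    fi
    if curl -fsSL -o "$dest.tmp" "$url"; then
        mv "$dest.tmp" "$dest"
    else
        rm -f "$dest.tmp"
        return 1
    fi
}

# download_range FIRST LAST: fetch each episode page in the range at most once.
download_range() {
    first=$1
    last=$2
    mkdir -p "$PAGES_DIR"
    ep=$first
    while [ "$ep" -le "$last" ]; do
        fetch_once "$BASE_URL/$ep" "$PAGES_DIR/$ep.html" \
            || echo "failed: episode $ep" >&2
        ep=$((ep + 1))
    done
}
```

With these definitions loaded, calling `download_range 1 10` twice would only hit the network on the first call for pages that downloaded successfully. Re-running after a failure retries just the missing pages.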

Example output

{
    "1": {
        "title": "Times Square",
        "broadcast_date": "2010-07-11",
        "filming_date": "2010-06-21",
        "location": "Times Square (Yeongdeungpo-gu, Seoul)",
        "description": "A never-before-seen action variety show with an amazing cast. To start off the first episode[...]