code.ivysaur.me/myrunningmancom-scraper


mappu04 ae7319292d makefile: initial commit

2023-06-15 19:49:30 +12:00

.gitignore

gitignore, update tagline

2022-08-08 19:44:03 +12:00

download-pages.sh

combine download-pages and download-thumbs; stash max_episode file

2023-06-15 19:49:04 +12:00

Makefile

makefile: initial commit

2023-06-15 19:49:30 +12:00

pages-to-json.php

json: rename file, always work up to max_episode

2023-06-15 19:49:23 +12:00

README.md

doc/readme: truncate sample

2022-08-08 19:44:48 +12:00

README.md

myrunningmancom-scraper

A metadata scraper for the https://myrunningman.com/ website.

Usage

1. Run ./download-pages.sh to download the HTML pages (this only needs to be done once)
2. Run ./download-thumbs.sh to collect thumbnails (optional)
3. Run ./running-parser.php to parse the downloaded HTML into the final output.json data file
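
The steps above can be chained in a small wrapper. This is a minimal sketch, not part of the repository: `run_step` and `run_pipeline` are hypothetical helper names, and it assumes the scripts named above are executable and sit in the current directory. The thumbnail step is treated as optional, so it is skipped when that script is absent.

```shell
#!/bin/sh
# Sketch only: run the usage steps in order from the repository root.
# run_step and run_pipeline are hypothetical helpers, not repository files.

run_step() {
    # $1 = script path, $2 = "required" or "optional"
    if [ -x "$1" ]; then
        echo "running $1"
        "$1"                      # the step's exit status becomes ours
    elif [ "$2" = "optional" ]; then
        echo "skipping $1 (not present)"
    else
        echo "missing required script: $1" >&2
        return 1
    fi
}

run_pipeline() {
    # Stop at the first failing step so the parser never runs on a partial download.
    run_step ./download-pages.sh required || return 1    # fetch the HTML pages
    run_step ./download-thumbs.sh optional || return 1   # collect thumbnails
    run_step ./running-parser.php required || return 1   # parse HTML into output.json
}

# Invoke from the repository root:
# run_pipeline
```

Each step is checked explicitly with `|| return 1` rather than relying on `set -e`, because `set -e` is ignored inside functions that are called from an `if` condition.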

Example output

{
    "1": {
        "title": "Times Square",
        "broadcast_date": "2010-07-11",
        "filming_date": "2010-06-21",
        "location": "Times Square (Yeongdeungpo-gu, Seoul)",
        "description": "A never-before-seen action variety show with an amazing cast. To start off the first episode[...]