A metadata scraper for the https://myrunningman.com/ website.
mappu af0a0e5aba doc/readme: truncate sample 2022-08-08 19:44:48 +12:00
.gitignore gitignore, update tagline 2022-08-08 19:44:03 +12:00
README.md doc/readme: truncate sample 2022-08-08 19:44:48 +12:00
download-pages.sh initial commit 2022-08-08 19:36:19 +12:00
download-thumbs.sh initial commit 2022-08-08 19:36:19 +12:00
running-parser.php initial commit 2022-08-08 19:36:19 +12:00

README.md

myrunningmancom-scraper

A metadata scraper for the https://myrunningman.com/ website.

Usage

  1. Run ./download-pages.sh to download the HTML pages (this only needs to be done once)
  2. Run ./download-thumbs.sh to collect thumbnails (optional)
  3. Run ./running-parser.php to parse the HTML into the final output.json data file
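
Once step 3 has produced output.json, the file can be read by any JSON-aware tool. The following is a minimal sketch in Python, not part of this repository. It assumes output.json is a single JSON object keyed by episode number (as a string) and that each record carries the fields shown under Example output, with dates in YYYY-MM-DD form. The helper names load_episodes and filming_lead_days are hypothetical.

```python
import json
from datetime import date
from pathlib import Path


def load_episodes(path="output.json"):
    """Load the parser output, assumed to be a JSON object keyed by episode number."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def filming_lead_days(episode):
    """Return the number of days between filming and broadcast.

    Assumes both dates are present and formatted as YYYY-MM-DD.
    """
    filmed = date.fromisoformat(episode["filming_date"])
    aired = date.fromisoformat(episode["broadcast_date"])
    return (aired - filmed).days


# Inline sample copied from the README's example output
# (description omitted because it is truncated there).
sample = {
    "1": {
        "title": "Times Square",
        "broadcast_date": "2010-07-11",
        "filming_date": "2010-06-21",
        "location": "Times Square (Yeongdeungpo-gu, Seoul)",
    }
}

# Keys are strings, so sort numerically rather than lexically.
for number, episode in sorted(sample.items(), key=lambda kv: int(kv[0])):
    print(number, episode["title"], filming_lead_days(episode))
```

To run this against real data, replace sample with load_episodes() after the parser has written output.json to the current directory. Records with missing or differently formatted dates would need extra handling that this sketch leaves out.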

Example output

{
    "1": {
        "title": "Times Square",
        "broadcast_date": "2010-07-11",
        "filming_date": "2010-06-21",
        "location": "Times Square (Yeongdeungpo-gu, Seoul)",
        "description": "A never-before-seen action variety show with an amazing cast. To start off the first episode[...]