A metadata scraper for the https://myrunningman.com/ website.
mappu e641690eff gitignore, update tagline 2022-08-08 19:44:03 +12:00
.gitignore gitignore, update tagline 2022-08-08 19:44:03 +12:00
README.md gitignore, update tagline 2022-08-08 19:44:03 +12:00
download-pages.sh initial commit 2022-08-08 19:36:19 +12:00
download-thumbs.sh initial commit 2022-08-08 19:36:19 +12:00
running-parser.php initial commit 2022-08-08 19:36:19 +12:00

myrunningmancom-scraper

A metadata scraper for the https://myrunningman.com/ website.

Usage

  1. Run ./download-pages.sh to download the HTML files (this only needs to be done once)
  2. Run ./download-thumbs.sh to collect thumbnails (optional)
  3. Run ./running-parser.php to parse the HTML into the final output.json data file
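
The three steps above could be chained in a small wrapper, sketched below. This is only an illustration: the run_scraper function and the SKIP_THUMBS variable are hypothetical names that are not part of the repository, and the sketch assumes it is run from the repository root with all three scripts marked executable.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the repository): runs the documented
# steps in order and stops at the first step that fails.
# SKIP_THUMBS is an assumed variable; set it to 1 to skip the optional step.
run_scraper() {
    ./download-pages.sh || return 1          # step 1: download the HTML files
    if [ "${SKIP_THUMBS:-0}" != "1" ]; then
        ./download-thumbs.sh || return 1     # step 2 (optional): thumbnails
    fi
    ./running-parser.php || return 1         # step 3: parse HTML into output.json
}
```

Calling run_scraper performs all three steps. Since the HTML files only need to be downloaded once, later runs would normally invoke ./running-parser.php on its own.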

Example output

{
    "1": {
        "title": "Times Square",
        "broadcast_date": "2010-07-11",
        "filming_date": "2010-06-21",
        "location": "Times Square (Yeongdeungpo-gu, Seoul)",
        "description": "A never-before-seen action variety show with an amazing cast. To start off the first episode, they head over to the T shopping mall in Seoul after closing hours. They will be split into two different teams and compete against each other for the passcode that will let them escape from the mall. From running around to find clues to ripping each other's name tag, no one can predict what will happen to them. Stay tuned to see which team emerges victorious.",