tvl-depot/fun/amsterdump/scrape.el
Vincent Ambo 7b77e9986c feat(fun/amsterdump): Add distance matrix lookup for fundu results
This contains a little tool that can make requests to the Google Maps
API for distance matrix lookups from Fundu results to Schiphol Airport
and Amsterdam Centraal.

<3 edef!
2020-01-05 21:10:37 +00:00

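;; Scraping funda.nl (this file is just notes and snippets, not full code)
;;
;; The snippets below use `cl-letf' from cl-lib and the `mc/' commands
;; from the multiple-cursors package.
(require 'cl-lib)
(require 'multiple-cursors)
;;
;; Begin by copying the whole page into a buffer (from the browser's
;; inspect-element view, because encoding is difficult)
(goto-char (point-min))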
;; zap everything that isn't a relevant result
(keep-lines "data-object-url-tracking\\|img alt")
;; mark all spans, move them to the end of the buffer
(cl-letf (((symbol-function 'read-regexp)
           (lambda (&rest _) "</span>")))
  (mc/mark-all-in-region-regexp (point-min) (point-max)))
;; mark all image lines (these contain street addresses for results
;; with images), clean up and join each with the previous line
;;
;; mark all: data-image-error-fallback
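;;
;; A possible sketch of the "mark all" step above, reusing the same
;; `read-regexp' override as the span step. Assumption: the literal
;; string data-image-error-fallback is used as the regexp; the cleanup
;; and joining afterwards is still done by hand with the cursors.
(cl-letf (((symbol-function 'read-regexp)
           (lambda (&rest _) "data-image-error-fallback")))
  (mc/mark-all-in-region-regexp (point-min) (point-max)))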
;; delete all lines that contain neither a span nor an img tag
;; (there are duplicates)
(keep-lines "span class\\|img alt" (point-min) (point-max))
;; finally, do some manual cleanup of the hrefs