An environment for building a simple crawling application right away with Puppeteer, TypeScript, and a database (via TypeORM) on Docker.
- Install dependencies (the TypeScript-related packages are also required locally for coding)
$ npm i
- Write your awesome crawling code in
src/crawl.ts
If you need to store data, define entities as well (see the TypeORM documentation). The connection settings are already configured.
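As a rough illustration of what the crawling logic in src/crawl.ts might look like, here is a minimal sketch. The names `PageLike`, `CrawledPage`, and `crawlTitles` are hypothetical and not part of this repository. `PageLike` only declares the two Puppeteer `Page` methods the sketch relies on (`goto` and `title`), so the function should accept a real Puppeteer page as well as a test double. Mapping `CrawledPage` onto a TypeORM entity is left out here.

```typescript
// Hypothetical structural view of the Puppeteer Page methods used below.
interface PageLike {
  goto(url: string): Promise<unknown>;
  title(): Promise<string>;
}

// Hypothetical shape of one crawled record; adapt it to your own entity.
interface CrawledPage {
  url: string;
  title: string;
  crawledAt: Date;
}

// Visit each URL in order and collect the document title of every page.
async function crawlTitles(page: PageLike, urls: string[]): Promise<CrawledPage[]> {
  const results: CrawledPage[] = [];
  for (const url of urls) {
    await page.goto(url);
    results.push({ url, title: await page.title(), crawledAt: new Date() });
  }
  return results;
}
```

With Puppeteer, the page would typically come from `puppeteer.launch()` followed by `browser.newPage()`, with `browser.close()` called once crawling is finished.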
- Run the crawler
$ docker-compose run app
- Extract data
If necessary, extract data from the postgres container (replace "table" in the query with your own table name).
$ docker-compose exec postgres psql -P pager=off -U postgres -c "select * from table;" -A -F $'\t' | sed '$d' > result.tsv
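If the extracted result.tsv needs further processing in TypeScript, a small parser along these lines may be enough. This is a sketch under assumptions: `parseTsv` is a hypothetical helper, the first line is taken to be the column header that psql prints in unaligned mode, and values are assumed to contain no tabs or newlines, since unaligned output does not escape them.

```typescript
// Parse tab-separated text whose first line is a header row.
// Blank lines are skipped; missing trailing cells become empty strings.
function parseTsv(text: string): Record<string, string>[] {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  const headerLine = lines[0];
  if (headerLine === undefined) {
    return [];
  }
  const columns = headerLine.split("\t");
  return lines.slice(1).map((line) => {
    const cells = line.split("\t");
    const row: Record<string, string> = {};
    columns.forEach((name, index) => {
      row[name] = cells[index] ?? "";
    });
    return row;
  });
}
```

The file contents can be read with Node's `fs.readFileSync("result.tsv", "utf8")` and passed to `parseTsv`. NULL values appear as empty strings unless psql is configured otherwise.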