EvenRank — Distributed Web Crawlers at Scale

Early-career work at EvenRank Data Science (2019): distributed web crawlers for LinkedIn profiles, Google Maps listings, and email discovery, with robust anti-bot handling, proxy rotation, and a MongoDB-backed job queue.

Backend Engineer (2019)

Jan 2019 – Dec 2019

What I built

Async Python crawler framework with coroutine-based scheduling
LinkedIn scraper (Selenium CLI + API) with login and session rotation
Google Maps business-listings extractor with de-dup across locales
Mail scraper with MX-record lookup + SMTP verification pipeline
Proxy rotation and user-agent fingerprinting across 500+ concurrent workers
MongoDB-backed distributed job queue with retry and dead-letter handling
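
The coroutine-based scheduling and the retry / dead-letter behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the original EvenRank code: an in-memory asyncio.Queue stands in for the MongoDB-backed queue, and the names Job, worker, run_jobs, and max_attempts are hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Job:
    url: str
    attempts: int = 0  # number of failed attempts so far


async def worker(queue, handler, results, dead_letters, max_attempts):
    """Pull jobs forever; re-queue failures until max_attempts is reached."""
    while True:
        job = await queue.get()
        try:
            results[job.url] = await handler(job.url)
        except Exception:
            job.attempts += 1
            if job.attempts < max_attempts:
                queue.put_nowait(job)      # retry later
            else:
                dead_letters.append(job)   # give up: park for inspection
        finally:
            queue.task_done()


async def run_jobs(urls, handler, concurrency=4, max_attempts=3):
    """Run handler(url) for every URL with a fixed pool of coroutine workers."""
    queue = asyncio.Queue()
    for url in urls:
        queue.put_nowait(Job(url))

    results, dead_letters = {}, []
    workers = [
        asyncio.create_task(
            worker(queue, handler, results, dead_letters, max_attempts)
        )
        for _ in range(concurrency)
    ]
    await queue.join()                     # wait until every job is settled
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results, dead_letters
```

In this sketch a job that keeps failing is attempted max_attempts times and then lands in dead_letters instead of blocking the queue. A persistent queue would presumably store the attempt counter and the dead-letter state on the job document itself, so that they survive worker restarts.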

Hard problems

Staying ahead of LinkedIn's anti-bot evolution across multiple quarters
Keeping extractors resilient to DOM changes without a fragile XPath soup
Verifying email validity at scale without burning sender reputation
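
One common way to keep an extractor resilient to DOM changes is to try an ordered list of selectors, from the most semantic to the most generic, and record which one matched. The sketch below illustrates that idea only. It is an assumption, not the approach actually used at EvenRank. The selector list and the extract_first helper are hypothetical, and the standard-library ElementTree parser is used for brevity, so the input is assumed to be well-formed markup (real pages would need a lenient HTML parser).

```python
import xml.etree.ElementTree as ET

# Hypothetical selectors, ordered from most stable to most generic.
NAME_SELECTORS = [
    ".//*[@itemprop='name']",        # semantic attribute, survives restyling
    ".//h1[@class='profile-name']",  # layout-specific class
    ".//h1",                         # last resort: any top-level heading
]


def extract_first(root, selectors):
    """Return (text, selector) for the first selector yielding non-empty text."""
    for selector in selectors:
        node = root.find(selector)
        if node is not None:
            text = "".join(node.itertext()).strip()
            if text:
                return text, selector
    return None, None


old_layout = ET.fromstring(
    "<div><h1 class='profile-name'>Ada Lovelace</h1></div>"
)
new_layout = ET.fromstring(
    "<div><section><h1><span itemprop='name'> Ada Lovelace </span></h1>"
    "</section></div>"
)

for page in (old_layout, new_layout):
    print(extract_first(page, NAME_SELECTORS))
```

Returning the matching selector alongside the value makes it possible to log when an extractor starts falling back to its generic rules, which is usually an early sign that the page layout has changed.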

Tech stack

Python, Node.js, Selenium, Puppeteer, Scrapy, MongoDB, Redis, Express
