Learn in depth how to use two of the most popular Node.js libraries for controlling a headless browser: Puppeteer and Playwright.
A headless browser is a regular browser like the one you're using right now, but without a graphical user interface. Because nothing has to be drawn on a screen, headless browsers generally run faster and use fewer resources. For an in-depth understanding of headless browsers, check out this short article about them.
Playwright was created by many of the engineers who originally built Puppeteer, so the two libraries are very similar. That is why we have combined the Puppeteer course and the Playwright course into one super-course that shows code examples for both technologies. The small differences between the two will be highlighted in the examples.
Each lesson's activity will contain examples for both libraries, but we recommend using Playwright, as it is newer and has more features and better documentation.
When automating a headless browser, you can do far more than just make HTTP requests for static content. In fact, you can programmatically do pretty much anything a human could do with a browser, such as clicking elements, taking screenshots, and typing into text areas.
Additionally, because the browser executes the page's JavaScript, dynamic content can be rendered and interacted with, and data from that dynamic content can be scraped.
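To make this more concrete, here is a minimal sketch of what such automation could look like with Playwright's page methods. The helper name searchAndCapture, its options object, and every selector are hypothetical placeholders chosen for illustration; the sketch assumes a page that has a search input, a submit button, and a results element rendered by JavaScript.

```javascript
// Hypothetical helper: drives a page the way a person would.
// `page` is assumed to be a Playwright Page object. All selectors
// below are placeholders, not taken from any real website.
async function searchAndCapture(page, { url, query, screenshotPath }) {
    await page.goto(url);

    // Type into a text field and click a button.
    await page.fill('input[name="q"]', query);
    await page.click('button[type="submit"]');

    // Wait for content that only appears after JavaScript has run.
    await page.waitForSelector('.result');
    const firstResult = await page.textContent('.result');

    // Save a screenshot of the current viewport.
    await page.screenshot({ path: screenshotPath });

    return firstResult;
}
```

A plain HTTP request could not perform the typing, clicking, or waiting steps, since those depend on a browser running the page's scripts. How to obtain the page object is covered in the lessons on launching a browser and opening a page.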
For this course, we'll be jumping right into the features of these awesome libraries and expecting you to already have an environment set up. Here's how we set up our environment:
- Make sure you've installed Node.js
- Create a new folder called puppeteer-playwright (or whatever you want to call it)
- Run the command npm init -y within your new folder to automatically initialize the project
- Add "type": "module" to the package.json file
- Create a new file named index.js
- Install the library you're going to be using during this course:
npm install playwright
For a more in-depth guide on how to set up the basic environment we'll be using in this tutorial, check out the Computer preparation lesson in the Web scraping for beginners course.
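Once the steps above are done, a quick way to confirm that the environment works is to launch a browser and print its version. The sketch below is one possible check, not an official part of either library: checkSetup is a hypothetical helper name, and it relies only on the launch(), version(), and close() methods that both Playwright and Puppeteer provide.

```javascript
// Hypothetical sanity check: launches a browser, reads its version, and closes it.
// `launcher` is anything with a launch() method, such as Playwright's `chromium`
// or Puppeteer's default export.
async function checkSetup(launcher) {
    const browser = await launcher.launch();
    try {
        // Playwright returns a string here and Puppeteer returns a promise,
        // so `await` handles both cases.
        return await browser.version();
    } finally {
        // Always close the browser so the Node.js process can exit.
        await browser.close();
    }
}

// Possible usage in index.js (move the import to the top of the file
// and uncomment the pair that matches your library):
// import { chromium } from 'playwright';
// console.log(await checkSetup(chromium));
//
// import puppeteer from 'puppeteer';
// console.log(await checkSetup(puppeteer));
```

If the launch fails because no browser binary can be found, running npx playwright install first may be necessary, depending on the Playwright version.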
The course covers the following topics:
- Launching a browser
- Opening a page
- Executing scripts
- Reading & intercepting requests
- Using proxies
- Creating multiple browser contexts
- Common use cases
In the first lesson of this course, we'll learn how to create and use the Browser object.