ZEALS TECH BLOG

ZEALS Developers Blog

Browser Automation at ZEALS

Intro

Lately, there has been so much movement at ZEALS. As we've expanded, we've been able to meet so many new clients. This has been a tremendous boon for us as a company, and it has also presented interesting and tough problems for us on the engineering team to solve. As our number of clients increases, so does the amount of data. And as the data increases, our approach to handling it has to scale, freeing our hands to focus on bigger and better things. Until recently, people had to fill in web forms manually, which was draining our workers' time. One strategy to increase our productivity is to use a headless browser to automate the task. This was suggested to me by our technology lead, and I went ahead and did the implementation, so I'd like to give a short introduction to how I did it.

So many choices...

We needed a server that could handle multiple form-filling requests concurrently and that had a fairly simple API: you ask the server to open a headless web browser, take care of the form, and close down. In the end, I decided that Node.js would be a great candidate. Not that I hadn't heard of Python/Selenium solutions before, but there are plenty of headless browser solutions for Node.js, and I was comfortable with Node, so it just seemed like a decent choice. Within the Node ecosystem, there are choices to be made. The top choices for using JS with a headless browser are:

  1. PhantomJS
  2. CasperJS
  3. Puppeteer

However, there are small details to be considered. While all of these choices are technically controlled by JS, what you get is slightly different.

PhantomJS is "a headless WebKit scriptable with Javascript"; in other words, it's not Node.js, it's a headless browser with its own JS API, and that API is not necessarily available from a Node.js process. CasperJS is a "scripting utility written in Javascript for PhantomJS or SlimerJS", meaning it relies on one of those headless browsers.

On the other hand, we have Puppeteer, which "is a Node library which provides a high-level API to control headless Chrome over the DevTools protocol". Given that it was the best fit for what I wanted to do, has the most stars on GitHub, is provided by the Chrome team, AND has great documentation and a modern API, I had my winner. I can't stress this enough: Puppeteer really blows the competition out of the water. If you have a decent knowledge of async/await (ES2017) and are aware of which scripts are being executed in the headless browser, as opposed to Node, then this is really just a joy to use.

On top of that, the fact that I would simply be able to require the library in my Node server just...almost brought a tear to my eye. :D
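
To make the "browser vs. Node" distinction concrete, here's a minimal sketch (not code from our server). The helper name readHeading is made up for illustration; page.evaluate is Puppeteer's call for running a function inside the page.

```javascript
// `page` is assumed to be a Puppeteer Page, e.g. from:
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch();
//   const page = await browser.newPage();

// Hypothetical helper: read the text of the first element matching `selector`.
async function readHeading(page, selector) {
  // Everything out here runs in Node, where `document` does not exist.
  return page.evaluate(
    // This function is serialized and executed inside the headless browser.
    // It cannot see Node variables, so `selector` is passed in as an argument.
    (sel) => {
      const el = document.querySelector(sel);
      return el ? el.textContent.trim() : null;
    },
    selector
  );
}
```

Only serializable values cross that boundary, so the function returns a plain string (or null) rather than a DOM element.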

The Implementation

I won't try to bore you with too many details, but here's the gist of the server:

  • Express for the API
  • Puppeteer for headless browser

And here's a gist of what the server does:

  1. The server accepts POSTs with JSON sets of instructions describing where the form is, how to fill it in, and what indicates that the form has been sent
  2. The server parses the instructions from the requester and continues only if they seem to be valid
  3. A headless browser is opened with Puppeteer, and if the browser gets the requested page, the Express server sends a 200 back to the requester
  4. However, the server's not done yet. It goes through all the instructions, and whether it succeeds or not, it logs the details
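
The steps above could be sketched roughly like this. It's not our production code: the instruction format, the function names, the respond/log callbacks, the 400/502 status codes and the /fill route are all assumptions I made up for the example. The Puppeteer calls (newPage, goto, type, click, waitForSelector, close) are the real API.

```javascript
// Hypothetical instruction format:
// { url, fields: [{ selector, value }], submitSelector, doneSelector }

// Step 2: return a list of problems; an empty list means "looks valid".
function validateInstructions(body) {
  if (!body || typeof body !== 'object') return ['body must be a JSON object'];
  const errors = [];
  if (typeof body.url !== 'string' || !/^https?:\/\//.test(body.url)) {
    errors.push('url must be an http(s) URL');
  }
  if (!Array.isArray(body.fields) || body.fields.length === 0) {
    errors.push('fields must be a non-empty array');
  } else {
    body.fields.forEach((field, i) => {
      if (!field || typeof field.selector !== 'string' || typeof field.value !== 'string') {
        errors.push(`fields[${i}] needs a string selector and value`);
      }
    });
  }
  for (const key of ['submitSelector', 'doneSelector']) {
    if (typeof body[key] !== 'string') errors.push(`${key} must be a string`);
  }
  return errors;
}

// Step 4: type into each field, submit, then wait for the "done" indicator.
async function fillForm(page, instructions) {
  for (const { selector, value } of instructions.fields) {
    await page.type(selector, value);
  }
  await page.click(instructions.submitSelector);
  await page.waitForSelector(instructions.doneSelector);
}

// One request, start to finish. `launchBrowser` would be () => puppeteer.launch().
async function handleFillRequest(launchBrowser, body, respond, log) {
  const errors = validateInstructions(body);
  if (errors.length > 0) {
    respond(400, { errors });
    return;
  }
  let browser;
  let page;
  try {
    browser = await launchBrowser();
    page = await browser.newPage();
    await page.goto(body.url);
  } catch (err) {
    if (browser) await browser.close();
    respond(502, { error: err.message }); // status code is an assumption
    return;
  }
  respond(200, { status: 'accepted' }); // step 3: reply before the form is filled
  try {
    await fillForm(page, body);
    log({ url: body.url, ok: true });
  } catch (err) {
    log({ url: body.url, ok: false, error: err.message });
  } finally {
    await browser.close(); // always close down, success or not
  }
}

// Possible Express wiring (the '/fill' route name is made up):
// const express = require('express');
// const puppeteer = require('puppeteer');
// const app = express();
// app.use(express.json());
// app.post('/fill', (req, res) =>
//   handleFillRequest(() => puppeteer.launch(), req.body,
//     (status, payload) => res.status(status).json(payload), console.log));
// app.listen(3000);
```

Keeping the Puppeteer logic in plain functions that take a page or a launcher makes it easy to exercise them with a fake page object, without starting a real browser.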

And to be honest, looking back now, this was not too hard to write. However, there are some gotchas that I think people should be aware of:

Gotchas

1. Headless browsers are browsers.

In Puppeteer's case, it's Chromium. And it does everything that Chromium does. I mean EVERYTHING. It may seem obvious, but there were some weird things I had to learn. For example, I had one case where submitting a web form caused an alert box to pop up. Of course, since the browser was headless, I wasn't able to see that. I just assumed the script was hanging for some strange reason. It wasn't until I turned headless mode off (you can do that) that I realized my mistake. You can handle these sorts of events with code, but you have to handle them! My advice is to debug with headless mode turned off, and learn to use events well.
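
As a rough illustration of handling that kind of event, here's a minimal sketch using Puppeteer's 'dialog' page event. The helper name and the log callback are made up for the example, and whether you dismiss or accept the dialog depends on your form.

```javascript
// Hypothetical helper: log every alert/confirm/prompt and close it,
// so the script doesn't sit there waiting on a dialog nobody can see.
function logAndDismissDialogs(page, log = console.log) {
  page.on('dialog', async (dialog) => {
    log(`dialog (${dialog.type()}): ${dialog.message()}`);
    await dialog.dismiss();
  });
}

// While debugging, launching with headless mode off shows the real window:
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch({ headless: false });
// const page = await browser.newPage();
// logAndDismissDialogs(page);
```

Register the handler before the action that triggers the dialog; otherwise the event has nothing listening for it.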

2. Just because you have Puppeteer, doesn't mean you have Chromium.

Of course, developing with Puppeteer locally was simple: Chromium was already on my PC. When it comes time to deploy, you need to make sure whatever server/virtual machine/Docker container you're using has the proper software on it for this work. I went ahead and used Docker, and if you're interested in seeing how that works, you can check the official Puppeteer docs!

https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md#running-puppeteer-in-docker

See, I told you the docs were great.
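
For a rough idea of what changes between my PC and a container, here's a small sketch of picking launch options from the environment. The function name and the environment variable names CHROMIUM_PATH and RUNNING_IN_DOCKER are made up for the example. executablePath and args are real puppeteer.launch options, and the two sandbox flags are the ones the troubleshooting page mentions as a workaround (one it discourages unless you trust the pages you open).

```javascript
// Hypothetical helper: build puppeteer.launch() options from environment variables.
function buildLaunchOptions(env = process.env) {
  const options = { headless: true, args: [] };
  // CHROMIUM_PATH (made-up name): use a Chromium/Chrome installed in the image.
  if (env.CHROMIUM_PATH) {
    options.executablePath = env.CHROMIUM_PATH;
  }
  // RUNNING_IN_DOCKER (made-up name): disable the sandbox only where it's needed.
  if (env.RUNNING_IN_DOCKER === 'true') {
    options.args.push('--no-sandbox', '--disable-setuid-sandbox');
  }
  return options;
}

// Usage:
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch(buildLaunchOptions());
```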

Closing

Browser automation has been a blast to do, and I hope I can do a bit more in the future and it remains part of our toolset. It's a bit hard to wrap your head around, but when it all comes together it really is beautiful!

Until next time!

- Aaron