Public Learning

project-centric learning for becoming a software engineer

Week 9: A Recap

The Fast And The Curious

With the data persistence layer no longer being an issue, I finally implemented all the basic core functionality for handling GET, POST, PUT and DELETE requests. Using the wonderful Insomnia REST client, I can now see the “semi-intelligent” request handling at work. It’s always nice to have some actual visual feedback, especially after doing lots of theoretical work over the course of the previous weeks.
It was interesting to see that the actual business logic (meaning the functionality surrounding the incoming requests, their processing and sending the appropriate response) was by far the fastest part to write. It helps that there are no truly challenging concepts to it: it’s CRUD to the bone. The difficult parts are found elsewhere: in dealing with the uncertain, schema-less data that’s sent (see my database issues) and in making the application actually production-ready.
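To illustrate, a stripped-down sketch of what two of these handlers could look like - the dynamic :resource route and the way the database handle is passed around are simplifications for this post, not my exact code:

const express = require('express');
const router = express.Router();

// GET /:resource - return every stored document for that resource
router.get('/:resource', async (req, res, next) => {
  try {
    const db = req.app.locals.db; // assumes the Mongo handle was stashed on app.locals at startup
    const docs = await db.collection(req.params.resource).find({}).toArray();
    res.json(docs);
  } catch (err) {
    next(err); // hand errors to the central error-handling middleware
  }
});

// POST /:resource - store whatever JSON body was sent, schema-less
router.post('/:resource', async (req, res, next) => {
  try {
    const db = req.app.locals.db;
    const result = await db.collection(req.params.resource).insertOne(req.body);
    res.status(201).json({ id: result.insertedId });
  } catch (err) {
    next(err);
  }
});

// PUT and DELETE follow the same pattern with updateOne and deleteOne
module.exports = router;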

I took the time last week to write some proper tests. It had become a bit cumbersome to manually send requests of various types with some dummy data a few times each day just to confirm that nothing had regressed. Unfortunately, automating that wasn’t as straightforward as I would have liked. My experience with tests is limited to unit tests for small, dependency-free functions. That’s easy to do with just about every testing framework, but I quickly figured that there’s not much use in unit testing individual functions or modules: they are already pretty small and won’t change a lot. The bigger risk comes from the interplay of all the components - a single missing next() call in a new middleware might be enough to keep the application from sending the right response.
So I decided to focus mainly on end-to-end tests: if the API responds correctly to requests, I can safely assume that all its parts are working as intended. I could have used Insomnia for that as well, but I wanted something that could be run from the command line and in combination with any future build and deploy steps. It took some time to figure out how to create a proper testing environment inside my Express app, so that a test database is used (and filled with mock data) and a new Express instance gets created without conflicting with any already running processes. There is a very helpful library called Supertest that - together with the popular Jest testing framework - allows me to spin up my backend server, connect to a test database and check all of the API functionality. Having a test suite that runs with a single yarn test inspires confidence for adding any additional functionality.
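To give an idea, one of these end-to-end tests looks roughly like this - the endpoint and the ./src/app module path are simplified stand-ins for my actual project layout:

const request = require('supertest');
const app = require('../src/app');      // the Express app, exported without calling .listen()

describe('GET on an existing resource', () => {
  it('responds with the stored documents as JSON', async () => {
    const res = await request(app)
      .get('/api/notes')                 // endpoint seeded with mock data beforehand
      .expect('Content-Type', /json/)
      .expect(200);
    expect(Array.isArray(res.body)).toBe(true);
  });
});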

What is now missing is mainly the support for user-defined endpoint behavior, which will cause the most work in the not-yet-started frontend portion. In addition to that, the backend server is only missing security and compatibility features like proper CORS handling and rate limiting. I’m still not completely sure whether to rely on third-party packages for those or to write at least parts of them myself. This is a fundamental question anyway: how do I decide what to focus my coding efforts on? It’s possible to plug a ton of external middleware into Express, since there seem to be at least a dozen npm packages for every conceivable piece of functionality.
So far, I have had the rule of using third-party libraries only for parts that are too big to write myself (resulting in the two dependencies for Express and the MongoDB driver for Node.js) or too irrelevant to my actual efforts to devote time to them (so far this has been the case twice, causing my other two dependencies: dotenv for loading the configuration and morgan for logging). By the way, if you think that this sounds reasonable (it’s only four dependencies after all), my node_modules folder still holds over 450 external package dependencies - admittedly, this includes the heavy development dependencies (like Jest) as well, but it’s still a good demonstration of the atomicity of the JS package ecosystem. No judgement, just an observation.
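For reference, this is roughly how those four runtime dependencies fit together in the entry point - a sketch, where the environment variable names and the default port are assumptions about my setup:

require('dotenv').config();                  // dotenv: load configuration from .env
const express = require('express');
const morgan = require('morgan');
const { MongoClient } = require('mongodb');

const app = express();
app.use(morgan('combined'));                 // morgan: request logging
app.use(express.json());                     // parse incoming JSON bodies

async function start() {
  const client = await MongoClient.connect(process.env.MONGODB_URI);
  app.locals.db = client.db();               // make the database handle available to the routes
  app.listen(process.env.PORT || 3000);
}

start();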
I will likely end up implementing everything related to CORS myself: I want full control over the request and response header handling anyway and have already created a barebones handler for OPTIONS requests (which browsers send as a pre-flight request before non-simple cross-origin requests - for example a POST with a JSON body or a request with custom headers - to check whether the server allows them).
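Stripped of all configurability, that barebones handling amounts to something like the following middleware - the allowed origin and headers here are permissive placeholders, not the final policy:

app.use((req, res, next) => {
  res.set('Access-Control-Allow-Origin', '*');
  res.set('Access-Control-Allow-Methods', 'GET, POST, PUT, DELETE, OPTIONS');
  res.set('Access-Control-Allow-Headers', 'Content-Type');

  if (req.method === 'OPTIONS') {
    return res.sendStatus(204);              // answer the pre-flight request directly, no body needed
  }
  next();
});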
Rate limiting, on the other hand, is probably going to be handled by a third-party package with simple in-memory management. I’m not running Node in cluster mode or any other distributed fashion, so plugging in a simple in-memory rate limiter with a proven track record is a sane choice. I don’t consider rate limiting a core part of my system; it’s mainly there to prevent any DDoS-like request storm, whether through malice or mishandling (see my experience with querying the Hacker News API a few weeks ago, when I swamped them with dozens of requests per second). That is ultimately my guideline when deciding what to implement on my own and where to reach for external code: the closer something is to my core product, the more I lean towards doing it myself.
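A package like express-rate-limit, for example, keeps its counters in memory by default, and plugging it in would be roughly a one-liner (the window and limit below are made-up numbers, not a decision):

const rateLimit = require('express-rate-limit');

app.use(rateLimit({
  windowMs: 60 * 1000,                       // one-minute window
  max: 60,                                   // at most 60 requests per IP and window
}));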

You Build It, You Run It

I’m still far from having a complete product, but last week an interesting question came up: how will I actually deploy everything? I am now close to being able to let other people use the basic feature set, and I’m eager to receive feedback and bug reports from a few testers. In order to do that, I need a system to execute both a build and a deployment step - something I have never done before.
My current idea is as follows: my frontend will most likely be served by a web server that was built for doing just that (unlike Node.js/Express, which should not concern itself with delivering static assets). The frontend then uses AJAX to talk to a small internal API for configuration purposes.
Since I will rely on something like NGINX for static content anyway, I can also use it as a reverse proxy for all incoming external API requests (and maybe as an SSL termination point as well).
So my whole infrastructure will comprise four parts:
1. the Node.js/Express backend API server (handling all external requests as well as the API calls from the administrative frontend)
2. the MongoDB database providing persistence for the backend data
3. the frontend, which will most likely be written in HTML/CSS/JavaScript/React
4. the NGINX reverse proxy and static asset server

I already have a Hetzner cloud server running and will most likely deploy everything there. It’s cheap, reliable and flexible enough that I can add RAM and virtual CPUs as needed to avoid a performance bottleneck on the server side of things.

There is also a build step for at least one of the aforementioned parts (the frontend). I could still follow a manual procedure every time I want to deploy a new version of 200 OK: (re-)configure both MongoDB and NGINX, clone the repo on the server, install dependencies, build the frontend, run backend tests and then use a process manager like PM2 to start the Node application.
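For the last of those steps, PM2 could be driven by an ecosystem file - roughly something like this sketch, where the app name and script path are placeholders, not my actual layout:

// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: '200-ok-api',                    // placeholder name
      script: './src/server.js',             // placeholder entry point
      instances: 1,
      env: {
        NODE_ENV: 'production',
      },
    },
  ],
};

The backend would then be started with a single pm2 start ecosystem.config.js.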
This might be sufficient, but it is in no way how it would be done in the real world. Plus, it’s cumbersome, involving many different commands and therefore creating ample opportunity for mistakes.
So I started to research best practices and investigate how I can automate all of those steps. Naturally, Docker is an enticing option. Together with a shell script, I would be able to simply create Docker images for NGINX, my Node app and MongoDB. With Docker Compose I can automate the whole deployment step and even get the ability to test locally in an environment identical to the production one.
I’m not done with my research on all the implementation details, but I have started preparing my application to run behind the reverse proxy as well as configuring NGINX. The latter is also new to me, so I’m still fumbling around with the configuration file to forward what needs to be forwarded and serve every static file directly.
Together with having a proper test suite, I now finally feel like I’m not just doing an amateur project but building something that is starting to resemble a “real” application.

Summary

👍

Time spent this week: 46 hours