2018-12-19

Creating Reusable Code

Create reusable software is challenging, especially when that software may be reused in situations or scenarios for which it may not necessarily have been designed. We’ve all had that meeting where a boss or manager asked the question: “What you’ve designed is great, but can we also use it here?”

In the last month I’ve had this exact experience, from which I’ve learned a number of valuable lessons about crafting reusable software.

eTexts and annotations

When I first started working for eNotes, my initial task was to fix some code related to electronic texts that we displayed on our site (e.g., Shakespeare, Poe, Twain, etc.). We have a significant collection of annotations for many texts, and those annotations were displayed to users when highlights in the text were clicked. A couple of years ago we spun this technology off into a separate product, Owl Eyes, with additional teacher tools and classroom management features. Because of my experience with the existing eText and annotation code, and because I am primarily responsible for front-end JavaScript, I was tasked with building a “Kindle-like” experience in the browser for these eTexts. (This is one of the highlights of my career. The work was hard, and the edge cases were many, but it works very well across devices, and has some pretty cool features.)

Filtering, serializing, and fetching annotation data

The teacher and classroom features introduced some additional challenges that were not present when the eText content was first hosted on enotes.com. First, classrooms had to be isolated from one another, meaning that if a teacher or student left an annotation in an eText for their classroom, it would not be visible to anyone outside the classroom. Also, a teacher needed the ability to duplicate annotations across classrooms if they taught multiple courses with the same eText. Eventually we introduced paid subscriptions for premium features, which made annotation visibility rules even more complicated. All Owl Eyes Official annotations are available for free, public viewing, but certain premium educator annotations are restricted to paid subscribers. (Also, students in a classroom taught by a teacher with a paid subscription are considered subscribers, but only within that classroom’s texts!) It was complicated.

We devised a strategy whereby a chain of composable rules could be applied to any set of annotations, to filter them by our business requirements. These rules implemented a simple, identical interface, and each could be passed as an argument to another to form aggregates. The filtered annotation data was then serialized as JSON and emitted onto the page server-side. When the reader renders in the client this data is deserialized and the client-side application script takes over.

The role that a given user possesses in the system often determines if they can see additional meta-data related to annotations, or whether they can perform certain actions on those annotations. These were communicated to the front-end to enable/disable features as needed, and then enforced on the back-end should a clever user attempt to subvert the limitations of his own role. To keep the data footprint as light as possible on the page, we developed a composable serialization scheme that could be applied to any entity in our application. The generic serialization classes break down an entity’s data into a JSON structure, while more specialized serialization classes add or remove data based on a user’s role and permissions. In this way a given annotation might contain meta-data of interest to teachers, but would exclude that meta-data for students. Additional information is added if the user is an administrator, to give them better control over the data on the front-end.

The end result is that, from a user’s perspective, the annotations visible to them, and the data within those annotations, are tailor-made to the user when the eText reader is opened.

Fast-forward to the present day. I have recently been tasked with bringing our eTexts and annotations full circle, back to enotes.com. We brainstormed about the best way to make this happen, as enotes.com lacks the full eText and annotation data, as well as the rich front-end reading experience.

We decided that since the eText and annotation data was already being serialized as JSON for client-side consumption in owleyes.org, it would be trivial to make that same data available via an API. I implemented a simple controller that made use of Symfony’s authentication mechanisms for authenticating signed requests via API key pair, and returned annotation JSON data in the exact same manner that would be used for rendering that data in the eText reader. On inspection, I realized that some of the annotation data wasn’t relevant to what we wanted to display on enotes.com, so I quickly created new serialization classes that made use of existing serialization classes, but plucked unwanted data from their generated JSON structures before returning it. No changes were necessary to the annotation filtering rules, as an API user is, from the ruleset’s perspective, a “public user”, and so would see the same annotation data that users who aren’t logged in on the site would see.

Fetching this data on enotes.com was a simple matter of using PHP’s CURL classes to request data from the owleyes.org endpoint.

The user interface

The eText reader JavaScript code on owleyes.org is complex; it is composed of many different modules – view modules, state modules, utility modules, messaging modules, etc. – that interact together to form a smooth, reading experience. It is far more interactive than the pages we wanted to display on enotes.com, so I initially worried that the code would not be entirely reusable because of its complexity.

I was pleasantly wrong.

When I write software I take great pains to decouple code, favor composition over inheritance, and observe clear, strict, and course API boundaries in my modules and classes. I, as every programmer does, have a particular “style” of programming – the way I think about and model problems – which, in this case, served me very well.

I copied modules from the owleyes.org codebase into the enotes.com codebase that I knew would be necessary for the new eText pages to function. With some minor adjustments (mostly related to DOM element identifiers and classes) the code worked almost flawlessly. Where I needed to introduce new code (we’re using a popup to display annotations in enotes.com, whereas in owleyes.org we use a footer “flyout” that cycles through annotations in a carousel) the APIs in existing code were so well defined that I was able to adapt to them with few issues. Where differing page behavior was desired (e.g., the annotation popup shifts below the annotation when it gets too close to the top of the screen as the reader scrolls, and above otherwise) the decoupled utility modules that track window and page state already provided me with the events and information I needed to painlessly implement those behaviors. And because the schema of the serialized annotation data delivered over the API was identical to the JSON data embedded in the owleyes.org reader, the modules that filtered, sorted, and otherwise manipulated that data did not change at all.

Why it worked

Needless to say, this project left me very satisfied as a developer. When your code is painlessly reused in other contexts it means you’ve something right. I’ve made some observation about what made this reuse possible.

First, reusable code should model a problem, or a system, in such a way that the constituent components of that model can act together, or be used in isolation, without affecting the other parts of the model. Modules, classes, and functions are the tangible building blocks we use to express these models in software, and they should correspond with the way we think about these models in our heads. Each should be named appropriately, corresponding to some concept in the model, and the connections between them should be well understood and obvious. For example, in the eText reader, a tooltip is a highlighted portion of text that may be clicked on to display an annotation popup, which displays annotation information. The tooltip and annotation popup are components in the visual model; they are named appropriately, and the relationship between them is one-way, from tooltip to popup.

Second, a given problem may in fact be composed of multiple models that are being run at the same time. Modules that control the UI are part of the visual or display model; modules that control the access to, and filtering of, data are part of the domain model. Modules track mouse movements, or enable/disable features based on user interaction, are part of the interaction model. Within these models, objects or modules should only perform work that makes sense within the purpose of the model. Objects in the visual model should not apply business rules to data, for example. When one or more objects exhibit behaviors from multiple models, extracting and encapsulating the behavior that is not part of each object’s primary model makes that object more reusable.

Third, objects within a model should have well-defined, coarse APIs. (In the context of objects, an API is an object’s “public” methods to outside callers, or to the objects that extend it.) A coarse API is one that provides the least amount of functionality that its responsibilities require. Yes, the least. An object either stands alone, or makes use of other objects to do its work. If the methods on an object are numerous the object can likely be broken down into several smaller objects to which it will delegate and on which it will depend to do its work internally. Ask: what abstraction does this object represent, and which methods fulfill that abstraction. Likewise the parameters to an object’s methods can often be reduced by passing known state to the object’s constructor (or factory function, or whatever means are used to create the object). This chains the behavior of the object to a predetermined state – all remaining method arguments are only augmentations to this state. If the state needs to change, another object of the same type, with different state, is created and used in its stead. The API is coarse because the methods are few, and their parameters are sparse.

Fourth an object’s state should be stable at all times. Its initial state should be set, completely, through the object’s source of construction (whether by data provided via parameters, or sensible defaults, or both). Properties on objects should be considered read-only, as they represent a “window” into the object’s state. Computed properties should be calculated whenever an object’s relevant internal state changes, usually the result of a method invocation. I avoid exposing objects that can be manipulated by reference through properties; properties are always primatves that can be re-produced or re-calculated, or collections of other “data” objects that have the same characteristics (usually cloned or reduced from some other source). If an object needs to expose information from one of its internal children, I copy that information from the internal source to a primitive property on the external object itself. If the information is itself in the form of an object with multiple properties, I flatten those into individual properties on the external object. The end result is that an object’s state is always generated internally, as a consequence of method invocations, and cannot be manipulated externally, except by way of its public API (methods).

Finally, shared data should exist in “bags” – objects that jealously guard data and only deliver data by value to callers when asked. For example, on owleyes.org a given chapter in Hamlet may contain hundreds of annotations. Annotations may be crated, edited, deleted, and receive replies in client code. The annotation bag is responsible for holding the annotation data and delivering it, in read-only format, to other modules as requested so that they can render themselves (or perform computations) accordingly. When an annotation changes – when an owleyes.org PUT request is sent to the API and a successful response is received – a method on the bag is invoked to update the annotation. Because annotations are only fetched by value, it does no good for the module that initiated the update to directly manipulate the properties on its own annotation object. No other module will receive the change. Instead, the responsible module tells the bag to update the annotation by passing it the new annotation deserialized from the API response. The bag replaces the annotation in its internal collection and then raises an event to notify listening modules that the given annotation has changed. Any module interested in that annotation – or all annotations – then requests the updated data (in read-only format) and re-renders itself (or re-computes its internal state). The bag, then, is the shared resource among modules (not the data, directly) and it is the source of Truth for all data requests.

Epilogue

There is more I could say on the patterns and principles that arose during the execution of this project, but those enumerated above were of the most import and consequence while porting existing code into its new context. Reusable code is not easy to write. It is not automatic. It is the result of thought and discipline that slowly become habit as exercised.

Not all code will be reused; most won’t, in fact. But writing code with a view of extension and reuse in mind can pay off in time and effort in the long run. This is a trade-off, though. The more reusable code tends to be, the more layers of redirection it will possess, necessitating an increase in the number of modules, classes, functions, etc. that need be created. This is a trade-off that can be mitigated by keeping code as simple as possible. Code can be navigated with relative ease if one can reason about it, divining what modules (etc.) do and how they are related through inference.

While I can’t guarantee your experience will be as pleasant as mine, I do believe that if you think about and put these patterns and principles into action you will one day experience the joy of truthfully telling your manager, “oh, that will only take two weeks!” because your diligence produced well-crafted, reusable code.

nicholascloud.com

There is no such thing as strong coffee, only weak people.