4 Hours to Build a Word Editor: My Take on the 'Surgical' Tech Assessment

December 5, 2025
4 min read
By Ali Shan


Most tech interviews feel pretty standard: reverse a list, center a div, explain a protocol. You practice, you get used to the pattern, you move on.

But yesterday I got something completely different.

I opened an email with a one-line brief and a 4-hour timer. The task was to build a command line tool that could surgically edit Microsoft Word documents. That meant working directly with the underlying XML, finding elements by ID, and inserting new paragraphs in the middle of the document without breaking anything.

Not a LeetCode problem. A real-world engineering constraint. And four hours on the clock.

Understanding the Beast

A .docx file is not a text file. It is a ZIP archive: rename it to .zip and open it, and you find XML parts such as word/document.xml, media files, and relationship (.rels) files that tie the parts together. It is more of a mini file system than a single document.

The prompt said something like: "We are not looking for perfection. We are looking for someone who understands the problem deeply."
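To make the "mini file system" point concrete, here is a minimal sketch that lists the parts inside a package using only the standard library. So that it runs without a real file, it builds a tiny stand-in archive in memory; the part names mirror those in a real .docx, but the contents are placeholders, and the helper names are made up for this example.

```python
import io
import zipfile


def build_fake_docx() -> bytes:
    """Build a tiny stand-in package in memory (placeholder contents only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("_rels/.rels", "<Relationships/>")
        archive.writestr("word/document.xml", "<document/>")
        archive.writestr("word/_rels/document.xml.rels", "<Relationships/>")
    return buffer.getvalue()


def list_parts(docx_bytes: bytes) -> list[str]:
    """Return the names of all parts stored in the package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.namelist()


parts = list_parts(build_fake_docx())
for name in parts:
    print(name)
```

Pointing `zipfile.ZipFile` at a real .docx path should give a similar listing, with the main body text living in word/document.xml.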

That line shaped my whole approach. I could have tried to write my own XML parser, handle namespaces, and manage the zip structure myself, but that would have been a terrible idea with the time limit.

I needed to stand on the shoulders of existing tools so I could focus on the actual hard part.

The Build vs Buy Choice

I went with Python and python-docx. It is a solid library, but its high-level API is geared toward building a document from top to bottom. add_paragraph() appends to the end of the document, and although a paragraph offers insert_paragraph_before(), there is no matching insert-after call.

The assessment specifically needed an INSERT_AFTER command. Meaning I had to inject a paragraph right in the middle.

The library did not support that.

This was the real challenge.

Becoming an OXML Surgeon

So I built something I called oxml-surgeon.

I stopped relying on the high-level API and dropped straight into the raw XML tree through the ._element attribute, which exposes the underlying lxml element behind each python-docx object. At that point, you are basically working in DOM land. I located the parent node, figured out the index of the target element, and inserted a new node exactly where it needed to go.
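The parent, index, insert sequence can be sketched roughly like this. To keep it runnable without extra installs, this version uses the standard library's ElementTree on a hand-written WordprocessingML fragment rather than python-docx and lxml. The "p0", "p1" ID scheme and the helper names are assumptions for illustration, not the ones from the actual tool.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag: str) -> str:
    """Qualify a tag name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def make_paragraph(text: str) -> ET.Element:
    """Build a minimal <w:p><w:r><w:t>text</w:t></w:r></w:p> element."""
    p = ET.Element(w("p"))
    r = ET.SubElement(p, w("r"))
    t = ET.SubElement(r, w("t"))
    t.text = text
    return p


def paragraph_text(p: ET.Element) -> str:
    """Concatenate the text of every <w:t> inside a paragraph."""
    return "".join(t.text or "" for t in p.iter(w("t")))


def insert_after(parent: ET.Element, target: ET.Element, new: ET.Element) -> None:
    """Insert `new` immediately after `target` among the children of `parent`."""
    index = list(parent).index(target)
    parent.insert(index + 1, new)


DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>First</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Third</w:t></w:r></w:p>"
    "<w:sectPr/>"
    "</w:body></w:document>"
)

root = ET.fromstring(DOCUMENT_XML)
body = root.find(w("body"))

# Hypothetical ID scheme: paragraphs are addressed as "p0", "p1", ... by position.
ids = {f"p{i}": p for i, p in enumerate(body.findall(w("p")))}

insert_after(body, ids["p0"], make_paragraph("Second"))

texts = [paragraph_text(p) for p in body.findall(w("p"))]
print(texts)
```

The section properties element (w:sectPr) stays last in the body, which is part of "not breaking anything". With lxml elements, which is what python-docx hands back, the element's own addnext() method can do the same insertion in a single call.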

It felt like surgery. One incorrect tag and the whole document breaks.

Adding Some Soul

The code worked, but the CLI felt cold. Very robotic. The reviewers are humans and they are going to read the output, so I spent the last half hour making it feel friendlier.

I added a small color helper so successes show in green and errors in red. I added tiny quality of life things like allowing ls as an alias for map and rm for delete.
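A color helper and alias table along these lines could look like the sketch below. The function names and the `enabled` switch are assumptions for illustration; only the green/red split and the ls/rm aliases come from the description above.

```python
# Standard ANSI escape codes for terminal colors.
GREEN = "\033[32m"
RED = "\033[31m"
RESET = "\033[0m"

# Short aliases mapped to their canonical command names.
ALIASES = {"ls": "map", "rm": "delete"}


def colorize(text: str, color: str, enabled: bool = True) -> str:
    """Wrap text in an ANSI color code, or return it unchanged when disabled."""
    return f"{color}{text}{RESET}" if enabled else text


def resolve_command(name: str) -> str:
    """Translate an alias to its canonical command; unknown names pass through."""
    return ALIASES.get(name, name)


print(colorize("OK: paragraph inserted", GREEN))
print(colorize("ERROR: target not found", RED))
print(resolve_command("ls"), resolve_command("rm"), resolve_command("map"))
```

The `enabled` flag is there so color can be switched off when output is piped to a file, where raw escape codes would just be noise.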

I kept the comments honest. Not AI-perfect, just real. For example:

# TODO: Table parsing is very basic right now. I am only counting rows because parsing nested cells in OXML is way too much for a 4 hour task.

It shows the tradeoffs instead of hiding them.

The Result

By the three-and-a-half-hour mark, I had a CLI that could load a doc, map its structure to JSON, replace text while keeping bold and italic formatting, and insert new paragraphs anywhere inside the file.
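One way to replace text without losing bold or italic is to edit only the text nodes (w:t) and leave each run's properties (w:rPr) alone. The sketch below shows that idea with the standard library's ElementTree on a hand-written paragraph. It is a simplification under a stated assumption: it only handles matches that sit entirely inside one run, and the function name is made up for this example.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag: str) -> str:
    """Qualify a tag name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def replace_in_runs(paragraph: ET.Element, old: str, new: str) -> int:
    """Replace `old` with `new` inside each <w:t>, leaving run properties untouched.

    Assumption: a match never spans two runs. Returns the number of replacements.
    """
    count = 0
    for t in paragraph.iter(w("t")):
        if t.text and old in t.text:
            count += t.text.count(old)
            t.text = t.text.replace(old, new)
    return count


# A paragraph with one bold run and one italic run.
PARAGRAPH_XML = (
    f'<w:p xmlns:w="{W_NS}">'
    "<w:r><w:rPr><w:b/></w:rPr><w:t>Hello </w:t></w:r>"
    "<w:r><w:rPr><w:i/></w:rPr><w:t>world</w:t></w:r>"
    "</w:p>"
)

paragraph = ET.fromstring(PARAGRAPH_XML)
replaced = replace_in_runs(paragraph, "world", "there")

runs = paragraph.findall(w("r"))
full_text = "".join(t.text or "" for t in paragraph.iter(w("t")))
print(replaced, full_text)
```

Because only the text node changes, the second run keeps its italic marker. Matches that straddle a run boundary need extra handling that this sketch leaves out.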

I ran the validator. PASS.

This challenge reminded me why I enjoy building things. It was not about memorizing algorithms. It was about understanding the data structure in front of me, handling the time pressure, and knowing when to rely on a library and when to work directly with the raw XML.

It was not perfect. Table parsing was basic and the ID system was simple. But it solved the real problem in a focused and controlled way, and sometimes that is exactly what engineering is about.