CSVify

31-Jan-21

Our React application currently uses a home-rolled tool to export AntD tables to CSV (aptly named csvify). The method is quite simple and has met all our requirements for years. We don’t tend to do much exporting to CSV, nor are our exports particularly complex.

This changed when a client required us to handle their data differently: some of their fields hold a comma-separated list of composite elements, and those commas have to survive the export. Simple, right? As it turns out, CSV handling is pretty much a solved problem.

The question, then: do we make a small tweak to our existing method and move on, or do we replace the existing solution with a better external library?

What's Actually Happening?

  1. Our current solution for exporting AntD tables to CSV is homegrown. As such, it doesn’t have all the amenities present in an external library.
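    const csvify = val => {
      // Treat missing values as empty cells.
      if ([undefined, null].includes(val)) return '';

      // Strip every comma so the value cannot split into extra columns.
      let newVal = val.toString().replace(/,/g, '');

      // Swap double quotes for single quotes.
      newVal = newVal.replace(/"/g, "'");

      return newVal;
    };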
  2. A client required us to export some content that contained elements with commas. These values might look like “A B, C, D E”. (Don’t ask me why they want space-delimited entries; trust me, I asked).
  3. Clearly, the csvify method that worked for our other use cases won’t work now. Stripping our content of commas is the opposite of what the client wants. Even so, we can’t break any other use of this API, which calls for either a delicate touch or a lot of testing.
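To make the problem concrete, here is a small sketch that runs a copy of the csvify helper above against the example value from the list. The clientValue name is made up for illustration.

```javascript
// Copy of the csvify helper shown above.
const csvify = val => {
  if ([undefined, null].includes(val)) return '';
  let newVal = val.toString().replace(/,/g, '');
  newVal = newVal.replace(/"/g, "'");
  return newVal;
};

// Hypothetical client value: three entries, two of which contain a space.
const clientValue = 'A B, C, D E';

// The commas are gone, so the entry boundaries are lost.
console.log(csvify(clientValue)); // prints A B C D E
```

With the commas stripped, there is no way to tell where one entry ends and the next begins.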

Considerations

  1. Standard CSV (RFC 4180) wraps any field containing commas in double quotes. Our current method doesn’t follow that standard, as it strips all commas indiscriminately.
  2. CSV management is a solved, non-trivial problem. A modification to our method for this one use case can’t cover everything that might come up down the road, and there are a lot of potential issues.
  3. If we make a modification, it can’t break anything on prod. This should go without saying, but is worth reiterating because we’re looking at solutions that go beyond what is strictly “necessary” into more “nice upgrade” territory.
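The quoting rule from the first consideration can be sketched in a few lines. This is a minimal illustration of the RFC 4180 convention, not code from our application, and quoteCsvField is a hypothetical name.

```javascript
// Hypothetical helper illustrating standard CSV quoting; not part of our code base.
const quoteCsvField = val => {
  if (val === undefined || val === null) return '';
  const text = String(val);

  // Only fields containing a comma, double quote, or line break need quoting.
  if (!/[",\r\n]/.test(text)) return text;

  // Double any embedded quotes, then wrap the whole field in quotes.
  return `"${text.replace(/"/g, '""')}"`;
};

console.log(quoteCsvField('plain'));       // prints plain
console.log(quoteCsvField('A B, C, D E')); // prints "A B, C, D E" (quotes included)
console.log(quoteCsvField('say "hi"'));    // prints "say ""hi"""
```

Even this small version has to handle three special characters, which hints at why the problem is non-trivial once encodings, delimiters, and headers enter the picture.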

The Options

  1. The lightest, cheapest solution would likely be passing a flag to the csvify method that defaults to false, called something like shouldFormat. The downside is added complexity if we later want only some pieces of the content formatted (e.g. do we add shouldFormatCommas and shouldFormatQuotes?). The upside is, of course, that we avoid adding external dependencies, don’t break anything on prod, and only need to make the change in the one spot where it’s necessary. It’s also reusable for future developers who need to accomplish the exact same thing.

  2. A step up in complexity would be going further into the already-explored waters of CSV management, by expanding this csvify API to check for escape characters or quotes. If a string has double quotes, accept whatever is contained within as a string literal. If a non-numeric value contains e.g. \ or ` (you’re welcome, PowerShell users), accept the next character as its char literal value.

    This solution has tremendous downside for only marginal gains, so we quickly dismissed it. The obvious issues (communicating expectations of the API to consumers, not behaving consistently with other CSV management utils, spending a more significant amount of resources reinventing the wheel, etc.) far outweigh any benefits over the first solution.

  3. The last option considered (and our eventual choice) would be the most robust, but in my opinion it comes with some hidden risk. I have said several times that CSV is solved. I genuinely believe that; there might be a time and place to roll our own solutions, but the ease with which I can find a great npm package that does this for me, and automatically handles all my other concerns later, makes it a compelling option.

    So what’s the downside? Ultimately, it’s up to each developer to set their tolerance for external dependencies. I think it’s a consideration that doesn’t get made often enough, but every new dependency added to a project is one more point at which something can fail. What if you choose a dependency and it stops being maintained? Or you choose one and find it doesn’t quite do what you need, but another one does? Do you revisit all its uses in your code base to refactor to the bigger and better solution every time one appears?

    It also presents risk because, again, csvify already behaves as expected in production environments. There is risk in changing tools customers use regularly, no matter how small the tweak, especially when the alternative would be almost guaranteed to go unnoticed.
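For comparison, option 1 from the list above might look something like the sketch below. It is one possible reading of the shouldFormat flag, assuming that false keeps today's behavior and true switches to standard quoting; the exact semantics are an assumption, not something we shipped.

```javascript
// Sketch of option 1. `shouldFormat` is the hypothetical flag named in the list above.
const csvify = (val, shouldFormat = false) => {
  if (val === undefined || val === null) return '';
  const text = val.toString();

  if (!shouldFormat) {
    // Legacy path: unchanged behavior for every existing caller.
    return text.replace(/,/g, '').replace(/"/g, "'");
  }

  // New path: keep commas and quotes, wrapping the field in quotes when needed.
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
};

console.log(csvify('A B, C, D E'));       // prints A B C D E
console.log(csvify('A B, C, D E', true)); // prints "A B, C, D E" (quotes included)
```

Because the flag defaults to false, existing call sites keep their current output, which is the "almost guaranteed to go unnoticed" alternative weighed against adopting a library.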

Sources

  1. Our library du jour: https://www.npmjs.com/package/export-to-csv
