2025-10-25
Sqlc is the darling of the Go community when it comes to interfacing with databases, and for good reason. It gives us type-safety, compile-time verification and LSP support for our queries, does away with the runtime overhead of an ORM, and, of course, provides transparency about exactly what SQL is being executed. It's the SQL you wrote yourself, wrapped in a friendly little Go function. On the other hand, it's natural for all those great boons to come with some, let's say, inconveniences. Or maybe dilemmas? Navigating those in an attempt to establish usage patterns that make sense is largely the crux of this post.
Just a little preface first. I'm not someone with multiple years of professional experience in writing Go services. I'm mostly documenting my process of figuring things out, introspecting aloud and hoping that the conclusions I'm reaching are pragmatic. If they are, they'll hopefully approach the well-established idioms that most seasoned Go developers have probably settled into already. If you find anything insightful, interesting or useful in this article, that's great. If you find something wrong or inefficient in my reasoning, even better! Please let me know! It's the main reason I'm putting all this out here in the first place.
Data submitted by the user through, say, a POST or a PATCH request has to go through a pilgrimage of sorts to reach the database. A transformative experience during which it will struggle to keep its identity intact while being morphed into various incarnations of itself. The application needs different representations of the same data for different purposes. A fundamental tension exists between those because each representation has its own constraints, its own goals to fulfill, and its own opinion about what the data means.
The user delivers a JSON payload that has to be unmarshalled into native Go types, and those, in turn, have to fit into pgtypes, which is what pgx uses. From there sqlc gets custody of the data and does its thing. Pgtypes, though, are not just type aliases. They are structs that wrap values with validity metadata.
Validity information is important for fields that can be omitted or explicitly set to their zero-values because, in order to be able to implement partial updates, we need to know which one of the two it was.
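For reference, this is roughly what such a struct looks like in pgx v5, using pgtype.Text as the example:

```go
// Roughly the shape of pgtype.Text in pgx v5: the value itself plus a Valid flag.
// Valid == false is how "no value here" (SQL NULL) is represented.
type Text struct {
	String string
	Valid  bool
}
```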
JSON has an elegant way of handling optional values: you can simply omit them. Go doesn't natively have optional types, but it has pointers and they can work as a stand-in.
If we unmarshal JSON into a struct of native Go types we will get a zero-value, for example an empty string "", both when the field was omitted and when it was explicitly set to a zero-value. That's a problem because those two situations represent two very different intentions on the user's part and we have no way of telling them apart.
Conversely, by unmarshalling into structs with pointer fields, if a user omits a field we will get a nil pointer for it, whereas if they submit an empty string we will get a pointer to an empty string, and their intention will now be apparent to us.
This distinction enables us to implement partial updates without the need to move away from Go's native types. The pointer's ability to be in a nil state makes it perfect for representing presence vs absence.
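A minimal illustration, with a hypothetical PATCH request body:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical PATCH body: pointer fields let us tell "omitted" apart from "set to zero".
type UpdateAuthorRequest struct {
	Name *string `json:"name"`
	Bio  *string `json:"bio"`
}

func main() {
	var req UpdateAuthorRequest
	_ = json.Unmarshal([]byte(`{"bio": ""}`), &req)

	fmt.Println(req.Name == nil) // true: "name" was omitted, leave it untouched
	fmt.Println(req.Bio == nil)  // false: "bio" was explicitly cleared to ""
}
```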
So, this is all great. We now have a straightforward way of getting the data into our Go program in a form that makes it easy to work with, validate, run business logic on, and do all sorts of programmer stuff with. But we still have to get it into the database, and it won't go in as a Go pointer. In most cases it will go in as a pgtype struct.
As we've covered already, pgtypes with their Valid field are pgx's, and by extension sqlc's, way of dealing with the question of "was this expressly set to a zero-value or was it omitted?". So, we need to take our pointer and shove it into a pgtype, making sure we imprint the pointer's nil-or-not state onto the Valid field correctly.
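A minimal sketch of such a converter pair (package and function names are my own):

```go
package convert

import "github.com/jackc/pgx/v5/pgtype"

// textFromPtr carries the pointer's nil-or-not state over into the Valid field.
func textFromPtr(s *string) pgtype.Text {
	if s == nil {
		return pgtype.Text{Valid: false}
	}
	return pgtype.Text{String: *s, Valid: true}
}

// ptrFromText is the mirror image, for the return trip out of the database.
func ptrFromText(t pgtype.Text) *string {
	if !t.Valid {
		return nil
	}
	return &t.String
}
```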
This ought to work fine. We will have to make a whole bunch of those converter functions, though, one for every type we need to support. And we'll have to call them for every field of our struct every time we want to insert or update something in the database. And we'll probably have to double all that, converting the opposite way, for the return trip. Seems like a bit too much boilerplate popped out of nowhere all of a sudden. That can't be right.
The more I look at it the worse it gets.
DISCLAIMER: The following code blocks are here just to drive the point home; you don't have to delve into them much.
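What follows is a condensed, illustrative stand-in for the real thing (the entity, struct and package names are hypothetical): a request struct full of pointers being mapped, field by field, onto the params struct that sqlc generated, using converters like the ones above, plus yet another converter pair for each pgtype involved.

```go
// Hypothetical request DTO and its mapping onto a (likewise hypothetical)
// sqlc-generated db.CreateAuthorParams. Every entity needs one of these for
// inserts, another for updates, and mirrors of both for reads.
type CreateAuthorRequest struct {
	Name *string `json:"name"`
	Bio  *string `json:"bio"`
	Age  *int32  `json:"age"`
}

func createAuthorParamsFromRequest(r CreateAuthorRequest) db.CreateAuthorParams {
	return db.CreateAuthorParams{
		Name: textFromPtr(r.Name),
		Bio:  textFromPtr(r.Bio),
		Age:  int4FromPtr(r.Age),
	}
}

// ...and a converter pair per pgtype used, e.g. for pgtype.Int4:
func int4FromPtr(i *int32) pgtype.Int4 {
	if i == nil {
		return pgtype.Int4{Valid: false}
	}
	return pgtype.Int4{Int32: *i, Valid: true}
}
```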
Yeah, I sat down and wrote it by hand. I was very motivated to have something that works at hand before affording myself the guilty pleasure of chasing down the "optimal" solution.
I know that DRY is not to be taken too seriously, but this code is trying to put as much distance between itself and DRY as it can. It's doing its best to meet the definition of WET (Write Everything Twice - Thrice even).
The perfectionist in me, the guy that wants, needs things to be idiomatic, compliant, and future-proof and every decision to be optimal, he's very anxious. And he's pestering me. "There has to be some obvious solution to this. Writing monstrous conversion functions like these can't be the standard. It's just the naive first attempt. This is definitely a solved problem, it has to be."
I listen to this guy a lot, so I go looking.
My quest quickly led me to a few potential holy grails.
Go's reflect package can inspect and manipulate types and values at runtime. The encoding/json package in the Go standard library makes use of it for marshalling and unmarshalling. Maybe I could use it to write a universal converter that maps fields automatically?
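As a rough sketch of the idea (nothing pgtype-aware yet, just field-by-field copying), such a converter might start out like this:

```go
package mapping

import "reflect"

// mapFields copies identically named, identically typed fields from *src to *dst.
// Every decision here is made at runtime, which is exactly the problem: a real
// version would need far more special-casing to produce pgtypes.
func mapFields(src, dst any) {
	sv := reflect.ValueOf(src).Elem()
	dv := reflect.ValueOf(dst).Elem()
	for i := 0; i < sv.NumField(); i++ {
		name := sv.Type().Field(i).Name
		if f := dv.FieldByName(name); f.IsValid() && f.CanSet() && f.Type() == sv.Field(i).Type() {
			f.Set(sv.Field(i))
		}
	}
}
```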
Thankfully we don't have to look too closely at it to realise it's not a very good fit. The promise is very tempting but there are several dissonances with the project's goals. The reason I went with sqlc in the first place was to gain explicitness and compile-time safety while ridding myself of any potential for runtime overhead. Using reflection would mean going back on every single one of those points. Having things conveniently happen out of sight is nice and all, but having them laid out clear as day in front of you is far more empowering, both for figuring stuff out and for debugging. Also, the encoding/json package's marshalling/unmarshalling is not particularly known for its speed and efficiency, if that's any hint about the runtime overhead (although, at the time of writing, a json/v2 that tries to address a lot of these issues is in public beta).
I guess I would feel better about using it if my data was truly unstructured and I had to wait until runtime to figure out what I'm dealing with, but in this case I know exactly what my types are going to be at compile-time. Deferring the deduction to runtime just to avoid boilerplate seems like an overcorrection.
Anyway, if I were eager to make those trade-offs I would have just used an ORM from the start. So, without going much deeper into it, I bid reflection farewell and moved on to the next candidate.
At a glance generics look like they're made exactly for solving this kind of problem. They are a new-ish feature of the language, added in Go 1.18 in 2022. They are a compile-time feature, which is great, and they don't really have any of reflection's characteristics that I considered drawbacks.
Generics are here to enable code reusability while maintaining type-safety. If you have code that is repetitive only because it needs to accommodate different types, even though the type itself might not really matter for the logic of that code, generics give you a way to write it just once and have it work for all of them.
The mental model seems straightforward enough: I have a pointer to some type T that could represent anything, and I want to convert it to some pgtype P. If the pointer is nil I return a pgtype with the Valid field set to false. Otherwise I extract the value from it and wrap it in a valid pgtype. Theoretically it would look something like this:
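(What follows is only a sketch of the idea; the commented-out lines are precisely the parts Go won't let us write, as explained below.)

```go
// Hypothetical generic converter: T is the native type, P the target pgtype.
func toPg[T any, P any](p *T) P {
	var out P
	if p == nil {
		// out.Valid = false   // can't touch fields of a bare type parameter
		return out
	}
	// out.<value field> = *p // no common field name or type to assign to
	// out.Valid = true
	return out
}
```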
Alas, it couldn't be as simple as this. Go's generics work through type constraints: they can only be used when types share behavior or structure in a way that can be expressed as an interface. The problem is that we are returning pgtypes and, however well suited they might be for it, they don't conform to a common interface that we could use as a constraint on P. Each pgtype is its own distinct struct. They all have a Valid field, and that's it. The field that holds the value has a different name and a different type in each of them. So, we can't just say "P must be a pgtype".
Could we provide them with a common interface ourselves? Yes. For that we would have to write wrapper types that implement that interface for all of them. And we would still have to write individual conversion functions from the wrapper types to the actual pgtypes that sqlc expects, and we're doing all that already - just with the pointers instead of wrappers.
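To illustrate (hypothetical names, and already as much ceremony as the thing it replaces):

```go
import "github.com/jackc/pgx/v5/pgtype"

// A common interface that wrapped pgtypes could implement...
type nullable[T any] interface {
	set(value T, valid bool)
}

// ...and one of the many wrappers we would have to write to satisfy it,
// each of which still needs converting back to the real pgtype for sqlc.
type textWrapper struct{ pgtype.Text }

func (w *textWrapper) set(value string, valid bool) {
	w.String, w.Valid = value, valid
}
```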
Worst of all, we would still need to manually map the fields from our input structs to the wrapper types. That, too, is the exact same work we're already doing. We've just created a convoluted middleman.
The issue is that our original code, although it is indeed repetitive, requires that we handle every individual type differently, meaning generics could never have been a good fit. The type itself does matter for the logic of our code.
So, in the end we would have only made things worse, adding layers just to move boilerplate around. Truly an exercise in futility, if even that.
Code generation is another way of achieving code reusability while maintaining type-safety. It has a history of being used by developers as a stand-in for generics until the introduction of the latter. When you can't ignore types, instead of writing repetitive code for multiple types, you write code that writes code for multiple types.
Unlike reflection-based approaches that work at runtime, code generation produces ordinary Go code that gets compiled normally, benefits from the same compiler checks and optimizations as hand-written code, shows up in stack traces when debugging, and, of course, can be inspected directly.
Also, unlike generics, code generation can actually deal with each different type explicitly. It just gives the developer the opportunity to avoid typing all the repetitive but type-bound code by hand. So, in our case, we would end up with the same amount of repetitive code plus the code generator instructions and templates and whatnot, but we wouldn't have to write everything ourselves anymore. We would only have to provide the latter to be graced with the former.
I am already using code generation extensively in this project thanks to sqlc, which generates all my database access code from my hand-written SQL queries. Similarly, other tools like Goverter can generate conversion functions from struct definitions or from annotations on struct fields.
In fact, it seems Goverter might have been designed with our specific use case in mind.
Skimming through the documentation, several features catch my eye. It looks like Goverter can automatically dereference pointers when mapping fields. It even has a setting to control how it should behave when it comes across a nil pointer: useZeroValueOnPointerInconsistency essentially dictates whether Goverter should just drop a zero value into the destination field when encountering a nil pointer. It also looks like it can handle fields from nested or embedded structs, to a certain extent.
As I showed you above, I have two distinct repetitive conversion patterns that I'd ideally like to eliminate - avoid - automate - whatever. The first is the struct-to-struct conversion: the pointer-filled structs that the JSON was unmarshalled into get mapped onto the structs that the sqlc-generated functions expect.
The second is the type conversion from those pointer types to the various pgtypes, which the struct converters use field by field.
If Goverter cannot figure out how it's meant to convert from one specific type to another, it can be fed custom converters for those specific situations. I don't see an obvious way to correlate the pointer's nil-or-not state to the Valid field of pgtypes using Goverter's native functionality, so we might have to keep the type converters from above and use them to enable the tool to generate the struct-to-struct mapping boilerplate. At least we can be fairly certain that one of the two repetitive patterns will be taken care of.
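A rough sketch of how that wiring might look, reusing the hypothetical names from earlier (interface and method names are mine, and the exact directives may differ between Goverter versions):

```go
// goverter:converter
// goverter:extend textFromPtr int4FromPtr
type Converter interface {
	// Goverter generates the field-by-field mapping and leans on the extended
	// hand-written converters above to get from *string and *int32 to the pgtypes.
	ToCreateAuthorParams(req CreateAuthorRequest) db.CreateAuthorParams
}
```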
The type converters are not that big of an issue anyway. I will need to write two of them, a something-to-pgtype and a pgtype-to-something, whenever I decide I have to utilize a new pgtype in my project. Copy-pasting from the existing ones and adapting them to the new type, every once in a blue moon, is not that big of a deal.
Consider the struct converters though. If I decide to add, say, a new table to my database and expose it through the API, I will have to write a whole new set of conversion functions for it. The thick ones. Those get thicker with the number of fields on the struct and more numerous with the number of entities I want to represent. That is the kind of repetition that I would very much like to automate away.
Goverter might be able to help us with the most important half of our repetition problem, then. Should I adopt it right away? The code it would generate will probably be very similar to what I have already written by hand, and I don't see a reason to throw that away just to regenerate the same thing through a tool. But if I were to expand the API (which I am planning to) and add more fields or, more importantly, more entities, I would be hard-pressed to justify writing more repetitive code by hand.
All in all, I think code generation seems like the best fit for what I'm trying to achieve here.
There's one more consideration worth going over. A code generation tool like Goverter would make it considerably easier to carve out a domain layer in my application: data transfer objects for JSON serialization/deserialization, pure Go structs with no external dependencies forming an independent domain layer for the business logic to operate on, and, finally, the structs that sqlc generates for the database layer. Goverter could trivialize the mapping between all those extra data representations, which would otherwise have been a significant hurdle.
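Concretely, and purely as a hypothetical illustration, the same entity would then exist in three shapes, with Goverter generating the mappings between them:

```go
// HTTP edge: a DTO with pointer fields so partial updates survive unmarshalling.
type AuthorDTO struct {
	Name *string `json:"name"`
	Bio  *string `json:"bio"`
}

// Domain: plain Go, no json tags, no pgx types; what the business logic sees.
type Author struct {
	Name string
	Bio  string
}

// Database: the sqlc-generated structs (e.g. db.Author, db.CreateAuthorParams)
// built on pgtypes. Goverter would generate the DTO <-> domain <-> db mappings.
```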
What would I gain by adding a domain layer to my application? Certainly a good deal of ease and convenience if I decide to expand the project's scope. I could swap out databases, add GraphQL, adopt gRPC & protobuf, even turn it into a fully-fledged Kafka event producer/consumer microservice, without having to rewrite the core business logic. The domain would remain pristine and unaware of how data flows in and out of the system. The business logic could be unit tested on its own without having to mock database calls or HTTP requests. Good things, all of them.
On the other hand, if I'm building a relatively simple web service with a well-defined scope that is not planned to expand, the YAGNI (You-Aren't-Gonna-Need-It) principle suggests that I shouldn't overengineer for hypothetical future requirements.
With Goverter now in the picture, adding a domain layer does not look that tedious anymore. What seemed like overengineering (manually creating, maintaining and, as time goes by, potentially having to massively expand all those conversions between layers) now becomes much more feasible. The main barrier, having to hand-write hundreds of lines of conversions, is removed.
After exploring reflection, generics, code generation and weighing the pros and cons of adding a domain layer to my application, I found myself surprisingly close to where I started. The big heavy-hitting solutions didn't seem to fit the problem as well as I initially hoped.
Runtime reflection is a great tool for when you don't know the form of your data beforehand and need to discover it at runtime. If that is not your situation, though, if you know your types at compile time, then you need to think hard about the trade-offs you are making. Maybe they are worth it to you, maybe not. In my case, I adopted sqlc specifically to gain type-safety and avoid runtime overhead. Reflection would have meant going back on both of those points just to avoid writing repetitive code. The question I need to be asking is not whether reflection can solve my code repetition problem. It can. The question, and I know it sounds cheesy - sorry, is at what cost.
Generics solve problems where types are interchangeable, where the type itself is not a real part of the problem space but just an implementation detail. If you want to split a slice, generics are perfect. The contents of the slice shouldn't matter, they just happen to be there and the splitting logic is identical regardless of whether the slice contains strings or integers or structs or anything else. In my situation the types are not interchangeable though. Each pgtype struct has different field names and different value types. The conversion from *string to pgtype.Text is fundamentally different from *int32 to pgtype.Int4.
In the end, code generation through a tool like Goverter technically does not rid me of a single line of code, but it does relieve me from the burden of writing it all by hand. From now on at least. Still, with what I've typed so far, if I wasn't planning on expanding the scope of the project I wouldn't gain anything by adopting it. It turns out the code I hand-typed is as good as it gets. All those repetitive conversion functions are apparently necessary and Goverter would have just generated something very similar in their stead.
It seems the repetitive code might be the right code after all, much to the perfectionist's dismay. All those typed-out, repetitive converter functions are largely necessary, but they are also clear and explicit. They solve the problem without ceremony and they're trivial to understand and modify. They'll do the job well enough, and when I need more of them I can always automate their creation.
Sometimes the best optimization is accepting that there is no optimization to be made.
Something else that looked promising on the surface was writing custom JSON marshalling/unmarshalling logic so the JSON could get dumped straight into pgtypes and vice versa, but then we'd have to deal with pgtypes in the validation and business logic of our application.