Transactions and other details
The database/sql package
As you’re probably starting to realize, the database/sql
package essentially provides a standard interface between your Go application and the world of SQL databases.
So long as you use the database/sql
package, the Go code you write will generally be portable and will work with any kind of SQL database — whether it’s MySQL, PostgreSQL, SQLite or something else. This means that your application isn’t so tightly coupled to the database that you’re currently using, and the theory is that you can swap databases in the future without re-writing all of your code (driver-specific quirks and SQL implementations aside).
It’s important to note that while database/sql
generally does a good job of providing a standard interface for working with SQL databases, there are some idiosyncrasies in the way that different drivers and databases operate. It’s always a good idea to read over the documentation for a new driver to understand any quirks and edge cases before you begin using it.
Verbosity
If you’re coming from Ruby, Python or PHP, the code for querying SQL databases may feel a bit verbose, especially if you’re used to dealing with an abstraction layer or ORM.
But the upside of the verbosity is that our code is non-magical; we can understand and control exactly what is going on. And with a bit of time, you’ll find that the patterns for making SQL queries become familiar, and you can copy and paste from previous work, or use developer tools like GitHub Copilot to write a first draft of the code for you.
If the verbosity really is starting to grate on you, you might want to consider trying the jmoiron/sqlx
package. It’s well designed and provides some good extensions that make working with SQL queries quicker and easier. Another, newer option you may want to consider is the blockloop/scan
package.
Managing null values
One thing that Go doesn’t do very well is managing NULL
values in database records.
Let’s pretend that the title
column in our snippets
table contains a NULL
value in a particular row. If we queried that row, then rows.Scan()
would return the following error because it can’t convert NULL
into a string:
sql: Scan error on column index 1: unsupported Scan, storing driver.Value type <nil> into type *string
Very roughly, the fix for this is to change the field that you’re scanning into from a string
to a sql.NullString
type. See this gist for a working example.
But, as a rule, the easiest thing to do is simply avoid NULL
values altogether. Set NOT NULL
constraints on all your database columns, like we have done in this book, along with sensible DEFAULT
values as necessary.
Working with transactions
It’s important to realize that calls to Exec()
, Query()
and QueryRow()
can use any connection from the sql.DB
pool. Even if you have two calls to Exec()
immediately next to each other in your code, there is no guarantee that they will use the same database connection.
Sometimes this isn’t acceptable. For instance, if you lock a table with MySQL’s LOCK TABLES
command you must call UNLOCK TABLES
on exactly the same connection to avoid a deadlock.
To guarantee that the same connection is used you can wrap multiple statements in a transaction. Here’s the basic pattern:
type ExampleModel struct {
    DB *sql.DB
}

func (m *ExampleModel) ExampleTransaction() error {
    // Calling the Begin() method on the connection pool creates a new sql.Tx
    // object, which represents the in-progress database transaction.
    tx, err := m.DB.Begin()
    if err != nil {
        return err
    }

    // Defer a call to tx.Rollback() to ensure it is always called before the
    // function returns. If the transaction succeeds it will already have been
    // committed by the time tx.Rollback() is called, making tx.Rollback() a
    // no-op. Otherwise, in the event of an error, tx.Rollback() will roll back
    // the changes before the function returns.
    defer tx.Rollback()

    // Call Exec() on the transaction, passing in your statement and any
    // parameters. It's important to notice that tx.Exec() is called on the
    // transaction object just created, NOT the connection pool. Although we're
    // using tx.Exec() here you can also use tx.Query() and tx.QueryRow() in
    // exactly the same way.
    _, err = tx.Exec("INSERT INTO ...")
    if err != nil {
        return err
    }

    // Carry out another statement in exactly the same way.
    _, err = tx.Exec("UPDATE ...")
    if err != nil {
        return err
    }

    // If there are no errors, the statements in the transaction can be
    // committed to the database with the tx.Commit() method.
    err = tx.Commit()
    return err
}
Transactions are also super-useful if you want to execute multiple SQL statements as a
single atomic action. So long as you use the tx.Rollback()
method in the event of any errors, the transaction ensures that either:
- All statements are executed successfully; or
- No statements are executed and the database remains unchanged.
Prepared statements
As I mentioned earlier, the Exec()
, Query()
and QueryRow()
methods all use prepared statements behind the scenes to help prevent SQL injection attacks. They set up a prepared statement on the database connection, run it with the parameters provided, and then close the prepared statement.
This might feel rather inefficient because we are creating and recreating the same prepared statements every single time.
In theory, a better approach could be to make use of the DB.Prepare()
method to create our own prepared statement once, and reuse that instead. This is particularly true for complex SQL statements (e.g. those which have multiple JOINs) that are repeated very often (e.g. a bulk insert of tens of thousands of records). In these instances, the cost of re-preparing statements may have a noticeable effect on run time.
Here’s the basic pattern for using your own prepared statement in a web application:
// We need somewhere to store the prepared statement for the lifetime of our
// web application. A neat way is to embed it in the model alongside the
// connection pool.
type ExampleModel struct {
    DB         *sql.DB
    InsertStmt *sql.Stmt
}

// Create a constructor for the model, in which we set up the prepared
// statement.
func NewExampleModel(db *sql.DB) (*ExampleModel, error) {
    // Use the Prepare method to create a new prepared statement for the
    // current connection pool. This returns a sql.Stmt object which represents
    // the prepared statement.
    insertStmt, err := db.Prepare("INSERT INTO ...")
    if err != nil {
        return nil, err
    }

    // Store it in our ExampleModel struct, alongside the connection pool.
    return &ExampleModel{DB: db, InsertStmt: insertStmt}, nil
}

// Any methods implemented against the ExampleModel struct will have access to
// the prepared statement.
func (m *ExampleModel) Insert(args ...any) error {
    // We then need to call Exec directly against the prepared statement, rather
    // than against the connection pool. Prepared statements also support the
    // Query and QueryRow methods.
    _, err := m.InsertStmt.Exec(args...)
    return err
}

// In the web application's main function we will need to initialize a new
// ExampleModel struct using the constructor function.
func main() {
    db, err := sql.Open(...)
    if err != nil {
        logger.Error(err.Error())
        os.Exit(1)
    }
    defer db.Close()

    // Use the constructor function to create a new ExampleModel struct.
    exampleModel, err := NewExampleModel(db)
    if err != nil {
        logger.Error(err.Error())
        os.Exit(1)
    }

    // Defer a call to Close() on the prepared statement to ensure that it is
    // properly closed before our main function terminates.
    defer exampleModel.InsertStmt.Close()
}
There are a few things to be wary of though.
Prepared statements exist on database connections. So, because Go uses a pool of many database connections, what actually happens is that the first time a prepared statement (i.e. the sql.Stmt
object) is used it gets created on a particular database connection. The sql.Stmt
object then remembers which connection in the pool was used. The next time, the sql.Stmt
object will attempt to use the same database connection again. If that connection is closed or in use (i.e. not idle) the statement will be re-prepared on another connection.
Under heavy load, it’s possible that a large number of prepared statements will be created on multiple connections. This can lead to statements being prepared and re-prepared more often than you would expect — or even running into server-side limits on the number of statements (in MySQL the default maximum is 16,382 prepared statements).
The code is also more complicated than not using prepared statements.
So, there is a trade-off to be made between performance and complexity. As with anything, you should measure the actual performance benefit of implementing your own prepared statements to determine if it’s worth doing. For most cases, I would suggest that using the regular Query()
, QueryRow()
and Exec()
methods — without preparing statements yourself — is a reasonable starting point.