Transactions and other details
The database/sql package
As you’re probably starting to realize, the database/sql
package essentially provides a standard interface between your Go application and the world of SQL databases.
So long as you use the database/sql
package, the Go code you write will generally be portable and will work with any kind of SQL database — whether it’s MySQL, PostgreSQL, SQLite or something else. This means that your application isn’t so tightly coupled to the database that you’re currently using, and the theory is that you can swap databases in the future without re-writing all of your code (driver-specific quirks and SQL implementations aside).
It’s important to note that while database/sql
generally does a good job of providing a standard interface for working with SQL databases, there are some idiosyncrasies in the way that different drivers and databases operate. It’s always a good idea to read over the documentation for a new driver to understand any quirks and edge cases before you begin using it.
Verbosity
If you’re coming from Ruby, Python or PHP, the code for querying SQL databases may feel a bit verbose, especially if you’re used to dealing with an abstraction layer or ORM.
But the upside of the verbosity is that our code is non-magical; we can understand and control exactly what is going on. And with a bit of time, you’ll find that the patterns for making SQL queries become familiar, and you can copy and paste from previous work, or use developer tools like GitHub Copilot to write a first draft of the code for you.
If the verbosity really is starting to grate on you, you might want to consider trying the jmoiron/sqlx
package. It’s well designed and provides some good extensions that make working with SQL queries quicker and easier. Another, newer option you may want to consider is the blockloop/scan
package.
Managing null values
One thing that Go doesn’t do very well is managing NULL
values in database records.
Let’s pretend that the title
column in our snippets
table contains a NULL
value in a particular row. If we queried that row, then rows.Scan()
would return the following error because it can’t convert NULL
into a string:
sql: Scan error on column index 1: unsupported Scan, storing driver.Value type <nil> into type *string
Very roughly, the fix for this is to change the field that you’re scanning into from a string
to a sql.NullString
type. See this gist for a working example.
But, as a rule, the easiest thing to do is simply avoid NULL
values altogether. Set NOT NULL
constraints on all your database columns, like we have done in this book, along with sensible DEFAULT
values as necessary.
Working with transactions
It’s important to realize that calls to Exec()
, Query()
and QueryRow()
can use any connection from the sql.DB
pool. Even if you have two calls to Exec()
immediately next to each other in your code, there is no guarantee that they will use the same database connection.
Sometimes this isn’t acceptable. For instance, if you lock a table with MySQL’s LOCK TABLES
command you must call UNLOCK TABLES
on exactly the same connection to avoid a deadlock.
To guarantee that the same connection is used you can wrap multiple statements in a transaction. Here’s the basic pattern:
type ExampleModel struct {
    DB *sql.DB
}

func (m *ExampleModel) ExampleTransaction() error {
    // Calling the Begin() method on the connection pool creates a new sql.Tx
    // object, which represents the in-progress database transaction.
    tx, err := m.DB.Begin()
    if err != nil {
        return err
    }

    // Defer a call to tx.Rollback() to ensure it is always called before the
    // function returns. If the transaction succeeds it will already have been
    // committed by the time tx.Rollback() is called, making tx.Rollback() a
    // no-op. Otherwise, in the event of an error, tx.Rollback() will roll back
    // the changes before the function returns.
    defer tx.Rollback()

    // Call Exec() on the transaction, passing in your statement and any
    // parameters. It's important to notice that tx.Exec() is called on the
    // transaction object just created, NOT the connection pool. Although we're
    // using tx.Exec() here you can also use tx.Query() and tx.QueryRow() in
    // exactly the same way.
    _, err = tx.Exec("INSERT INTO ...")
    if err != nil {
        return err
    }

    // Carry out another statement in exactly the same way.
    _, err = tx.Exec("UPDATE ...")
    if err != nil {
        return err
    }

    // If there are no errors, the statements in the transaction can be
    // committed to the database with the tx.Commit() method.
    err = tx.Commit()
    return err
}
Transactions are also super-useful if you want to execute multiple SQL statements as a
single atomic action. So long as you use the tx.Rollback()
method in the event of any errors, the transaction ensures that either:
- All statements are executed successfully; or
- No statements are executed and the database remains unchanged.
Prepared statements
As I mentioned earlier, the Exec()
, Query()
and QueryRow()
methods all use prepared statements behind the scenes to help prevent SQL injection attacks. They set up a prepared statement on the database connection, run it with the parameters provided, and then close the prepared statement.
This might feel rather inefficient because we are creating and recreating the same prepared statements every single time.
In theory, a better approach could be to make use of the DB.Prepare()
method to create our own prepared statement once, and reuse that instead. This is particularly true for complex SQL statements (e.g. those which have multiple JOINs) that are repeated very often (e.g. a bulk insert of tens of thousands of records). In these instances, the cost of re-preparing statements may have a noticeable effect on run time.
Here’s the basic pattern for using your own prepared statement in a web application:
// We need somewhere to store the prepared statement for the lifetime of our
// web application. A neat way is to embed it in the model alongside the
// connection pool.
type ExampleModel struct {
    DB         *sql.DB
    InsertStmt *sql.Stmt
}

// Create a constructor for the model, in which we set up the prepared
// statement.
func NewExampleModel(db *sql.DB) (*ExampleModel, error) {
    // Use the Prepare method to create a new prepared statement for the
    // current connection pool. This returns a sql.Stmt object which represents
    // the prepared statement.
    insertStmt, err := db.Prepare("INSERT INTO ...")
    if err != nil {
        return nil, err
    }

    // Store it in our ExampleModel struct, alongside the connection pool.
    return &ExampleModel{DB: db, InsertStmt: insertStmt}, nil
}

// Any methods implemented against the ExampleModel struct will have access to
// the prepared statement.
func (m *ExampleModel) Insert(args ...any) error {
    // We then need to call Exec directly against the prepared statement, rather
    // than against the connection pool. Prepared statements also support the
    // Query and QueryRow methods.
    _, err := m.InsertStmt.Exec(args...)
    return err
}

// In the web application's main function we will need to initialize a new
// ExampleModel struct using the constructor function.
func main() {
    db, err := sql.Open(...)
    if err != nil {
        logger.Error(err.Error())
        os.Exit(1)
    }
    defer db.Close()

    // Use the constructor function to create a new ExampleModel struct.
    exampleModel, err := NewExampleModel(db)
    if err != nil {
        logger.Error(err.Error())
        os.Exit(1)
    }

    // Defer a call to Close() on the prepared statement to ensure that it is
    // properly closed before our main function terminates.
    defer exampleModel.InsertStmt.Close()
}
There are a few things to be wary of though.
Prepared statements exist on database connections. So, because Go uses a pool of many database connections, what actually happens is that the first time a prepared statement (i.e. the sql.Stmt
object) is used it gets created on a particular database connection. The sql.Stmt
object then remembers which connection in the pool was used. The next time, the sql.Stmt
object will attempt to use the same database connection again. If that connection is closed or in use (i.e. not idle) the statement will be re-prepared on another connection.
Under heavy load, it’s possible that a large number of prepared statements will be created on multiple connections. This can lead to statements being prepared and re-prepared more often than you would expect — or even running into server-side limits on the number of statements (in MySQL the default maximum is 16,382 prepared statements).
The code is also more complicated than not using prepared statements.
So, there is a trade-off to be made between performance and complexity. As with anything, you should measure the actual performance benefit of implementing your own prepared statements to determine if it’s worth doing. For most cases, I would suggest that using the regular Query()
, QueryRow()
and Exec()
methods — without preparing statements yourself — is a reasonable starting point.