Thursday, September 3, 2015

reducing duplicate SQL in Go project

Recently I started writing reports for a client of mine. At first I thought it was going to be just a few reports but over time a number of things have happened. (a) more and more reports (b) even more reports (c) I'm getting lazy so my tools are starting to scale (d) I'm getting lazier and I also have a need to reuse code without cut, copy, paste. (e) the need to share code that the reports generate similar results when based on the same foundation.

While part of the implementation means using CTEs (common template expression) it's not the whole story as I implemented a complete reporting engine that exports to CSV, TSV, text tables, XSLX, JSON, DOT, go templates, and supports it's own DSL including loops and dynamic queries.

In my current implementation I store the SQL in bash, yes bash, shell scripts that export the SQL names like this:

export hello_cte="hello_cte (hello) as (select 'HELLO')"
export hello=";with ${hello_cte} select * from hello"

In one case I need to pass variables into the SQL:

export hello_cte="hello_cte (hello) as (select 'HELLO')"
declare Hello_${name}=";with ${hello_cte} select *, '${name}' from hello"

Then the SQL get's baked into the go code using my envtmpl project. envtmpl works with go's generate action to execute a template and replace the template variables with data from the environment... the SQL. (in the example above the ${name} is expanded by bash not go templates)

NOTE: top level queries are named with a leading uppercase letter and use camel case.

This framework works well for me as I have over 50 reports, however, there are two/three downsides. (1) because I'm using bash variables the SQL statements are losing their formatting [bash's multi-line strings are kludgy at best] (2) any benefit from a modern text editor with syntax highlighting is impossible even though some editors support multiple languages. (3) the interaction between bash and go is so tightly coupled it's going to make decoupling a challenge.

As an aside I've been looking for a SQL reformatter that is written or can be embedded into a go project. I'm not going to use a 3rd party service for the very obvious reasons.

It was the search for a reformatter that offered a glint into the future. I found dotsql. The dotsql project is essentially a librarian and an execution wrapper around go's sql. Give the Load() method the name of a file, reader, or string and it will be parsed into a map[string]string. Then the SQL can be accessed by Prepare, Exec, etc... however, there are two interesting methods. Raw() and QueryMap().  With these two methods I can use my CTE strategy above and get all the benefits I was hoping for as I described my problem space.

var (
        doc = `
-- name: hello_cte
hello_cte (hello) as (select 'HELLO')

-- name: hello
;with
{{.hello_cte}}
select * from hello_cte
        `
)

Notice the embedded {{.hello_cte}} in SQL.

Here is the complete source:

package main

import (
        "bytes"
        "log"
        "text/template"

        "github.com/gchaincl/dotsql"
)

var (
        doc = `
-- name: hello_cte
hello_cte (hello) as (select 'HELLO')
-- name: hello
;with
{{.hello_cte}}
select * from hello_cte
        `
)

func main() {
        d, err := dotsql.LoadFromString(doc)
        if err != nil {
                log.Printf("ERROR: %v", err)
        }
        sql, err := d.Raw("hello")
        if err != nil {
                log.Printf("ERROR: %v", err)
        }
        tmpl := template.Must(template.New("dotsql").Parse(sql))
        buf := bytes.NewBufferString("")
        err = tmpl.Execute(buf, d.QueryMap())
        if err != nil {
                log.Printf("ERROR: %v", err)
        }
        log.Printf("Finished:\n%s\n", buf)
}

And this is the output:

2015/09/03 11:56:12 Finished:
;with
hello_cte (hello) as (select 'HELLO')
select * from hello_cte

I still have to validate the dotsql behavior against my DSL but I'm confident it'll work since the '-- name:' seems to be a token used to separate the SQL statements from each other and my DSL uses '----[]----' for a similar but different purpose.

And eventually I'll also use go-bindata in order to embed the complete SQL string into the executable.

CRAP! Just as I wrote that last sentence I realized I could have accomplished the (almost) same thing with go-bindata. That exercise is left to the reader. dotsql has one advantage and that is more than one SQL statement in each file or a single master file. Where go-bindata requires a single file per SQL statement.

No comments:

Post a Comment

another bad day for open source

One of the hallmarks of a good open source project is just how complicated it is to install, configure and maintain. Happily gitlab and the ...