Go: Easy to Read, Hard to Compile
Corner cases when compiling Go
Ian Lance Taylor
Ian Lance Taylor
This talk is based on Go compiler bugs encountered over the years.
2
Names in Go packages are defined in the entire package, so Go types
can refer to themselves recursively.
type P *P type S []S type C chan C type M map[int]M
This is not permitted in C/C++, except for the special case of a
struct/union/class field which is a pointer/reference.
All Go compiler code that walks over types has to be careful to avoid
endless loops.
What good is a recursive pointer type? It can only be nil or a
pointer to itself. That's enough for Peano arithmetic.
func Val(p *P) int { if p == nil { return 0 } else { return 1 + Val(*p) } } func Add(a, b *P) *P { if b == nil { return a } else { a1 := new(P) *a1 = a // a1 == a + 1 return Add(a1, *b) // a + b == Add(a+1, b-1) } }
Actually, a recursive pointer can have a bit more information: it can
have a finalizer.
// +build ignore
package main
import (
"fmt"
"runtime"
)
// 1 START OMIT
type P *P
type S []S
type C chan C
type M map[int]M
// 1 END OMIT
// 2 START OMIT
func Val(p *P) int {
if p == nil {
return 0
} else {
return 1 + Val(*p)
}
}
func Add(a, b *P) *P {
if b == nil {
return a
} else {
a1 := new(P)
*a1 = a // a1 == a + 1
return Add(a1, *b) // a + b == Add(a+1, b-1)
}
}
// 2 END OMIT
func Print(p *P) { fmt.Println(Val(p)) } func Allocate() { p := new(P); *p = new(P); **p = new(P) runtime.SetFinalizer(p, Print) runtime.SetFinalizer(*p, Print) runtime.SetFinalizer(**p, Print) } func main() { Allocate() for i := 0; i < 5; i++ { runtime.GC() runtime.Gosched() } }
Recursive function types are actually useful: they can implement a
state machine.
type F func(*State) F type State int func Begin(s *State) F { *s = 1 return Middle } func Middle(s *State) F { *s++ if *s >= 10 { return End } return Middle }
// +build ignore
package main
import "fmt"
// 1 START OMIT
type F func(*State) F
type State int
func Begin(s *State) F {
*s = 1
return Middle
}
func Middle(s *State) F {
*s++
if *s >= 10 {
return End
}
return Middle
}
// 1 END OMIT
func End(s *State) F { fmt.Println(*s) return nil } func main() { var f F = Begin var s State for f != nil { f = f(&s) } }
Simple rule: all names at package scope are visible in the entire
package.
Complex consequence: compiler must handle recursive types (also
recursive initializers).
Go has both typed and untyped constants. They follow the same rules,
except that a typed constant must be representable in its type.
This is reasonably clear for integers, less so for floats.
// +build ignore
package main
import "fmt"
const C1 = 1e-323 const C2 = C1 / 100 const C3 = C2 * 100 const C4 float64 = C1 / 100 const C5 = C4 * 100 func main() { fmt.Println(C3, C5) }
Go's floating point variables follow IEEE-754 rules.
Constants do not.
// +build ignore
package main
import "fmt"
const C1 = 1e+308 const C2 = C1 * 10 const C3 = C2 / 10 var V1 = C1 var V2 = V1 * 10 var V3 = V2 / 10 func main() { fmt.Println(C3, V3) }
The special unsafe.Sizeof function returns a constant.
// +build ignore
package main
import (
"fmt"
"unsafe"
)
var V1 = 0x01020304 var V2 [unsafe.Sizeof(V1)]byte func main() { *(*int)(unsafe.Pointer(&V2)) = V1 fmt.Println(V2) }
Simple rule: constants are untyped; they are mathematically exact and
do not require type conversions.
Complex consequence: exact floating point behavior depends on the
type.
Name lookup in a Go compiler is simple compared to many languages.
For every name the scope in which to look it up is obvious. This
makes parsing Go quite simple.
With one exception. What is the scope for i?
func main() { i := 1 f := func() T { return T{ i: 1, } } fmt.Println(i, f()) }
One possibility.
// +build ignore
package main
import "fmt"
func main() { i := 1 f := func() T { return T{ i: 1, } } fmt.Println(i, f()) } type T map[int]int
Another possibility.
// +build ignore
package main
import "fmt"
func main() { i := 1 f := func() T { return T{ i: 1, } } fmt.Println(i, f()) } type T struct{ i int }
Simple rule: in a struct composite literal you can use field names as
keys.
Complex consequence: if you don't know the type of the composite
literal, the lookup scope of names used as keys is unclear when
parsing.
Any named type can have methods. Any struct type can inherit methods
from an embedded field. It follows that you can sometimes call
methods on a variable even if it has an unnamed type.
// +build ignore
package main
import (
"fmt"
"os"
)
var V = struct { name string os.FileMode }{ name: "hello.go", } func main() { fmt.Println(V) }
Simple rules: named types can have methods; structs can have embedded
fields.
Complex consequence: unnamed types can have methods.
18