proposal: runtime: manage off-heap memory allocations #70224
Comments
I'm not entirely sure why this wouldn't work for you:
|
CC @golang/runtime |
FWIW this has come up between a few of us before (CC @dr2chase) and there are some benefits to direct support vs. @randall77's approach (which would also work). Mainly, it just smooths over some friction with interacting with non-Go managed memory, including C values, and makes them a little less error-prone to work with. Aside from that, I don't like the name Lastly, I wonder if this is something that should go in the |
Leaving aside the question of whether this should go in |
@prattmic It almost could, but |
I believe that proposed feature is valuable beyond ebpf use cases as well. The challenge with It is true that Golang has What if a package needs to expose data in (For a concrete example, consider memory-mapped |
@randall77 Thanks for the suggestion, but as you'd expect I glossed over a lot of the details that led us here. We've prototyped nearly every API we could come up with, but we keep running into potential memory safety issues. Check the two PRs I linked from if you're interested.
My first attempt looked a little like what you proposed, but since this is library code, it's hard to build something flexible enough to be useful to the majority of users. The underlying types represented by this memory can be any C type, including structs, including ones with fields accessed atomically, etc. Initially we had a bunch of functions returning
Also, a mapping represents a datasec, so it potentially contains any number of variables. We could implement a refcount mechanism on the
// Memory is just a []byte and a bool as a readonly flag.
func MemoryPointer[T any](mm *Memory, off uint64) (*T, error)
// A Variable is carved out of a Memory at a given offset.
func VariablePointer[T any](v *Variable) (*T, error)
Another thing I tried was returning a
u32, _ := VariablePointer[uint32](v)
(**u32) += 1
// or
a64, _ := VariablePointer[atomic.Uint64](v)
(*a64).Add(1)
This technically works, but it's clunky and requires more unnecessary pointer chasing. If we ever end up in a situation where we can return
@mknyszek Thanks for the input!
This is still a rough proposal, naming should probably be decided by someone more knowledgeable about the runtime/allocator/GC. 😉
Actually... 😉 Linux's bpf uapi is now frozen, which means no new map types will be introduced solely for bringing new data types to bpf. Going forward, new shared user/kernel datatypes will need to be implemented on top of a so-called 'bpf arena', which is a 4 GiB range of memory that can be mapped into user space, exactly like the array map I demonstrated above. These contain pages allocated dynamically by bpf programs. Either side can write pointers into this arena, and as long as they point to somewhere inside the arena itself, the kernel will translate the pointers between kernel and user address space. (Note: it knows which values are pointers, but that's another story.) Any pointers pointing outside the arena cause an exception when dereferenced from within a bpf program.
Just to add some nuance to your statement, 'cannot' should really be 'should not'. The user indeed shouldn't expect the GC to follow an off-heap pointer into the Go heap when scanning for references, so when designing these data types, care needs to be taken not to take a caller-provided Go pointer and stuff it somewhere off-heap. This property is definitely something we should document if/when working on implementing this proposal.
As I understand it, the Go Arenas concept is stalled for similar reasons, though in reverse. (Go pointers into an arena can become dangling when the arena is freed.)
Makes sense, but nothing in the @mejedi floated making this an implicit part of b, _ := unix.Mmap(...)
runtime.TrackPointers(unsafe.SliceData(b), len(b), func (ptr unsafe.Pointer) {
	unix.Munmap(unsafe.Slice((*byte)(ptr), len(b)))
}) Using |
I mean that's fine. That's just writing a non-Go pointer into that memory. Maybe I misunderstand your point.
Er, sure. You can pin the memory. This is closely related to the cgo pointer rules, so what I really meant to say was you absolutely cannot write unpinned Go pointers into C memory.
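To make the pinning rule concrete, here is a minimal sketch of storing a Go pointer in non-Go memory only after pinning it. It assumes Go 1.21 or newer (for runtime.Pinner) and a Unix-like system where syscall.Mmap supports anonymous mappings. The helpers newOffHeapSlot, storePinned and loadPinned are hypothetical names made up for this illustration; they are not part of the proposal.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
	"unsafe"
)

// newOffHeapSlot maps one anonymous page outside the Go heap and returns a
// pointer-sized slot at its start, plus a function that unmaps the page.
// Hypothetical helper for this sketch only.
func newOffHeapSlot() (*unsafe.Pointer, func() error, error) {
	mem, err := syscall.Mmap(-1, 0, syscall.Getpagesize(),
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, nil, err
	}
	slot := (*unsafe.Pointer)(unsafe.Pointer(&mem[0]))
	return slot, func() error { return syscall.Munmap(mem) }, nil
}

// storePinned pins v and only then writes its address into the off-heap
// slot. The GC does not scan the slot, so the pin is what keeps v valid.
// The caller must clear the slot and call Unpin on the returned Pinner.
func storePinned(slot *unsafe.Pointer, v *uint64) *runtime.Pinner {
	p := new(runtime.Pinner)
	p.Pin(v)
	*slot = unsafe.Pointer(v)
	return p
}

// loadPinned reads the Go pointer back out of the off-heap slot.
func loadPinned(slot *unsafe.Pointer) *uint64 {
	return (*uint64)(*slot)
}

func main() {
	slot, release, err := newOffHeapSlot()
	if err != nil {
		panic(err)
	}
	defer release()

	v := new(uint64)
	*v = 7
	p := storePinned(slot, v)
	*loadPinned(slot) += 1 // update v through the off-heap copy of its address
	fmt.Println(*v)

	*slot = nil // clear the off-heap reference before unpinning
	p.Unpin()
}
```

The ordering matters in both directions: pin before the pointer becomes visible off-heap, and remove the off-heap copy before unpinning.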
That is a fair point; it's just an idea. Mainly what I'm trying to get across is that there should be alarm bells ringing inside anyone's head when they see this function being used that something subtle is happening. The
Yeah, that would be a more specific subset of this functionality. I'm pretty sure this has also come up before a bunch of times, though I'm not certain there's an issue open for it (wouldn't be surprised if there is). Thing is, if we have a mechanism in the runtime for
One thing not discussed in this issue yet is the actual implementation of this functionality. That part is easy if we place restrictions on the size and shape of these regions (must be at least one physical page in size). If we want to support arbitrarily-sized things, like anything that comes from malloc, that becomes substantially harder, because we need full-on byte-level tracking (although this would certainly be the right thing to do). |
For whatever this one opinion is worth, the fact that the first argument is an |
@mknyszek I've updated the original proposal and cut out all of the cruft. I've added a few paragraphs on high-level implementation based on what we've discussed offline. I landed on How can we get this proposal to be formally accepted? |
A few notes on the API.
Below is a modified API with these updates.
A single global tree is not a bad idea, but it's not clear to me that the Maple Tree would be the right data structure for us. There'd be frequent random access from the GC (so the fact that prev and succ are cache-efficient is somewhat pointless for us) and I'm a bit concerned about writers having to acquire a lock. A lock for the whole tree? Or just a subset? I'm concerned about adding a new scalability issue if it's the former. For this sort of thing we tend to prefer a radix-style index over the heap, which is how things are currently structured. Could we do both and reuse the (We have an advantage over Linux by having a GC, in that reclaiming memory from a manually managed tree structure in this case is much easier for us than in the Linux world. We can reap dead nodes after the sweep phase ends, since until the next sweep phase, no more nodes can die. No need for a bespoke RCU system.) |
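For illustration, a radix-style index keyed by page number could look roughly like the sketch below. This is not how the runtime's heap index is implemented; the type pageIndex, its methods, the 4 KiB page size and the 4 GiB address limit are all assumptions chosen to keep the example small. Lookups are two array indexing steps with no comparisons, which is what makes this shape attractive for frequent random access from the GC.

```go
package main

import "fmt"

const (
	pageShift = 12 // assumption: 4 KiB pages
	l2Bits    = 10 // pages per leaf = 1024
	l1Bits    = 10 // leaves in the root = 1024
	l2Mask    = (1 << l2Bits) - 1
	maxPages  = 1 << (l1Bits + l2Bits) // sketch covers 4 GiB of address space
)

// pageIndex maps a page number to a region ID (0 means "no region").
// Hypothetical type for this sketch; leaves are allocated on demand.
type pageIndex struct {
	l1 [1 << l1Bits]*[1 << l2Bits]uint32
}

// mark records id for every page touched by [base, base+size).
// It reports false for empty, overflowing, or out-of-range regions.
func (ix *pageIndex) mark(base, size uintptr, id uint32) bool {
	if size == 0 || base+size < base {
		return false
	}
	first, last := base>>pageShift, (base+size-1)>>pageShift
	if last >= maxPages {
		return false
	}
	for p := first; p <= last; p++ {
		leaf := ix.l1[p>>l2Bits]
		if leaf == nil {
			leaf = new([1 << l2Bits]uint32)
			ix.l1[p>>l2Bits] = leaf
		}
		leaf[p&l2Mask] = id
	}
	return true
}

// lookup returns the region ID covering addr, or 0 if there is none.
func (ix *pageIndex) lookup(addr uintptr) uint32 {
	p := addr >> pageShift
	if p >= maxPages {
		return 0
	}
	leaf := ix.l1[p>>l2Bits]
	if leaf == nil {
		return 0
	}
	return leaf[p&l2Mask]
}

func main() {
	ix := new(pageIndex)
	ix.mark(0x10000, 0x3000, 7) // three pages
	ix.mark(0x20000, 16, 9)     // a 16-byte region still claims a whole page
	fmt.Println(ix.lookup(0x12fff), ix.lookup(0x13000), ix.lookup(0x20800))
}
```

Note the last lookup: an address well outside the 16-byte region still resolves to it, because the index cannot distinguish anything smaller than a page. Supporting arbitrarily sized regions would need extra per-page metadata or a different structure.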
The next steps are to get feedback and iterate on the proposal. Once a consensus starts to form, it would be helpful to have a prototype and learn how well it works and see if we can find some unanticipated issues with the design or API. Thanks for your efforts here! |
Proposal Details
Hi folks, I'd like to explore the possibility for the runtime to 'adopt' externally-allocated memory by tracking pointers to the span and unmapping the underlying memory if there are no more references. This opens up many interesting use cases for folks that need to interact with C or other FFI memory, or maybe even for writing custom allocators.
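As a point of comparison, here is a minimal sketch of what can be approximated today without runtime support: wrapping a mapping in a Go object and unmapping it from a finalizer. The Mapping type and NewMapping function are hypothetical names for this example, and the code assumes a Unix-like system where syscall.Mmap supports anonymous mappings.

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
)

// Mapping owns an anonymous memory mapping that lives outside the Go heap.
// Hypothetical type for this sketch; it is not the API proposed here.
type Mapping struct {
	data []byte
}

// NewMapping maps size bytes of anonymous memory and arranges for it to be
// unmapped once the *Mapping itself becomes unreachable.
func NewMapping(size int) (*Mapping, error) {
	b, err := syscall.Mmap(-1, 0, size,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, err
	}
	m := &Mapping{data: b}
	// The finalizer tracks only the wrapper. Pointers into m.data do not
	// keep m alive, so they can dangle after the finalizer has run.
	runtime.SetFinalizer(m, func(m *Mapping) { _ = syscall.Munmap(m.data) })
	return m, nil
}

// Bytes exposes the mapped memory. Callers must keep m reachable for as
// long as they use the returned slice.
func (m *Mapping) Bytes() []byte { return m.data }

func main() {
	m, err := NewMapping(4096)
	if err != nil {
		panic(err)
	}
	b := m.Bytes()
	b[0] = 42
	fmt.Println(b[0], len(b))
	runtime.KeepAlive(m) // without this, m could be finalized while b is in use
}
```

The need for runtime.KeepAlive is the friction this proposal is about: the GC sees references to the wrapper, but not references into the mapped bytes themselves.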
API
To be used as:
Implementation
Incorporating early feedback on this proposal, it would be ideal if this would support memory ranges of arbitrary sizes. As far as I can tell, the runtime doesn't have a facility for representing ranges smaller than a page, so @mknyszek suggested that a new data structure may need to be introduced.
The Linux kernel 'recently' switched its memory management over to the Maple Tree, a B-tree variant optimized for storing non-overlapping ranges, which happens to be exactly what we need as well. The kernel implementation is completely general-purpose and very complex, but we might be able to get away with a subset of it.
If the runtime has active manual memory mappings, the garbage collector would first try to find a pointer's mspan, falling through to querying the Maple Tree if that turns up nothing. During a cycle, Maple Tree nodes would be considered heap allocations, scheduling a cleanup if all references are gone.
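The fall-through lookup can be sketched with a much simpler structure than a Maple Tree. The example below keeps non-overlapping ranges in a sorted slice and finds the range containing an address by binary search. It is a stand-in for illustration only: region, regionIndex and their methods are hypothetical names, the mspan lookup that would come first is not modelled, and there is no concurrency control.

```go
package main

import (
	"fmt"
	"sort"
)

// region describes one externally allocated range [base, base+size).
type region struct {
	base, size uintptr
}

// regionIndex keeps non-overlapping regions sorted by base address.
// It is a deliberately simple stand-in for a Maple Tree, not a port of it.
type regionIndex struct {
	regions []region
}

// insert adds r, rejecting empty ranges and ranges that overlap an
// existing one.
func (ix *regionIndex) insert(r region) bool {
	if r.size == 0 {
		return false
	}
	// i is the position of the first region whose base is >= r.base.
	i := sort.Search(len(ix.regions), func(i int) bool {
		return ix.regions[i].base >= r.base
	})
	if i > 0 {
		prev := ix.regions[i-1]
		if prev.base+prev.size > r.base {
			return false // overlaps the region before it
		}
	}
	if i < len(ix.regions) && r.base+r.size > ix.regions[i].base {
		return false // overlaps the region after it
	}
	ix.regions = append(ix.regions, region{})
	copy(ix.regions[i+1:], ix.regions[i:])
	ix.regions[i] = r
	return true
}

// lookup returns the region containing addr, if any. In the proposal's
// terms, this is the query made after the mspan lookup finds nothing.
func (ix *regionIndex) lookup(addr uintptr) (region, bool) {
	// i is the first region whose base is > addr; the candidate is i-1.
	i := sort.Search(len(ix.regions), func(i int) bool {
		return ix.regions[i].base > addr
	})
	if i == 0 {
		return region{}, false
	}
	r := ix.regions[i-1]
	if addr-r.base < r.size {
		return r, true
	}
	return region{}, false
}

func main() {
	var ix regionIndex
	ix.insert(region{base: 0x1000, size: 0x100})
	ix.insert(region{base: 0x2000, size: 0x10})
	_, inside := ix.lookup(0x1080)
	_, outside := ix.lookup(0x1100)
	fmt.Println(inside, outside)
}
```

Unlike a page-indexed structure, this handles ranges of any size, at the cost of a logarithmic search per lookup and linear-time inserts; a real implementation would also need a strategy for concurrent readers and writers.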
I originally got this idea from https://pkg.go.dev/github.com/Jille/gcmmap, a package that mmaps over the Go heap using MAP_FIXED. It uses runtime.mallocgc(), but allocating a byte slice works just as well. This approach works, but makes several hard assumptions:
runtime.heapObjectsCanMove
now)
PROT_READ|PROT_WRITE and MAP_ANON|MAP_PRIVATE
Please let me know what you think. Thank you!