zigpointers

zig 2019-08-20

15:57 donpdonp sorry if this is only partial information - ive got for(mySlice) |thing| { var newOtherThing = OtherThing.init(allocator); warn("otherThing {*}\n", &newOtherThing); }
15:58 donpdonp and I get 'OtherThing()@7ffd5d02efb0' 'OtherThing()@7ffd5d02efb0' 'OtherThing()@7ffd5d02efb0' all the same address
15:58 Tetralux The OtherThing is on the stack.
15:58 Tetralux So each iteration of the loop
15:59 Tetralux It's in the same place.
15:59 Tetralux (because the previous instances are being overwritten)
15:59 Tetralux (because the for loop scope closed and reopens)
15:59 fengb newOtherThing is a pointer
15:59 fengb You're getting &newOtherThing, which is a pointer to pointer, which is on the stack
15:59 donpdonp am I looking at the wrong thing then? i'm trying to see the address of the newly created OtherThing, not the address where the pointer is stored
16:00 Tetralux If the init fn gives back a ptr to OtherThing... which is a little weird... but if it does
16:00 fengb You should do it without the ampersand: warn("otherThing {*}\n", newOtherThing);
16:00 Tetralux You don't wanna do &
16:00 donpdonp its return Self{.params=..} so not a pointer
16:00 donpdonp i was following ArrayList() as a guide
16:01 Tetralux Yeah - not a pointer, so the thing you return goes on the stack.
16:01 Tetralux .. into the stack var newOtherThing.
16:01 donpdonp if I dont do &otherThing then I cant "{*}" and if I warn("{}", otherThing) I dont get an address
16:02 Tetralux That's correct.
16:02 Tetralux Why do you want its pointer?
16:02 donpdonp how do I verify that init() is returning new, unique otherThings
16:02 fengb What's the return type for OtherThing.init?
16:02 fengb It should be returning a pointer
16:02 Tetralux It will, because it's returning a value.
16:02 Tetralux Values are always unique.
16:02 Tetralux In this case, you make one, put it on the stack, and then throw it away at the end of the loop scope.
16:03 fengb Oh wait. We're getting pointers and structs mixed up
16:03 fengb ArrayList exists on the stack, but the internal data is a pointer
16:03 donpdonp init is the standard -> type pub fn init(allocator: *Allocator) Self { return Self{.params=...}}
16:03 fengb So if you're following ArrayList, it'll probably return the same pointer, but each body would be different (and overwritten so thus leaking memory)
16:03 Tetralux fengb: It's not an ArrayList IIUC.
16:04 Tetralux donpdonp: What you're seeing is the expected behavior of what you're asking it to do.
16:05 Tetralux The ptrs are the same because you're throwing away the thing you init every time you go through the loop, because it's on the stack.
16:05 Tetralux So it just "falls off" the end.
16:05 donpdonp hmms
16:05 Tetralux And then
16:06 Tetralux The next iteration through the loop
16:06 Tetralux It put the variable in the exact same place as it did the last time.
16:06 Tetralux So its address is the same.
16:06 donpdonp the local var is getting over written, sure, but the warn should show unique addresses for each value from init()? then again if im printing the local var then I can see how its the same
16:06 fengb init returns data into the same address
16:06 donpdonp which comes back around to how do i get the address of the struct returned from init() to see that each is different
16:06 fengb So the data is probably different, but the location is always the same
16:07 fengb The address of the struct returned is always where the variable is declared
16:07 Tetralux I don't think the ptr is the thing you actually want to check.
16:07 Tetralux If you get rid of the loop
16:07 Tetralux And just init several things into several vars
16:07 Tetralux `var x1 = init(); var x2 = init();` etc.
16:07 Tetralux Then the ptrs will be different.
16:08 donpdonp yup, i made a test {} block to try it and it works
16:08 Tetralux The vars will be next to each other in memory.
16:08 Tetralux But that tells you nothing about the content of the struct.
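The behaviour discussed up to this point, a loop-local variable landing in the same stack slot on every iteration while separately declared locals get distinct slots, can be sketched as follows. This is a minimal illustration, assuming a recent Zig release (0.11 or later), `std.debug.print` in place of the chat's `warn`, and a hypothetical `OtherThing` with a single `id` field. The addresses printed, and whether the loop really reuses one slot, depend on the compiler and build mode.

```zig
const std = @import("std");

// Hypothetical stand-in for the OtherThing from the chat.
const OtherThing = struct {
    id: usize,

    // Returns the struct by value; nothing is heap-allocated here.
    pub fn init(id: usize) OtherThing {
        return OtherThing{ .id = id };
    }
};

pub fn main() void {
    const ids = [_]usize{ 1, 2, 3 };

    // `thing` is a local of the loop body. Each iteration typically reuses
    // the same stack slot, so the address tends to repeat while the id changes.
    for (ids) |id| {
        const thing = OtherThing.init(id);
        std.debug.print("loop: id={d} at {*}\n", .{ thing.id, &thing });
    }

    // Two locals that are alive at the same time occupy different slots,
    // so their addresses differ.
    const x1 = OtherThing.init(10);
    const x2 = OtherThing.init(20);
    std.debug.print("x1:   id={d} at {*}\n", .{ x1.id, &x1 });
    std.debug.print("x2:   id={d} at {*}\n", .{ x2.id, &x2 });
}
```

As Tetralux points out, the address only identifies where the variable lives; the `id` field is what shows that each call to `init` produced a fresh value.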
16:08 donpdonp i just want to see that they're new and on the heap
16:08 Tetralux They aren't on the heap.
16:08 Tetralux They're on the stack.
16:09 Tetralux If you want them on the heap
16:09 fengb They're only on the heap if there's an explicit allocator.init() or allocator.create()
16:09 fengb (Assuming the allocator is heap based too)
16:09 Tetralux do `var newThing = allocator.create(OtherThing); newThing.* = OtherThing.init(...);`
16:10 donpdonp hmms
16:10 Tetralux (.create returns undefined memory, so you want to be sure to do the `newThing.* = ...` immediately afterwards.)
16:10 donpdonp by passing the allocator to init  I assumed it was doing that
16:10 Tetralux No.
16:10 Tetralux That's for when you want the struct itself to be able to allocate stuff on the heap.
16:11 donpdonp <lightbulb on>
16:11 Tetralux ArrayList needs that because when you add stuff to it, it may not have enough space, and so needs to ask for more... from the allocator.
16:11 fengb Yeah, following the ArrayList, the actual struct is on the stack, but the contents you append goes on the heap
16:11 Tetralux The lesson is just that there's nothing magical going on when you pass an allocator to anything ;)
16:12 Tetralux It's literally the same as if you pass any other kind of value.
16:12 Tetralux It's just that it's a thing that gives you an interface to asking for more memory from somewhere.
16:12 Tetralux :D
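The point that passing an allocator to `init` does not put the struct itself on the heap can be sketched like this. It is a minimal illustration, assuming a recent Zig release (0.11 or later) where allocators are passed as `std.mem.Allocator` values and `create` is called with `try`; the `OtherThing` struct and its single `allocator` field are hypothetical.

```zig
const std = @import("std");

// Hypothetical struct that keeps the allocator for later use.
const OtherThing = struct {
    allocator: std.mem.Allocator,

    // Storing the allocator allocates nothing; the struct is returned by value.
    pub fn init(allocator: std.mem.Allocator) OtherThing {
        return OtherThing{ .allocator = allocator };
    }
};

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // Stack: init writes a value into this local variable.
    const on_stack = OtherThing.init(allocator);

    // Heap: create hands back a pointer to undefined memory,
    // so assign through the pointer immediately.
    const on_heap = try allocator.create(OtherThing);
    defer allocator.destroy(on_heap);
    on_heap.* = OtherThing.init(allocator);

    std.debug.print("stack: {*}\nheap:  {*}\n", .{ &on_stack, on_heap });
}
```

The same `init` serves both cases; only the caller decides where the result is stored.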
16:12 donpdonp nod. the params say nothing about how the struct itself is created
16:13 Tetralux No, but that's because _values_ are always on the stack.
16:13 donpdonp yet for init to return a Self, that says hey a Self is being created.
16:13 donpdonp ah.
16:13 Tetralux UNLESS
16:13 donpdonp recoils in horror
16:13 Tetralux You use an allocator.
16:13 Tetralux :p
16:13 donpdonp :p
16:13 Tetralux Note though
16:14 Tetralux That it's possible to have an allocator that allocates from the stack.
16:14 Tetralux (See mem.FixedBufferAllocator)
16:14 Tetralux You can do `var buf: [1024]u8 = undefined; var allocator = FixedBufferAllocator.init(buf); var thing = allocator.create(OtherThing); thing.* = OtherThing.init(...);`
16:14 Tetralux The OtherThing is on the stack now.
16:15 Tetralux But that's okay
16:15 donpdonp oh ok i only use off-the-shelf allocators :)
16:15 Tetralux xD
16:16 Tetralux The above is faster, but in my example, you can only ask for 1024 bytes of space.
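A sketch of the stack-backed allocator idea, assuming a recent Zig release (0.11 or later), where the type lives at `std.heap.FixedBufferAllocator` and the allocator interface is obtained through its `allocator()` method. The `OtherThing` struct and its `id` field are hypothetical. The example checks that the created object really sits inside the 1024-byte buffer.

```zig
const std = @import("std");

// Hypothetical stand-in for the OtherThing from the chat.
const OtherThing = struct {
    id: u32,

    pub fn init(id: u32) OtherThing {
        return OtherThing{ .id = id };
    }
};

// Returns true when `ptr` points inside `buf`.
fn livesIn(buf: []const u8, ptr: *const OtherThing) bool {
    const start = @intFromPtr(buf.ptr);
    const addr = @intFromPtr(ptr);
    return addr >= start and addr + @sizeOf(OtherThing) <= start + buf.len;
}

pub fn main() !void {
    // The backing memory is an ordinary local array, so it is on the stack.
    var buf: [1024]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const allocator = fba.allocator();

    const thing = try allocator.create(OtherThing);
    thing.* = OtherThing.init(7);

    std.debug.print("thing at {*}, inside buf: {}\n", .{ thing, livesIn(&buf, thing) });
}
```

Once the buffer is exhausted, `create` returns `error.OutOfMemory`, which is the 1024-byte limit mentioned above.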
16:16 donpdonp var oThing = allocator.create(otherthing); oThing.init('name'); in that case there is no reason for init to return a Self, it should be pub fn init(self: Self, param1: string) void
16:17 Tetralux You _could_ do that, but I don't know why you would.
16:17 Tetralux Oh wait
16:17 Tetralux I read that wrong.
16:17 Tetralux That would work yeah, but that's the convention.
16:18 Tetralux (You'd want `self: *Self` though)
16:18 donpdonp so I just picked the wrong std class in zig to use as a model!
16:18 Tetralux XDD
16:18 donpdonp sighs
16:18 Tetralux The convention for that is this:
16:18 fengb Yeah it’s a bit confusing that ArrayList is partially in the stack
16:18 Tetralux `var t = allocator.create(Thing); t.* = OtherThing.init()`
16:19 Tetralux That way
16:19 andrewrk what's this about array list being on the stack?
16:19 donpdonp many classes appear to have fn init() Self { .. } I picked Base64Encoder by random and it follows the same pattern.
16:19 fengb The struct that holds the structure is on the stack, but the data it holds is in the heap
16:19 andrewrk ArrayList does not have any preallocated items
16:19 Tetralux donpdonp: What other programming language are you used to?
16:20 Tetralux Like - the one before Zig?
16:20 donpdonp a few, Go would be closest.
16:21 Tetralux In Go, did you use the pattern of .Init() where it'd return `&Self{}` ?
16:21 donpdonp looks up some old go
16:22 donpdonp func ListFactory() List { thing := List{}; return thing ; }
16:22 fengb I tended to use pointers everywhere in Go because they behaved closer to what I had wanted :/
16:23 Tetralux donpdonp: Okay yeah - that factory is exactly what `return Self {}` does.
16:23 Tetralux (in Zig.)
16:23 Tetralux It's been a while since I used Go, but I'm pretty sure that goes on the stack.
16:24 Tetralux In Go, you can do something like `return &List{}` in the factory, which heap allocates wherever it feels like.
16:24 Tetralux In Zig, you do allocator.create() instead.
16:26 Tetralux The &T{} is fairly common in Go because of the GC, for initting a struct
16:26 andrewrk pointers are liabilities in every language, the problem just manifests differently
16:26 andrewrk in garbage collected languages, pointers put more pressure on the GC, increasing the duration of the stop-the-world freeze during collection
16:26 andrewrk in rust, pointers cause difficulties with the borrow checker, making it difficult to get past the compiler
16:27 andrewrk in zig, pointers require careful management to avoid serious bugs, and obtaining pointers to heap memory is non-trivial
16:28 companion_cube I'd say rust makes a big difference between references (safe, scoped pointers) and actual raw pointers (and smart pointers)
16:28 companion_cube (the latter being normal values with move semantics)
16:28 donpdonp given fn doit() void { var thing = OtherThing.init(); } and assuming OtherThing uses  pub init() Self { return Self{..}; } , then the struct created by Init will never outlive doit() ?
16:28 donpdonp the new OtherThing lives in the stack for doit() ?
16:29 Tetralux Yes.
16:29 andrewrk donpdonp, the variable `thing` dies at the end of the block scope that it is in
16:29 andrewrk as soon as it hits the }
16:29 Tetralux That's actually a good way of thinking about it.
16:30 andrewrk that's true of all variables
16:30 Tetralux > Stack vars die at the end of the scope, unless you heap allocated it.
16:30 andrewrk you will note that global variables never hit a }, and therefore never die
16:30 donpdonp the lifetime of thing is clear, its just the lifetime of the struct that I am fuzzy on.
16:30 andrewrk Tetralux, there's no "unless". local variables die at the end of the scope
16:30 andrewrk donpdonp, there's no lifetime of types. only variables
16:31 Tetralux andrewrk: True; 'local' is the important thing there.
16:31 donpdonp hmms
16:31 fengb structs aren't objects. They don't live around. The memory will get reclaimed as soon as it disappears from the scope
16:31 Tetralux init doesn't allocate on the heap, so you put an instance of the struct into the var you declared.
16:32 donpdonp yes that I get but the Struct{} was created inside init - im surprised it even has access to the stack for doit()
16:32 Tetralux It just returns a Self, a value, and puts it in the var.
16:32 Tetralux It returns by value.
16:32 Tetralux So doit gets a value.
16:32 andrewrk donpdonp, in your example, the init() function is guaranteed to directly initialize `thing`
16:32 andrewrk that's the "result location" concept
16:33 Tetralux It's exactly like how in Go, when you `return thing;` you return the value you created in ListFactory.
16:33 donpdonp when a struct is returned by value, is the entire struct copied onto the stack ?
16:33 Tetralux Yes.
16:33 Tetralux BUT
16:34 donpdonp oh no, all caps :)
16:34 Tetralux Zig has "result-location semantics"
16:34 Tetralux As andrewrk mentioned
16:34 Tetralux This means that the copy is skipped in this case.
16:34 Tetralux It just writes directly into `thing`
16:34 Tetralux .. rather than returning and copying the value across.
16:34 Tetralux In C, it'd do the copy.
16:34 Tetralux I believe it would in Go too.
16:35 donpdonp i see.
16:35 andrewrk I don't know what C defines but clang has result locations for this case as well
16:35 fengb Clang would optimize it out too
16:35 Tetralux Oh okay - great! :D
16:35 Tetralux Didn't know that.
16:35 fengb And if it didn't, LLVM would as well >_>
16:35 Tetralux Even without optimizations?
16:35 andrewrk yeah. the tricky part is getting it to work with comptime...
16:35 fengb Optimizers all the way down
16:35 andrewrk Tetralux, yes without optimizations. I researched this before embarking on result location semantics
16:36 Tetralux Huh. Neat.
16:36 donpdonp so var thing creates storage for the entire struct in the stack of doit() and init writes into that.
16:36 andrewrk donpdonp, precisely
16:36 Tetralux It's as if you did this:
16:36 Tetralux var thing: OtherThing = undefined; \
16:36 Tetralux OtherThing.init(&thing);
16:36 Tetralux (in some sense.)
16:37 donpdonp yes i was just going to say that :)
16:37 Tetralux ;D
16:38 donpdonp init() itself is a function call, with its own stack, and before this callsite semantic, it would have allocated an OtherThing in init's stack, which would immediately disappear because init ends after return Self{}
16:39 Tetralux Yes, BUT, because you're returning it, it'd copy it into the `thing`.
16:39 fengb It wouldn't disappear. Previously it creates a copy upon return, and copy it into the result location
16:40 Tetralux fengb: *finger guns*
16:40 donpdonp ah, the Self{..} in init would be in the init callstack memory, then copied into the spot in the callstack for the return value.
16:41 fengb yep
16:41 donpdonp but now its copied into the address of what its being assign to.
16:42 Tetralux Indeed.
16:42 Tetralux Except "copy" = "write"
16:42 Tetralux So there's no copy.
16:43 donpdonp well that helps a lot thx.
16:44 Tetralux o7
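The "as if you passed a pointer to init" picture from this exchange can be written out by hand. This is only an illustration of the two call shapes, assuming a recent Zig release (0.11 or later); `initInPlace` is a hypothetical helper, not something from the standard library or the chat, and the example does not prove what the compiler does with the by-value return.

```zig
const std = @import("std");

// Hypothetical stand-in for the OtherThing from the chat.
const OtherThing = struct {
    id: u32,

    // Conventional form: return the struct by value.
    pub fn init(id: u32) OtherThing {
        return OtherThing{ .id = id };
    }

    // Hypothetical out-pointer form: the caller supplies the storage.
    pub fn initInPlace(self: *OtherThing, id: u32) void {
        self.* = OtherThing{ .id = id };
    }
};

pub fn main() void {
    // By value: `a` is the storage that receives the result of init.
    const a = OtherThing.init(1);

    // By pointer: declare the storage first, then let the function fill it.
    var b: OtherThing = undefined;
    OtherThing.initInPlace(&b, 2);

    std.debug.print("a.id={d} at {*}\nb.id={d} at {*}\n", .{ a.id, &a, b.id, &b });
}
```

Either way there is only one address to look at afterwards: that of the variable declared in the caller.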
16:45 donpdonp i'm not saying I would write it but ive toyed with the idea of a zig basics document and came up with a title: "Zig Pointers" get it?
16:46 donpdonp to answer my own other question about the "address" of the return value of init, there is no other address other than &thing because the struct is copied into the ram that was created on the stack for thing (being a var thing: OtherThing and not a pointer)
16:47 donpdonp it really does work more like a param to init() than thing = init()
16:49 andrewrk yes
16:50 Tetralux Indeed - though that part about it really working like a param to init; I should note that's _because_ init returns a _value_. ;)
16:50 Tetralux Any fn that returns a value will work like that.
16:51 donpdonp aaand my problem is fixed now by using allocator.create() \o/ \o/ im getting nice unique structs at OtherThing()@21f6590 OtherThing()@2342100. Otherthing is actually a Mastodon Toot for https://donpdonp.github.io/zootdeck/
16:51 Tetralux Curious :3
16:55 donpdonp https://github.com/donpdonp/zootdeck/commit/3c560e93b59ba012172f93fa6146807bb39d9c92#diff-6b84dc53ad829a7130c5542e878bd8eeL28
16:55 donpdonp that shows the two kinds of inits. if I var thing = allocator.create(OtherThing) and thing = OtherThing.init() could I keep using fn init() Self{ return Self{} } and it would be copied into the heap?
16:56 Tetralux YES
16:56 donpdonp id argue that that shortcut is actually not a great idea because here I am reassigning it and yet its dependent on the previous value
16:57 Tetralux The thing you get back from .create is undefined.
16:57 andrewrk donpdonp, for the heap one it should be `const thing_ptr` not `var thing`
16:57 Tetralux So you HAVE to set it.
16:57 donpdonp oh. interesting.
16:57 Tetralux And yes, as andrewrk said, you get a *OtherThing back from create.
16:57 andrewrk and then to initialize it you would do thing_ptr.* = init()
16:57 Tetralux ^^
16:57 Tetralux andrewrk: beat me to it :p
16:57 donpdonp okay so THATS the pattern Im missing.
16:57 Tetralux HUZZAH
16:58 Tetralux BY JOVE I THINK THEY HAVE IT.
16:58 andrewrk donpdonp, does it make sense why it would be const?
16:58 donpdonp well if thing_ptr were only assigned to once, that would keep later code from changing it?
16:58 andrewrk correct
16:58 andrewrk just one less accident to worry about
16:59 donpdonp 'The thing you get back from .create is undefined.' thats still puzzling.
17:00 donpdonp its just the amount of bytes for the struct, but not initialized (0xaaaaaa)
17:00 andrewrk donpdonp, consider this: var thing: Thing = undefined; const thing_ptr = &thing; thing_ptr.* = init();
17:02 andrewrk using an allocator is the same thing; you're just skipping to the second line, and the memory is on the heap
17:04 donpdonp nods
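andrewrk's comparison, plus the earlier question about a struct outliving the function that made it, can be sketched together. This assumes a recent Zig release (0.11 or later); `Thing`, its `id` field and the `makeThing` helper are hypothetical names chosen for the illustration.

```zig
const std = @import("std");

// Hypothetical stand-in for the Thing in andrewrk's example.
const Thing = struct {
    id: u32,

    pub fn init(id: u32) Thing {
        return Thing{ .id = id };
    }
};

// Hypothetical helper: the heap memory outlives this function,
// so returning the pointer is fine. The caller must destroy it.
fn makeThing(allocator: std.mem.Allocator, id: u32) !*Thing {
    const thing_ptr = try allocator.create(Thing); // undefined memory
    thing_ptr.* = Thing.init(id); // now initialized
    return thing_ptr;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // Stack version: storage, pointer to it, then initialize through the pointer.
    var thing: Thing = undefined;
    const stack_ptr = &thing;
    stack_ptr.* = Thing.init(1);

    // Heap version: create replaces the first two steps.
    const heap_ptr = try makeThing(allocator, 2);
    defer allocator.destroy(heap_ptr);

    std.debug.print("stack id={d} at {*}\nheap  id={d} at {*}\n", .{ stack_ptr.id, stack_ptr, heap_ptr.id, heap_ptr });
}
```

Returning `&thing` from a function would hand out a pointer to a local that dies at the closing brace, which is why the helper goes through the allocator instead.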
17:07 fengb andrewrk: is there a way to generate a lookup table at comptime? Like inline loop inside a switch...
17:08 andrewrk fengb, you might get some ideas from this: https://andrewkelley.me/post/string-matching-comptime-perfect-hashing-zig.html
17:10 fengb My usecase is each field has a unique int and I need to jump back into the field
17:11 donpdonp okay now im back to the canonical fn init() Self  https://github.com/donpdonp/zootdeck/commit/246dfd015d5fd5aa489afa24f66ce6f964fadc9a
17:14 fengb Hmm... would it be expensive to loop through per match?
17:16 fengb I guess I should test it out
17:16 andrewrk fengb, you should be able to take advantage of comptime fn call caching
17:16 Tetralux donpdonp: Note that it's undefined in the sense that it's not initialized to anything in particular; it's _just_ the correct number of bytes cast to an OtherThing, and then you get a pointer to it. So you cannot, and should not, rely on it to be in any particular state until you set it.
17:16 donpdonp nods
17:18 Tetralux raises both arms above my head
17:18 Tetralux YAY
17:18 Tetralux Making progress :p
17:20 donpdonp well thats enough Zig for today. thx again Tetralux, andrewrk for helping me level-up in zig.
  • Other interesting bits

https://theartofmachinery.com/2019/08/12/c_const_isnt_for_performance.html

zigpointers.txt · Last modified: 2024/01/31 04:08 by 127.0.0.1