zigpointers
zig 2019-08-20
15:57 donpdonp sorry if this is only partial information - ive got for(mySlice) |thing| { var newOtherThing = OtherThing.init(allocator); warn("otherThing {*}\n", &newOtherThing); }
15:58 donpdonp and I get 'OtherThing()@7ffd5d02efb0' 'OtherThing()@7ffd5d02efb0' 'OtherThing()@7ffd5d02efb0' all the same address
15:58 Tetralux The OtherThing is on the stack.
15:58 Tetralux So each iteration of the loop
15:59 Tetralux It's in the same place.
15:59 Tetralux (because the previous instances are being overwritten)
15:59 Tetralux (because the for loop scope closed and reopens)
15:59 fengb newOtherThing is a pointer
15:59 fengb You're getting &newOtherThing, which is a pointer to pointer, which is on the stack
15:59 donpdonp am I looking at the wrong thing then? i'm trying to see the address of the newly created OtherThing, not the address where the pointer is stored
16:00 Tetralux If the init fn gives back a ptr to OtherThing... which is a little weird... but if it does
16:00 fengb You should do it without the ampersand: warn("otherThing {*}\n", newOtherThing);
16:00 Tetralux You don't wanna do &
16:00 donpdonp its return Self{.params=..} so not a pointer
16:00 donpdonp i was following ArrayList() is a guide
16:01 Tetralux Yeah - not a pointer, so the thing you return goes on the stack.
16:01 Tetralux .. into the stack var newOtherThing.
16:01 donpdonp if I dont do &otherThing then I cant "{*}" and if I warn("{}", otherThing) I dont get an address
16:02 Tetralux That's correct.
16:02 Tetralux Why do you want it's pointer?
16:02 donpdonp how do I verify that init() is returning new, uniqe otherThings
16:02 fengb What's the return type for OtherThing.init?
16:02 fengb It should be returning a pointer
16:02 Tetralux It will, because it's returning a value.
16:02 Tetralux Values are always unique.
16:02 Tetralux In this case, you make one, put it on the stack, and then throw it away at the end of the loop scope.
16:03 fengb Oh wait. We're getting pointers and structs mixed up
16:03 fengb ArrayList exists on the stack, but the internal data is a pointer
16:03 donpdonp init is the standard -> type pub fn init(allocator: *Allocator) Self { return Self{.params=...}}
16:03 fengb So if you're following ArrayList, it'll probably return the same pointer, but each body would be different (and overwritten so thus leaking memory)
16:03 Tetralux fengb: It's not an ArrayList IIUC.
16:04 Tetralux donpdonp: What you're seeing is the expected behavior of what you're asking it to do.
16:05 Tetralux The ptrs are the same because you're throwing away the thing you init every time you go through the loop, because it's on the stack.
16:05 Tetralux So it just "falls off" the end.
16:05 donpdonp hmms
16:05 Tetralux And then
16:06 Tetralux The next iteration through the loop
16:06 Tetralux It put the variable in the exact same place as it did the last time.
16:06 Tetralux So it's address is the same.
16:06 donpdonp the local var is getting over written, sure, but the warn should show unique addresses for each value from init()? then again if im printing the local var then I can see how its the same
16:06 fengb init returns data into the same address
16:06 donpdonp which comes back around to how do i get the address of the struct returned from init() to see that each is different
16:06 fengb So the data is probably different, but the location is always the same
16:07 fengb The address of the struct returned is always where the variable is declared
16:07 Tetralux I don't think the ptr is the thing you actually want to check.
16:07 Tetralux If you get rid of the loop
16:07 Tetralux And just init several things into several vars
16:07 Tetralux `var x1 = init(); var x2 = init();` et
16:07 Tetralux Then the ptrs will different.
16:08 donpdonp yup, i made a test {} block to try it and it works
16:08 Tetralux The vars will be next to each other in memory.
16:08 Tetralux But that tells you nothing about the content of the struct.
16:08 donpdonp i just want to see that they're new and on the heap
16:08 Tetralux They aren't on the heap.
16:08 Tetralux They're on the stack.
16:09 Tetralux If you want them on the heap
16:09 fengb They're only on the heap if there's an explicit allocator.init() or allocator.create()
16:09 fengb (Assuming the allocator is heap based too)
16:09 Tetralux do `var newThing = allocator.create(OtherThing); newThing.* = OtherThing.init(...);`
16:10 donpdonp hmms
16:10 Tetralux (.create returns undefined memory, so you want to be sure to do the `newThing.* = ...` immediately afterwards.
16:10 donpdonp by passing the allocator to init I assumed it was doing that
16:10 Tetralux No.
16:10 Tetralux That's for when you want the struct itself to be able to allocate stuff on the heap.
16:11 donpdonp <lightbulb on>
16:11 Tetralux ArrayList needs that because when you add stuff to it, it may not have enough space, and so needs to ask for more... from the allocator.
16:11 fengb Yeah, following the ArrayList, the actual struct is on the stack, but the contents you append goes on the heap
16:11 Tetralux The lesson is just that there's nothing magical going on when you pass an allocator to anything ;)
16:12 Tetralux It's literally the same as if you pass any other kind of value.
16:12 Tetralux It's just that it's a thing that gives you an interface to asking for more memory from somwhere.
16:12 Tetralux :D
16:12 donpdonp nod. the params say nothing about how the struct itself is created
16:13 Tetralux No, but that's because _values_ are always on the stack.
16:13 donpdonp yet for init to return a Self, that says hey a Self is being created.
16:13 donpdonp ah.
16:13 Tetralux UNLESS
16:13 donpdonp recoils in horror
16:13 Tetralux You use an allocator.
16:13 Tetralux :p
16:13 donpdonp :p
16:13 Tetralux Note though
16:14 Tetralux That it's possible to have an allocator that allocates from the stack.
16:14 Tetralux (See mem.FixedBufferAllocator)
16:14 Tetralux You can do `var buf: [1024]u8 = undefined; var allocator = FixedBufferAllocator.init(buf); var thing = allocator.create(OtherThing); thing.* = OtherThing.init(...);
16:14 Tetralux The OtherThing is on the stack now.
16:15 Tetralux But that's okay
16:15 donpdonp oh ok i only use off-the-shelf allocators :)
16:15 Tetralux xD
16:16 Tetralux The above is faster, but in my example, you can only ask for 1024 bytes of space.
16:16 donpdonp var oThing = allocator.create(otherthing); oThing.init('name'); in that case there is no reason for init to return a Self, it should be pub fn init(self: Self, param1: string) void
16:17 Tetralux You _could_ do that, but I don't know why you would.
16:17 Tetralux Oh wait
16:17 Tetralux I read that wrong.
16:17 Tetralux That would work yeah, but that's the conventiono.
16:17 Tetralux convention*
16:18 Tetralux (You'd want `self: *Self` though)
16:18 donpdonp so I just picked the wrong std class in zig to use as a model!
16:18 Tetralux XDD
16:18 donpdonp sighs
16:18 Tetralux The convention for that is this@
16:18 Tetralux this:
16:18 fengb Yeah it’s a bit confusing that ArrayList is partially in the stack
16:18 Tetralux `var t = allocator.create(Thing); t.* = OtherThing.init()`
16:19 Tetralux That way
16:19 andrewrk what's this about array list being on the stack?
16:19 donpdonp many classes appear to have fn init() Self { .. } I picked Base64Encoder by random and it follows the same pattery.
16:19 donpdonp pattern.
16:19 fengb The struct that holds the structure is on the stack, but the data it holds is in the heap
16:19 andrewrk ArrayList does not have any preallocated items
16:19 Tetralux donpdonp: What other programming language are you used to?
16:20 Tetralux Like - the one before Zig?
16:20 donpdonp a few, Go would be closest.
16:21 Tetralux In Go, did you use the pattern of .Init() where it'd return `&Self{}` ?
16:21 donpdonp looks up some old go
16:22 donpdonp func ListFactory() List { thing := List{}; return thing ; }
16:22 fengb I tended to use pointers everywhere in Go because they behaved closer to what I had wanted :/
16:23 Tetralux donpdonp: Okay yeah - that factory is exactly what `return Self {}` does.
16:23 Tetralux (in Zig.)
16:23 Tetralux It's been a while since I used Go, put I'm pretty sure that goes on the stack.
16:24 Tetralux In Go, you can do something like `return &List{}` in the factory, which heap allocates wherever it feels like.
16:24 Tetralux In Zig, you do allocator.create() instead.
16:26 Tetralux The &T{} is fairly common in Go because of the GC, for initting a struct
16:26 andrewrk pointers are liabilities in every language, the problem just manifests differently
16:26 andrewrk in garbage collected languages, pointers put more pressure on the GC, increasing the duration of the stop-the-world freeze during collection
16:26 andrewrk in rust, pointers cause difficulties with the borrow checker, making it difficult to get past the compiler
16:27 andrewrk in zig, pointers require careful management to avoid serious bugs, and obtaining pointers to heap memory is non-trivial
16:28 companion_cube I'd say rust makes a big difference between references (safe, scoped pointers) and actual raw pointers (and smart pointers)
16:28 companion_cube (the latter being normal values with move semantics)
16:28 donpdonp given fn doit() void { var thing = OtherThing.init(); } and assuming OtherThing uses pub init() Self { return Self{..}; } , then the struct created by Init will never outlive doit() ?
16:28 donpdonp the new OtherThing lives in the stack for doit() ?
16:29 Tetralux Yes.
16:29 andrewrk donpdonp, the variable `thing` dies at the end of the block scope that it is in
16:29 andrewrk as soon as it hits the }
16:29 Tetralux That's actually a good way of thinking about it.
16:30 andrewrk that's true of all variables
16:30 Tetralux > Stack vars die at the end of the scope, unless you heap allocated it.
16:30 andrewrk you will note that global variables never hit a }, and therefore never die
16:30 donpdonp the lifetime of thing is clear, its just the lifetime of the struct that I am fuzzy on.
16:30 andrewrk Tetralux, there's no "unless". local variables die at the end of the scope
16:30 andrewrk donpdonp, there's no lifetime of types. only variables
16:31 Tetralux andrewrk: True; 'local' is the important thing there.
16:31 donpdonp hmms
16:31 fengb structs aren't objects. They don't live around. The memory will get reclaimed as soon as it disappears from the scope
16:31 Tetralux init doesn't allocate on the heap, so you put an instance of the struct into the var you declared.
16:32 donpdonp yes that I get but the Struct{} was created inside init - im surprised it even has access to the stack for doit()
16:32 Tetralux It just returns a Self, a value, and puts it in the var.
16:32 Tetralux It returns by value.
16:32 Tetralux So doit gets a value.
16:32 andrewrk donpdonp, in your example, the init() function is guaranteed to directly initialize `thing`
16:32 andrewrk that's the "result location" concept
16:33 Tetralux It's exactly like how in Go, when you `return thing;` you return the value you created in ListFactory.
16:33 donpdonp when a struct is returned by value, is the entire struct copied onto the stack ?
16:33 Tetralux Yes.
16:33 Tetralux BUT
16:34 donpdonp oh no, all caps :)
16:34 Tetralux Zig has "result-location semantics"
16:34 Tetralux As andrewrk mentioned
16:34 Tetralux This means that the copy is skipped in this case.
16:34 Tetralux It just writes directly into `thing`
16:34 Tetralux .. rather than returning and copying the value across.
16:34 Tetralux In C, it'd do the copy.
16:34 Tetralux I believe it would in Go too.
16:35 donpdonp i see.
16:35 andrewrk I don't know what C defines but clang has result locations for this case as well
16:35 fengb Clang would optimize it out too
16:35 Tetralux Oh okay - great! :D
16:35 Tetralux Didn't know that.
16:35 fengb And if it didn't, LLVM would as well >_>
16:35 Tetralux Even without optimizations?
16:35 andrewrk yeah. the tricky part is getting it to work with comptime...
16:35 fengb Optimizers all the way down
16:35 andrewrk Tetralux, yes without optimizations. I researched this before embarking on result location semantics
16:36 Tetralux Huh. Neat.
16:36 donpdonp so var thing creates storage for the entire struct inthe stack of doit() and init writes into that.
16:36 andrewrk donpdonp, precisely
16:36 Tetralux It's as if you did this:
16:36 Tetralux var thing: OtherThing = undefined; \
16:36 Tetralux OtherThing.init(&thing);
16:36 Tetralux (in some sense.)
16:37 donpdonp yes i was just going to say that :)
16:37 Tetralux ;D
16:38 donpdonp init() itself is a function call, with its own stack, and before this callsite semantic, it would have allocated an OtherThing in init's stack, which would immediately disappear because init ends after return Self{}
16:39 Tetralux Yes, BUT, because you're returning it, it'd copy it into the `thing`.
16:39 fengb It wouldn't disappear. Previously it creates a copy upon return, and copy it into the result location
16:40 Tetralux fengb: *finger guns*
16:40 donpdonp ah, the Self{..} in init would be in the init callstack memory, then copied into the spot inthe callstack for the return value.
16:41 fengb yep
16:41 donpdonp but now its copied into the address of what its being assign to.
16:42 Tetralux Indeed.
16:42 Tetralux Except "copy" = "write"
16:42 Tetralux So there's no copy.
16:43 donpdonp well that helps a lot thx.
16:44 Tetralux o7
16:45 donpdonp i not saying I would write it but ive toyed with the idea of a zig basics document and came up with a title: "Zig Pointers" get it?
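Editor's sketch: the comparison Tetralux draws above, between `var thing = OtherThing.init()` and writing through a pointer to `thing`, can be put side by side. This is only an illustration under assumptions: the `OtherThing` struct, its `id` field, and the `initInPlace` name are hypothetical, and the syntax targets a recent Zig (roughly 0.11 or later) rather than the 2019 compiler used in the log. It shows that both forms leave the struct in the caller's variable; it does not prove anything about whether a copy is elided.

```zig
const std = @import("std");

// Hypothetical stand-in for the OtherThing discussed in the log.
const OtherThing = struct {
    id: u32,

    // Conventional form: returns a value, which lands in the caller's variable.
    pub fn init(id: u32) OtherThing {
        return OtherThing{ .id = id };
    }

    // Out-pointer form (hypothetical name): the caller supplies the storage
    // and this function writes into it.
    pub fn initInPlace(self: *OtherThing, id: u32) void {
        self.* = OtherThing{ .id = id };
    }
};

pub fn main() void {
    // Storage for `a` belongs to main(); init's result ends up there.
    const a = OtherThing.init(1);

    // Same outcome, spelled out by hand: reserve the storage, then fill it.
    var b: OtherThing = undefined;
    b.initInPlace(2);

    // The only address either struct has is the address of the variable.
    std.debug.print("a at {*} id={d}\n", .{ &a, a.id });
    std.debug.print("b at {*} id={d}\n", .{ &b, b.id });
}
```

Either way there is no separate "address of the returned struct" to inspect, which is the conclusion donpdonp reaches next.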
16:46 donpdonp to answer my own other question about the "address" of the return value of init, there is no other address other than &thing because the struct is copied into the ram that was created on the stack for thing (being a var thing: OtherThing and not a pointer)
16:47 donpdonp it really does work more like a param to init() than thing = init()
16:49 andrewrk yes
16:50 Tetralux Indeed - though that part about it really working like a param to init; I should note that's _because_ init returns a _value_. ;)
16:50 Tetralux Any fn that returns a value will work like that.
16:51 donpdonp aaand my problem is fixed now by using allocator.create() \o/ \o/ im getting nice unique structs at OtherThing()@21f6590 OtherThing()@2342100. Otherthing is actually a Mastodon Toot for https://donpdonp.github.io/zootdeck/
16:51 Tetralux Curious :3
16:55 donpdonp https://github.com/donpdonp/zootdeck/commit/3c560e93b59ba012172f93fa6146807bb39d9c92#diff-6b84dc53ad829a7130c5542e878bd8eeL28
16:55 donpdonp that shows the two kinds of inits. if I var thing = allocator.create(OtherThing) and thing = OtherThing.init() could I keep using fn init() Self{ return Self{} } and it would be copied into the heap?
16:56 Tetralux YES
16:56 donpdonp id argue that that shortcut is actually not a great idea because here I am reassigning it and yet its dependent on the previous value
16:57 Tetralux The thing you get back from .create is undefined.
16:57 andrewrk donpdonp, for the heap one it should be `const thing_ptr` not `var thing`
16:57 Tetralux So you HAVE to set it.
16:57 donpdonp oh. interesting.
16:57 Tetralux And yes, as andrewrk said, you get a *OtherThing back from create.
16:57 andrewrk and then to initialize it you would do thing_ptr.* = init()
16:57 Tetralux ^^
16:57 Tetralux andrewrk: beat me to it :p
16:57 donpdonp okay so THATS the pattern Im missing.
16:57 Tetralux HUZZAH
16:58 Tetralux BY JOVE I THINK THEY HAVE IT.
16:58 andrewrk donpdonp, does it make sense why it would be const?
16:58 donpdonp well if thing_ptr were only assigned to once, that would keep later code from changing it?
16:58 andrewrk correct
16:58 andrewrk just one less accident to worry about
16:59 donpdonp 'The thing you get back from .create is undefined.' thats still puzzling.
17:00 donpdonp its just the amount of bytes for the struct, but not initialized (0xaaaaaa)
17:00 andrewrk donpdonp, consider this: var thing: Thing = undefined; const thing_ptr = &thing; thing_ptr.* = init();
17:02 andrewrk using an allocator is the same thing; you're just skipping to the second line, and the memory is on the heap
17:04 donpdonp nods
17:07 fengb andrewrk: is there a way to generate a lookup table at comptime? Like inline loop inside a switch...
17:08 andrewrk fengb, you might get some ideas from this: https://andrewkelley.me/post/string-matching-comptime-perfect-hashing-zig.html
17:10 fengb My usecase is each field has a unique int and I need to jump back into the field
17:11 donpdonp okay now im back to the canonical fn init() Self https://github.com/donpdonp/zootdeck/commit/246dfd015d5fd5aa489afa24f66ce6f964fadc9a
17:14 fengb Hmm... would it be expensive to loop through per match?
17:16 fengb I guess I should test it out
17:16 andrewrk fengb, you should be able to take advantage of comptime fn call caching
17:16 Tetralux donpdonp: Note that it's undefined in the sense that it's not initialized to anything in particular; it's _just_ the correct number of bytes casted to an OtherThing, and then you get a pointer to it. So you cannot, and should [not] rely on it to be in any particular state until you set it.
17:16 donpdonp nods
17:18 Tetralux raises both arms above my head
17:18 Tetralux YAY
17:18 Tetralux Making progress :p
17:20 donpdonp well thats enough Zig for today. thx again Tetralux, andrewrk for helping me level-up in zig.
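Editor's sketch: the pattern the log converges on (`const thing_ptr = allocator.create(...)` followed at once by `thing_ptr.* = init()`) next to the original loop that printed one repeated address. Everything named here is an assumption for illustration: `OtherThing`, its `id` field, and the `createThing` helper are hypothetical, and the code is written for a recent Zig (roughly 0.11 or later), where `create` returns an error union and allocators are passed as `std.mem.Allocator`, unlike the 2019 API quoted in the chat.

```zig
const std = @import("std");

// Hypothetical stand-in for the OtherThing discussed in the log.
const OtherThing = struct {
    id: usize,

    // Value-returning init, as in the log: no allocation happens here.
    pub fn init(id: usize) OtherThing {
        return OtherThing{ .id = id };
    }
};

// Hypothetical helper: ask the allocator for storage, then set it right away,
// because the memory handed back by create() is undefined until written.
pub fn createThing(allocator: std.mem.Allocator, id: usize) !*OtherThing {
    const thing_ptr = try allocator.create(OtherThing);
    thing_ptr.* = OtherThing.init(id);
    return thing_ptr;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    // Local variable in a loop body: each iteration's `local` dies at the
    // closing brace, so the same stack slot is typically reused.
    for (0..3) |i| {
        const local = OtherThing.init(i);
        std.debug.print("local {*} id={d}\n", .{ &local, local.id });
    }

    // Allocator-backed: all three are alive at once, so their addresses differ.
    var things: [3]*OtherThing = undefined;
    for (&things, 0..) |*slot, i| {
        slot.* = try createThing(allocator, i);
    }
    defer {
        for (things) |t| allocator.destroy(t);
    }
    for (things) |t| {
        std.debug.print("heap  {*} id={d}\n", .{ t, t.id });
    }

    // Same helper with a stack-backed allocator, as Tetralux mentions:
    // the struct then lives inside `buf`, not on the heap.
    var buf: [1024]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const in_buf = try createThing(fba.allocator(), 99);
    std.debug.print("buf   {*} id={d}\n", .{ in_buf, in_buf.id });
}
```

The helper takes the allocator as an ordinary parameter, which matches the point made in the log: passing an allocator does nothing by itself, and the struct only ends up wherever the code that calls `create` puts it.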
- Other interesting bits
https://theartofmachinery.com/2019/08/12/c_const_isnt_for_performance.html
zigpointers.txt · Last modified: 2024/01/31 04:08 by 127.0.0.1