[gambit-list] Dumping the heap

Discussion:

Dimitris Vyzovitis

2018-01-29 18:00:46 UTC

Is there a reasonable way to dump the live heap?
It could help a lot with debugging memory leaks.

-- vyzo

Marc Feeley

2018-01-31 13:08:34 UTC

Yes, I remember helping Guillaume Cartier develop a procedure to do this. Perhaps he can help you with that.

Marc

Guillaume Cartier

2018-01-31 14:03:18 UTC

Yes, Marc wrote some very nice code to explore the Gambit heap
programmatically that I use in my projects. I'll refresh myself on the code
and post it in a Gambit-friendly format shortly.

Guillaume

_______________________________________________
Gambit-list mailing list
https://webmail.iro.umontreal.ca/mailman/listinfo/gambit-list

Dimitris Vyzovitis

2018-01-31 14:23:35 UTC

awesome, thank you!

-- vyzo

Marc Feeley

2018-02-01 13:16:28 UTC

thanks Guillaume!
this is a great start for me -- i am helping fare debug a memory leak, and it's really hard to identify
without dumping the heap to see what kind of object is leaking.

For your information I discovered a few memory leaks with the networking functions. They were due to “sockaddr” structures being converted to “still” Scheme objects with a reference count = 1, but the reference count was never decremented (with ___release_scmobj). This has been fixed in the recent UDP commit.

I believe that this kind of situation might exist in other places in the runtime system. So, to help debug this, it might be useful to have a function that returns a list of all the “still” Scheme objects that have a reference count != 0. This should be easy to write: the GC maintains a list of the still objects in the C variable “still_objs”.

So the idea would be to check at the end of a program if there are any still objects with non-zero ref counts.
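
As a rough illustration of that check, here is a minimal, self-contained C sketch. It does not use Gambit's real data structures: the `still_obj` struct, its fields, and the `collect_referenced` function are hypothetical stand-ins for whatever the runtime actually keeps on its still_objs list. It only shows the shape of the walk: follow the list and keep every entry whose reference count is non-zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a still object; NOT Gambit's real layout. */
typedef struct still_obj {
    struct still_obj *next;  /* next entry in the still-object list */
    long refcount;           /* references held from C code */
    const char *label;       /* illustrative tag, e.g. "sockaddr" */
} still_obj;

/*
 * Walk the list starting at `head` and store pointers to the objects whose
 * reference count is non-zero in `out` (at most `cap` entries).
 * Returns the total number of such objects, which may exceed `cap`.
 */
size_t collect_referenced(const still_obj *head,
                          const still_obj **out, size_t cap)
{
    size_t found = 0;
    assert(out != NULL || cap == 0);
    for (const still_obj *p = head; p != NULL; p = p->next) {
        if (p->refcount != 0) {
            if (found < cap)
                out[found] = p;  /* record only while there is room */
            found++;
        }
    }
    return found;
}
```

Run at program exit, a non-zero return value from a function like this would point at objects that were retained from C and never released.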

Marc

Dimitris Vyzovitis

2018-02-02 11:23:33 UTC

Relevant code for accounting still objects:
https://gist.github.com/vyzo/ab4219382c0870779991d4c701921d2c

The limitation is that the still_objs_ is per processor, and not vm-wide.
Does that mean we would have to crawl all processors in SMP?

-- vyzo

Marc Feeley

2018-02-02 12:41:14 UTC

Yes, each processor has its own still_objs list, and to account for all still objects you must iterate over the processors. To avoid modification of the still_objs lists while doing this, the best approach is to use the barrier operation mechanism. That way all processors (but one) will be idle while iterating (or you could have all processors cooperate). This is done with the “on_all_processors” function. For an example, check out ___garbage_collect or ___fdset_resize in lib/setup.c.
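
The per-processor iteration can be pictured with the following self-contained C sketch. It is only a model: the `processor` and `still_obj` structs, `for_each_still_obj`, and `count_referenced` are hypothetical names, not Gambit's API, and the barrier itself is not modeled. In the real runtime a walk like this would presumably run while the other processors are held idle by the barrier mechanism described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT Gambit's real structures. */
typedef struct still_obj {
    struct still_obj *next;  /* next entry in this processor's list */
    long refcount;           /* references held from C code */
} still_obj;

typedef struct processor {
    still_obj *still_objs;   /* head of this processor's private list */
} processor;

typedef void (*still_visitor)(const still_obj *obj, void *ctx);

/* Visit every still object of every processor, one processor at a time.
   Assumes nothing else mutates the lists during the walk. */
void for_each_still_obj(const processor *procs, size_t nprocs,
                        still_visitor visit, void *ctx)
{
    assert(visit != NULL);
    for (size_t i = 0; i < nprocs; i++)
        for (const still_obj *p = procs[i].still_objs; p != NULL; p = p->next)
            visit(p, ctx);
}

/* Example visitor: count objects with a non-zero reference count.
   `ctx` must point to a size_t accumulator. */
void count_referenced(const still_obj *obj, void *ctx)
{
    if (obj->refcount != 0)
        ++*(size_t *)ctx;
}
```

Passing the visitor as a callback keeps the traversal separate from the accounting, so the same walk could build a list or print a report instead of counting.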

Marc

Dimitris Vyzovitis

2018-02-02 12:49:14 UTC

it would be nice to have a primitive to do this for Scheme procedures!
Something like (on-all-processors thunk) would be awesome.

-- vyzo
