Discussion:
[gambit-list] gsc -report for finding non-reachable defines
Sven Hartrumpf
2017-08-24 10:12:14 UTC
Permalink
Hi.

Thanks to a suggestion by Marc on the Chicken mailing list,
I am experimenting with using the output of 'gsc -report'
to remove non-reachable defines.
I assume each defined identifier reported with D and one of A or R or C
as reachable, all others as non-reachable.
The found set of non-reachable identifiers is incomplete, but
scms r-matcher.scm
cond-expand-command: ...
alexpand-command: ...
reachable-command: (gsc -:h3700000 -report r-matcher.alexp > r-matcher.alexp.gsc)
reduce-command: (cfa-reduce -k r-matcher.alexp r-matcher.reach > r-matcher.rea1)
defines: #removed: 718 (24.73%) #kept: 2185
iter 1 size: 3822249 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea1 > r-matcher.rea1.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea1 r-matcher.reach > r-matcher.rea2)
defines: #removed: 252 (11.53%) #kept: 1933
iter 2 size: 3519136 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea2 > r-matcher.rea2.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea2 r-matcher.reach > r-matcher.rea3)
defines: #removed: 150 (7.76%) #kept: 1783
iter 3 size: 3277324 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea3 > r-matcher.rea3.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea3 r-matcher.reach > r-matcher.rea4)
defines: #removed: 93 (5.22%) #kept: 1690
iter 4 size: 3154753 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea4 > r-matcher.rea4.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea4 r-matcher.reach > r-matcher.rea5)
defines: #removed: 42 (2.49%) #kept: 1648
iter 5 size: 3000686 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea5 > r-matcher.rea5.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea5 r-matcher.reach > r-matcher.rea6)
defines: #removed: 60 (3.64%) #kept: 1588
iter 6 size: 2924187 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea6 > r-matcher.rea6.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea6 r-matcher.reach > r-matcher.rea7)
defines: #removed: 26 (1.64%) #kept: 1562
iter 7 size: 2600067 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea7 > r-matcher.rea7.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea7 r-matcher.reach > r-matcher.rea8)
defines: #removed: 16 (1.02%) #kept: 1546
iter 8 size: 2586789 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea8 > r-matcher.rea8.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea8 r-matcher.reach > r-matcher.rea9)
defines: #removed: 12 (0.78%) #kept: 1534
iter 9 size: 2584936 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea9 > r-matcher.rea9.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea9 r-matcher.reach > r-matcher.rea10)
defines: #removed: 6 (0.39%) #kept: 1528
iter 10 size: 2580332 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea10 > r-matcher.rea10.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea10 r-matcher.reach > r-matcher.rea11)
defines: #removed: 1 (0.07%) #kept: 1527
iter 11 size: 2579734 bytes
reachable-command: (gsc -:h3700000 -report r-matcher.rea11 > r-matcher.rea11.gsc)
reduce-command: (cfa-reduce -k r-matcher.rea11 r-matcher.reach > r-matcher.rea12)
defines: #removed: 0 (0.00%) #kept: 1527
iter 12 size: 2579734 bytes

The loop is terminated if no change of the source file is achieved (fixed point).
Here are some questions:

1. There are still non-reachable functions in the final source file, e.g.
if two functions call each other but are never called from a third function.

2. Can the analysis by 'gsc -report' be made more aggressive or can it be hinted
by providing some identifiers that are known to be non-reachable?

3. Is there a way to stop gsc when the report is ready? Without such a feature,
the above 12-step analysis takes 20 minutes because it also generates .o files in
each iteration.

Greetings
Sven
Marc Feeley
2017-08-24 12:00:12 UTC
Permalink
A better way to determine which definitions are live is to use the tree shaker. Use (declare (optimize-dead-definitions)) and “gsc -dg” to create a dependency graph file “foo.dg” and then look at all the names in the foo.dg file (which can also be visualized with the “dot” program). You probably also want to disable inlining with (declare (not inline)) if you want to disregard the fact that definitions can become dead if they are inlined at all the call sites.

For example:

bash$ gsc -c -dg foo.scm;echo not defined:;fgrep '" [label =' foo.dg | sed -e 's/^ //g' -e 's/" \[label =.*$/"/g' | sort | uniq;echo live including not defined:;fgrep ' -> ' foo.dg | sed -e 's/^ //g' -e 's/;$//g' -e $'s/ -> /\\\n/g' | sort | uniq
not defined:
"*"
"+"
"println"
live including not defined:
"*"
"+"
"c"
"d"
"e"
"main"
"println"
"| foo|"
bash$ cat foo.scm
(declare
(optimize-dead-definitions)
(block)
(not inline)
)

(define (a x) (if (< x 100) (a (b x)) x))

(define (b x) (* x 2))

(define (c x) (d (e x)))

(define (d x) (* x x))

(define (e x) (+ x 1))

(define (main) (println (c 10)))

(main)
bash$ cat foo.dg
digraph "/Users/feeley/foo" {
graph [splines = true overlap = false rankdir = "TD"];
node [fontname = "Courier New" shape = "none"];
"println" [label = <<table border="0" cellborder="0" cellspacing="0" cellpadding="0" bgcolor="black"><tr><td><font face="Courier Bold" color="white"> println </font></td></tr></table>>];
"+" [label = <<table border="0" cellborder="0" cellspacing="0" cellpadding="0" bgcolor="black"><tr><td><font face="Courier Bold" color="white"> + </font></td></tr></table>>];
"*" [label = <<table border="0" cellborder="0" cellspacing="0" cellpadding="0" bgcolor="black"><tr><td><font face="Courier Bold" color="white"> * </font></td></tr></table>>];
"main" -> "println";
"main" -> "c";
"e" -> "+";
"d" -> "*";
"c" -> "e";
"c" -> "d";
"| foo|" -> "main";
}

The output of “dot -O -Tpdf foo.dg” is:
Marc Feeley
2017-08-24 13:54:33 UTC
Permalink
By the way, without the optimize-dead-definitions declaration you would get this dependency graph:
Sven Hartrumpf
2017-08-24 20:20:48 UTC
Permalink
Post by Marc Feeley
A better way to determine which definitions are live is to use the tree
shaker. Use (declare (optimize-dead-definitions)) and “gsc -dg” to create a
dependency graph file “foo.dg” and then look at all the names in the foo.dg
file (which can also be visualized with the “dot” program). You probably
also want to disable inlining with (declare (not inline)) if you want to
disregard the fact that definitions can become dead if they are inlined at
all the call sites.
Yas. That works better than -report. Which option should I use to
_avoid_ that defined constant values (like (define *opt #t)) that are never
changed are optimized away and hence will not appear in .dg?
At the moment, I use the following options:
(declare (r5rs-scheme) (block) (not constant-fold) (not inline) (optimize-dead-definitions) (standard-bindings) (extended-bindings))

Greetings
Sven
Marc Feeley
2017-08-24 22:16:33 UTC
Permalink
Just remove the block declaration. Not sure why I put it there in the first place…

Marc
Post by Sven Hartrumpf
Post by Marc Feeley
A better way to determine which definitions are live is to use the tree
shaker. Use (declare (optimize-dead-definitions)) and “gsc -dg” to create a
dependency graph file “foo.dg” and then look at all the names in the foo.dg
file (which can also be visualized with the “dot” program). You probably
also want to disable inlining with (declare (not inline)) if you want to
disregard the fact that definitions can become dead if they are inlined at
all the call sites.
Yas. That works better than -report. Which option should I use to
_avoid_ that defined constant values (like (define *opt #t)) that are never
changed are optimized away and hence will not appear in .dg?
(declare (r5rs-scheme) (block) (not constant-fold) (not inline) (optimize-dead-definitions) (standard-bindings) (extended-bindings))
Greetings
Sven
_______________________________________________
Gambit-list mailing list
https://webmail.iro.umontreal.ca/mailman/listinfo/gambit-list
Loading...