Exploring a code base?

unofficial mirror of help-gnu-emacs@gnu.org

* Exploring a code base?
@ 2020-10-27 11:38 Yuri Khan
  2020-10-27 11:58 ` Christopher Dimech
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Yuri Khan @ 2020-10-27 11:38 UTC (permalink / raw)
  To: help-gnu-emacs

Hello list,

often, when working on a project, I encounter the following need:

* I want to refactor a data structure. It has a unique name, let’s say
Foo, so I ‘M-x grep RET git grep Foo RET’. This gives me a Grep buffer
where I can inspect each place where that type is used explicitly.

* I find that I have a function, let’s call it make_foo, that returns
an instance of that type. There is also a consume_foo that accepts an
argument of that type. I now want to inspect all usages of those
because my refactoring may affect them. So I put point on make_foo and
invoke ‘xref-find-references’.

* This leads to more functions that return Foo. I may want to inspect
each of those recursively.

Basically what I’m doing is traversal of a graph, where nodes are type
and function definitions, and edges are relationships such as
“function <calls> function”, “function <accepts> type”, “function
<returns> type”, “type <derives from> type”, “type <aggregates> type”,
etc.

When the change I’m doing is not very invasive, the affected subgraph
fits completely in my head. However, when it doesn’t, I find myself
having to record my traversal state. I create an Org buffer and
manually maintain a queue of nodes, marking those I haven’t yet
visited with TODO and those I have with DONE. Then I pick the first
TODO, grep or xref-find-references on it, add any relevant nodes to
the queue, make the necessary changes in the code, and mark the node
DONE. Repeat until no TODO.
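
The bookkeeping described above amounts to a worklist traversal of the graph. A minimal sketch in Python, assuming a hypothetical `find_related` callback that stands in for whatever grep or xref-find-references would report; the example graph is made up for illustration:

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def traverse(start: str, find_related: Callable[[str], Iterable[str]]) -> List[str]:
    """Visit every node reachable from `start` and return them in visit order."""
    state: Dict[str, str] = {start: "TODO"}  # node name -> "TODO" or "DONE"
    queue = deque([start])
    visited: List[str] = []
    while queue:
        node = queue.popleft()  # pick the first TODO
        for neighbour in find_related(node):
            if neighbour not in state:  # queue only nodes not seen before
                state[neighbour] = "TODO"
                queue.append(neighbour)
        state[node] = "DONE"  # the actual code changes would happen here
        visited.append(node)
    return visited


# Hypothetical relationships, standing in for real search results.
EXAMPLE_GRAPH = {
    "Foo": ["make_foo", "consume_foo"],
    "make_foo": ["build_report"],
    "consume_foo": ["build_report"],
    "build_report": [],
}

order = traverse("Foo", lambda name: EXAMPLE_GRAPH.get(name, []))
print(order)
```

The `state` dictionary plays the role of the Org buffer: a node is recorded once, so a function reached by two different paths is not queued twice.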

This is rather tedious. It feels like there should exist a better way,
maybe with a visualization of the graph structure.

What do you use to explore and map a code base and perform extensive
changes on it?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Exploring a code base?
  2020-10-27 11:38 Exploring a code base? Yuri Khan
@ 2020-10-27 11:58 ` Christopher Dimech
  2020-10-27 14:15 ` Stefan Monnier
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Christopher Dimech @ 2020-10-27 11:58 UTC (permalink / raw)
  To: Yuri Khan; +Cc: help-gnu-emacs

There is the Graph Description Language, Dot.

---------------------
Christopher Dimech
General Administrator - Naiad Informatics - GNU Project (Geocomputation)
- Geophysical Simulation
- Geological Subsurface Mapping
- Disaster Preparedness and Mitigation
- Natural Resource Exploration and Production
- Free Software Advocacy


* Re: Exploring a code base?
  2020-10-27 11:38 Exploring a code base? Yuri Khan
  2020-10-27 11:58 ` Christopher Dimech
@ 2020-10-27 14:15 ` Stefan Monnier
  2020-10-27 15:55 ` Drew Adams
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Stefan Monnier @ 2020-10-27 14:15 UTC (permalink / raw)
  To: help-gnu-emacs

> When the change I’m doing is not very invasive, the affected subgraph
> fits completely in my head. However, when it doesn’t, I find myself
> having to record my traversal state. I create an Org buffer and
> manually maintain a queue of nodes, marking those I haven’t yet
> visited with TODO and those I have with DONE. Then I pick the first
> TODO, grep or xref-find-references on it, add any relevant nodes to
> the queue, make the necessary changes in the code, and mark the node
> DONE. Repeat until no TODO.
>
> This is rather tedious. It feels like there should exist a better way,
> maybe with a visualization of the graph structure.

That's a good question.  I have resorted to writing down (in some
arbitrary text file) the things that are still pending, like you
describe (even in case it fits in your head, writing it down can be
needed if you're interrupted in the middle), and indeed it's
not satisfactory.

In other cases, I try to make use of the compiler's checks to keep track
of what I still need to do.  Typically by renaming the functions/types
I still need to investigate so the compiler points me to the places that
still need to be changed (sometimes, this renaming is not really wanted
and is hence temporary, so I use a "funny" name which I can easily fix
later with a simple search&replace).

Obviously, this only works if you can rely on something like a compiler
to catch the problems.
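
Where no compiler is available, the same trick can be approximated with a syntax-level scan. The sketch below uses Python's standard `ast` module to list the lines that still refer to a name whose definition has been renamed; the `__PENDING` suffix and the function names are made up for illustration, and the scan is purely syntactic, so it is not a substitute for a real type checker:

```python
import ast
from typing import List


def remaining_uses(source: str, old_name: str) -> List[int]:
    """Return the line numbers that still refer to `old_name`."""
    lines = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id == old_name:
            lines.add(node.lineno)  # plain reference: old_name(...)
        elif isinstance(node, ast.Attribute) and node.attr == old_name:
            lines.add(node.lineno)  # attribute reference: obj.old_name
    return sorted(lines)


# The definition was renamed to a "funny" name; one caller was not updated yet.
SOURCE = """\
def make_foo__PENDING():
    return 1

def report():
    return make_foo() + 1

def other():
    return 2
"""

print(remaining_uses(SOURCE, "make_foo"))
```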

I'd be interested to hear if someone has a good solution for that.
Maybe some way to "push" a particular file/location on a stack of
pending issues (along with some brief description, maybe) and then some
way to display it in a reasonable way...

        Stefan

* RE: Exploring a code base?
  2020-10-27 11:38 Exploring a code base? Yuri Khan
  2020-10-27 11:58 ` Christopher Dimech
  2020-10-27 14:15 ` Stefan Monnier
@ 2020-10-27 15:55 ` Drew Adams
  2020-10-27 20:56 ` Dmitry Gutov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Drew Adams @ 2020-10-27 15:55 UTC (permalink / raw)
  To: Yuri Khan, help-gnu-emacs

Great question.

(Computers are good at doing this kind of task...
Sounds like a good project.)



* Re: Exploring a code base?
  2020-10-27 11:38 Exploring a code base? Yuri Khan
                   ` (2 preceding siblings ...)
  2020-10-27 15:55 ` Drew Adams
@ 2020-10-27 20:56 ` Dmitry Gutov
  2020-11-07 13:26   ` Yuri Khan
  2020-10-27 20:59 ` Perry Smith
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 19+ messages in thread
From: Dmitry Gutov @ 2020-10-27 20:56 UTC (permalink / raw)
  To: Yuri Khan, help-gnu-emacs

On 27.10.2020 13:38, Yuri Khan wrote:

> Basically what I’m doing is traversal of a graph, where nodes are type
> and function definitions, and edges are relationships such as
> “function <calls> function”, “function <accepts> type”, “function
> <returns> type”, “type <derives from> type”, “type <aggregates> type”,
> etc.
> 
> When the change I’m doing is not very invasive, the affected subgraph
> fits completely in my head. However, when it doesn’t, I find myself
> having to record my traversal state. I create an Org buffer and
> manually maintain a queue of nodes, marking those I haven’t yet
> visited with TODO and those I have with DONE. Then I pick the first
> TODO, grep or xref-find-references on it, add any relevant nodes to
> the queue, make the necessary changes in the code, and mark the node
> DONE. Repeat until no TODO.

Speaking of Xref, we could add some new commands: to remove items from 
the list, to undo removals. And a stack for searches, so you could go
back to the previous search result. Not sure how much that will help.
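
To make the idea concrete, here is a small Python model of such a result list and search stack. It is only a sketch of the data structure, not of Xref itself; the class and method names (`ResultList`, `SearchStack`, `back`) are invented for this example:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ResultList:
    """One search and its remaining hits, with an undo log for removals."""
    query: str
    items: List[str]
    removed: List[Tuple[int, str]] = field(default_factory=list)

    def remove(self, index: int) -> None:
        # Remember where the item was so that undo can restore it in place.
        self.removed.append((index, self.items.pop(index)))

    def undo(self) -> None:
        if self.removed:
            index, item = self.removed.pop()
            self.items.insert(index, item)


class SearchStack:
    """Searches in the order they were started; `back` returns to the previous one."""

    def __init__(self) -> None:
        self._stack: List[ResultList] = []

    def push(self, query: str, items: List[str]) -> ResultList:
        result = ResultList(query, list(items))
        self._stack.append(result)
        return result

    def back(self) -> Optional[ResultList]:
        if self._stack:
            self._stack.pop()  # drop the current search
        return self._stack[-1] if self._stack else None


stack = SearchStack()
foo = stack.push("foo", ["a.py:1", "b.py:2", "c.py:3"])
foo.remove(1)                     # this hit has been dealt with
stack.push("bar", ["d.py:4"])     # a detour into another symbol
previous = stack.back()           # return to the "foo" results
print(previous.items)
previous.undo()                   # the removal was a mistake
print(previous.items)
```

Keeping the undo log inside each result list means that returning to an earlier search also brings back its removal history.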

> This is rather tedious. It feels like there should exist a better way,
> maybe with a visualization of the graph structure.
> 
> What do you use to explore and map a code base and perform extensive
> changes on it?

I don't have a solution, personally, and I usually work in a dynamic 
language where this isn't a very feasible thing to do.

But the feature in question sounds intriguing. Here's a couple things 
for C/C++ I found with a brief search:

* https://github.com/beacoder/call-graph uses GNU Global. It has a 
tree-based Emacs interface. Could be a bit immature/use some help with 
development, looking at the issues list.

* Here's a recipe for a graphical call graph: 
https://stackoverflow.com/a/5373814/615245 It is probably not exactly 
what you wanted, but the intermediate created by Clang could serve as a 
better data source than Global if someone tried to create a new Emacs 
based UI for this.

If you find any of this useful, please share your experience.

* Re: Exploring a code base?
  2020-10-27 20:56 ` Dmitry Gutov
@ 2020-11-07 13:26   ` Yuri Khan
  2020-11-07 13:56     ` Eli Zaretskii
  2020-11-07 19:40     ` Dmitry Gutov
  0 siblings, 2 replies; 19+ messages in thread
From: Yuri Khan @ 2020-11-07 13:26 UTC (permalink / raw)
  To: Dmitry Gutov; +Cc: help-gnu-emacs

On Wed, 28 Oct 2020 at 03:56, Dmitry Gutov <dgutov@yandex.ru> wrote:

> Speaking of Xref, we could add some new commands: to remove items from
> the list, to undo removals. And a stack for searches, so you could go
> back to the previous search result. Not sure how much that will help.

A stack/history would be very nice, yeah. To the point that I actually
caught myself trying to press ‘l’ in xref and grep result buffers.

> I don't have a solution, personally, and I usually work in a dynamic
> language where this isn't a very feasible thing to do.

“We know that it has no solution, too. But we wish to learn how to solve it.”

I’m working in Python, which is a dynamic language. Theoretically, in
Python one does not have to declare argument and return types, and the
call graph can change its structure at run time because functions are
first-class values and because of class-based polymorphism. In
practice, my project is mostly type-annotated (with mypy
sanity-checking the annotations), and the use of polymorphism and
dynamic binding is limited.

> * https://github.com/beacoder/call-graph uses GNU Global.
> * Here's a recipe for a graphical call graph:

Yeah, there exist many tools that attempt to take in the whole project
and generate a complete call graph. In my experience, most of the
time, for any project more complex than Hello World, the resulting
graph is too messy to be helpful.

For one thing, I am not interested in all calls, only those through
which a particular data type flows as an argument or [part of] the
return value. This subgraph is smaller and simpler, and actually has a
chance to embed into a plane without many edge intersections.
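
For a type-annotated Python project, that narrower set of nodes can be approximated from the annotations alone. The sketch below uses the standard `ast` module to list the functions whose parameter or return annotations mention a given type; the sample functions are made up, and string annotations such as "Foo" are deliberately not handled:

```python
import ast
from typing import List


def mentions(annotation: ast.AST, type_name: str) -> bool:
    """True if `type_name` appears anywhere inside the annotation expression."""
    return any(
        isinstance(node, ast.Name) and node.id == type_name
        for node in ast.walk(annotation)
    )


def functions_touching(source: str, type_name: str) -> List[str]:
    """Names of functions that accept or return `type_name`, per their annotations."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        params = list(args.posonlyargs) + list(args.args) + list(args.kwonlyargs)
        params += [p for p in (args.vararg, args.kwarg) if p is not None]
        annotations = [p.annotation for p in params] + [node.returns]
        if any(mentions(a, type_name) for a in annotations if a is not None):
            hits.append(node.name)
    return hits


SOURCE = """\
from typing import Optional

def make_foo() -> Foo: ...
def consume_foo(foo: Foo) -> None: ...
def maybe_foo(flag: bool) -> Optional[Foo]: ...
def unrelated(x: int) -> int: ...
"""

print(functions_touching(SOURCE, "Foo"))
```

Because the whole annotation expression is searched, `Optional[Foo]` counts as a match, which covers the "[part of] the return value" case.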

Eric S. Raymond once wrote up[1] the difference between automatons
(programs that attempt to solve the problem fully without human
involvement) and judgment amplifiers (programs that help the human
solve the problem by automating parts of the process). I think I’m
looking for a tool in the latter category.

[1]: http://esr.ibiblio.org/?p=7032

* Re: Exploring a code base?
  2020-11-07 13:26   ` Yuri Khan
@ 2020-11-07 13:56     ` Eli Zaretskii
  2020-11-07 14:33       ` Gregory Heytings via Users list for the GNU Emacs text editor
  2020-11-07 19:40     ` Dmitry Gutov
  1 sibling, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2020-11-07 13:56 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Yuri Khan <yuri.v.khan@gmail.com>
> Date: Sat, 7 Nov 2020 20:26:16 +0700
> Cc: help-gnu-emacs <help-gnu-emacs@gnu.org>
> 
> Yeah, there exist many tools that attempt to take in the whole project
> and generate a complete call graph. In my experience, most of the
> time, for any project more complex than Hello World, the resulting
> graph is too messy to be helpful.
> 
> For one thing, I am not interested in all calls, only those through
> which a particular data type flows as an argument or [part of] the
> return value. This subgraph is smaller and simpler, and actually has a
> chance to embed into a plane without many edge intersections.

Maybe you could step back and describe in a bit more detail what kind
of workflow you are trying to support, and how that contributes to the
kind of code-base exploring job you have in mind.  I do this quite a
lot, and IME a combination of M-. and M-? (with the latter using the
back-end of ID Utils, if possible) is entirely adequate.  In
particular, I don't think I was ever in need of a graph to
refactor a data type; I only needed to examine its uses.

Basically, I'm asking why having a flat list of all the users of a
data type, and reading the code of all of them, is not enough for this
kind of job?  What am I missing?

* Re: Exploring a code base?
  2020-11-07 13:56     ` Eli Zaretskii
@ 2020-11-07 14:33       ` Gregory Heytings via Users list for the GNU Emacs text editor
  2020-11-07 14:47         ` Eli Zaretskii
  0 siblings, 1 reply; 19+ messages in thread
From: Gregory Heytings via Users list for the GNU Emacs text editor @ 2020-11-07 14:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: help-gnu-emacs

>
> Maybe you could step back and describe in a bit more detail what kind of 
> workflow you are trying to support, and how that contributes to the kind 
> of code-base exploring job you have in mind.  I do this quite a lot, and 
> IME a combination of M-. and M-? (with the latter using the back-end of 
> ID Utils, if possible) is entirely adequate.  In particular, I don't 
> think I was ever in need of a graph to refactor a data type; I
> only needed to examine its uses.
>
> Basically, I'm asking why having a flat list of all the users of a data 
> type, and reading the code of all of them, is not enough for this kind 
> of job?  What am I missing?
>

IIUC, Yuri, Stefan and Dmitry all described the same problem, each in
their own words: using M-. and M-? works fine for simple tasks (say, 
change the name of a given function or struct field in a codebase, or see 
where a given type is used), but when you are tackling a more complex task 
you often have to put searches on a kind of "stack", on which you can save 
your current search state, and from which you can resume your current 
search at a later stage. Say, you are making changes to the occurrences of 
"foo", and at some point during that work you see that you need to change 
something around the occurrences of "bar", and at some point during that 
work you see that you need to change the occurrences of "baz", and so 
forth.

* Re: Exploring a code base?
  2020-11-07 14:33       ` Gregory Heytings via Users list for the GNU Emacs text editor
@ 2020-11-07 14:47         ` Eli Zaretskii
  2020-11-07 15:32           ` Gregory Heytings via Users list for the GNU Emacs text editor
  0 siblings, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2020-11-07 14:47 UTC (permalink / raw)
  To: help-gnu-emacs

> Date: Sat, 07 Nov 2020 14:33:25 +0000
> From: Gregory Heytings <ghe@sdf.org>
> cc: help-gnu-emacs@gnu.org
> 
> > Basically, I'm asking why having a flat list of all the users of a data 
> > type, and reading the code of all of them, is not enough for this kind 
> > of job?  What am I missing?
> 
> IIUC, Yuri, Stefan and Dmitry all described the same problem, each in
> their own words

I've read the entire thread, thank you.

>                 using M-. and M-? works fine for simple tasks (say, 
> change the name of a given function or struct field in a codebase, or see 
> where a given type is used), but when you are tackling a more complex task 
> you often have to put searches on a kind of "stack", on which you can save 
> your current search state, and from which you can resume your current 
> search at a later stage. Say, you are making changes to the occurrences of 
> "foo", and at some point during that work you see that you need to change 
> something around the occurrences of "bar", and at some point during that 
> work you see that you need to change the occurrences of "baz", and so 
> forth.

When that happens, it means you should change the search criteria.
For example, if you looked for a variable, but found that more than
one variable is involved, you need to search for the data type (if all
the variables use the same data type), or search for several
identifiers instead of just one.  The tools mentioned all support such
multiple searches, so I still don't think I understand the problem.

I used this technique in some pretty complex projects (as in: about 1
million code lines of sophisticated C++), and it scaled well.

As for discovering that a problem includes more places and symbols
than one originally envisioned: one can maintain a flat list of
functions/modules/whatever to look into, and add to that as more stuff
is being found.  You can do it in Org or even in a simple text buffer.
If there's really a need to maintain some structure, one can use Org's
level headings to that end.

Bottom line: I still don't see the problem, and my question above
still stands.

* Re: Exploring a code base?
  2020-11-07 14:47         ` Eli Zaretskii
@ 2020-11-07 15:32           ` Gregory Heytings via Users list for the GNU Emacs text editor
  2020-11-07 15:52             ` Stefan Monnier
  0 siblings, 1 reply; 19+ messages in thread
From: Gregory Heytings via Users list for the GNU Emacs text editor @ 2020-11-07 15:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: help-gnu-emacs


>
> As for discovering that a problem includes more places and symbols than 
> one originally envisioned: one can maintain a flat list of 
> functions/modules/whatever to look into, and add to that as more stuff 
> is being found.  You can do it in Org or even in a simple text buffer. 
> If there's really a need to maintain some structure, one can use Org's 
> level headings to that end.
>

Yes, this is almost word for word what Yuri explained in his first mail, 
and Stefan in his reply.  Yuri says that doing this is "tedious", Stefan 
that it is "not satisfactory".  So the question is: could this not be 
automated with a kind of stack or list of searches, in which one could 
navigate, instead of using a separate flat or structured text file?



* Re: Exploring a code base?
  2020-11-07 15:32           ` Gregory Heytings via Users list for the GNU Emacs text editor
@ 2020-11-07 15:52             ` Stefan Monnier
  2020-11-07 15:58               ` Eli Zaretskii
  2020-11-07 19:23               ` Dmitry Gutov
  0 siblings, 2 replies; 19+ messages in thread
From: Stefan Monnier @ 2020-11-07 15:52 UTC (permalink / raw)
  To: help-gnu-emacs

> Yes, this is almost word for word what Yuri explained in his first mail, and
> Stefan in his reply.  Yuri says that doing this is "tedious", Stefan that it
> is "not satisfactory".  So the question is: could this not be automated with
> a kind of stack or list of searches, in which one could navigate, instead of
> using a separate flat or structured text file?

BTW, I think the solution should likely not be connected to "code" in
any way.  Fundamentally, the same problem can appear where you're
reviewing a text document (tho it's admittedly less common).

So, all that's needed, I think, is some way to be able to maintain
a stack/tree of annotated bookmarks, with a command to "add an element"
which can be used from any buffer, and then some way to view the tree
and modify it.

Most likely, all it takes is a command which adds an entry to some
"central" Org file (probably a TODO entry) and another to quickly
display that Org file.
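
As a rough illustration of what such a command would write, the Python sketch below formats a location and a note as an Org TODO entry and appends it to a file. The function names and the sample path are hypothetical; the `[[file:PATH::LINE]]` form is Org's syntax for a link to a line in a file:

```python
from pathlib import Path


def org_todo_entry(path: str, line: int, note: str, level: int = 1) -> str:
    """Format one pending item as an Org heading followed by a file link."""
    stars = "*" * level
    return f"{stars} TODO {note}\n[[file:{path}::{line}]]\n"


def push_pending(org_file: Path, path: str, line: int, note: str) -> None:
    """Append the entry to the central Org file, creating the file if needed."""
    with org_file.open("a", encoding="utf-8") as handle:
        handle.write(org_todo_entry(path, line, note))


print(org_todo_entry("src/foo.py", 42, "check callers of make_foo"), end="")
```

The `level` parameter leaves room for the tree structure: a sub-issue discovered while working on an entry can be written one level deeper.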

        Stefan

* Re: Exploring a code base?
  2020-11-07 15:52             ` Stefan Monnier
@ 2020-11-07 15:58               ` Eli Zaretskii
  2020-11-07 17:24                 ` Eric Abrahamsen
  2020-11-07 19:23               ` Dmitry Gutov
  1 sibling, 1 reply; 19+ messages in thread
From: Eli Zaretskii @ 2020-11-07 15:58 UTC (permalink / raw)
  To: help-gnu-emacs

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 07 Nov 2020 10:52:01 -0500
> 
> Most likely, all it takes is a command which adds an entry to some
> "central" Org file (probably a TODO entry) and another to quickly
> display that Org file.

FWIW, when I need such facilities, I basically have the Org file where
I keep my notes visible in some window (or even in a separate frame)
at all times.  So a "command to quickly display that Org file" boils
down to raising a frame at most, if the window is obscured.



* Re: Exploring a code base?
  2020-11-07 15:58               ` Eli Zaretskii
@ 2020-11-07 17:24                 ` Eric Abrahamsen
  0 siblings, 0 replies; 19+ messages in thread
From: Eric Abrahamsen @ 2020-11-07 17:24 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: help-gnu-emacs

Eli Zaretskii <eliz@gnu.org> writes:

>> From: Stefan Monnier <monnier@iro.umontreal.ca>
>> Date: Sat, 07 Nov 2020 10:52:01 -0500
>> 
>> Most likely, all it takes is a command which adds an entry to some
>> "central" Org file (probably a TODO entry) and another to quickly
>> display that Org file.
>
> FWIW, when I need such facilities, I basically have the Org file where
> I keep my notes visible in some window (or even in a separate frame)
> at all times.  So a "command to quickly display that Org file" boils
> down to raising a frame at most, if the window is obscured.

I use custom Org link types when editing novels or completing my own
novel translations, linking to various editing "issues", and displaying
them in a variety of ways. As Stefan notes, the need to handle multiple
interconnected changes isn't specific to code. Perhaps there could be
custom org link types that perform xref searches, or perform more
complex semantic searches like "all instances of function `foo' that
still use the old calling convention" and display using occur.

* Re: Exploring a code base?
  2020-11-07 15:52             ` Stefan Monnier
  2020-11-07 15:58               ` Eli Zaretskii
@ 2020-11-07 19:23               ` Dmitry Gutov
  1 sibling, 0 replies; 19+ messages in thread
From: Dmitry Gutov @ 2020-11-07 19:23 UTC (permalink / raw)
  To: Stefan Monnier, help-gnu-emacs

On 07.11.2020 17:52, Stefan Monnier wrote:
> So, all that's needed, I think, is some way to be able to maintain
> a stack/tree of annotated bookmarks, with a command to "add an element"
> which can be used from any buffer, and then some way to view the tree
> and modify it.

If it were some tree UI in a buffer with Xref-like interface, one could 
add branches automatically by specifying an Xref search to do (the 
results would be added as a branch to the current node), or clean up 
said branches automatically by re-running the associated search (when 
you have done all the renamings, the search would return nil, and the 
branch could be deleted, provided there are no subtrees).
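
A toy model of that behaviour, in Python rather than Emacs Lisp: each node keeps the search that produced it, and refreshing the tree re-runs the searches and drops branches that have no results and no subtrees left. The `Node` class and the `references` table are invented for this sketch and only simulate real searches:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Search = Callable[[], List[str]]


@dataclass
class Node:
    label: str
    search: Search
    results: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    def add_branch(self, label: str, search: Search) -> "Node":
        """Run `search` and attach its results as a new branch under this node."""
        child = Node(label, search, results=search())
        self.children.append(child)
        return child

    def refresh(self) -> None:
        """Re-run all searches; delete branches with no results and no subtrees."""
        self.results = self.search()
        for child in self.children:
            child.refresh()
        self.children = [c for c in self.children if c.results or c.children]


# Simulated code base: symbol -> locations that still need attention.
references: Dict[str, List[str]] = {
    "foo": ["a.py:3", "b.py:9"],
    "bar": ["c.py:1"],
}

root = Node("foo", lambda: list(references["foo"]))
root.refresh()
root.add_branch("bar", lambda: list(references["bar"]))

references["bar"] = []  # every use of bar has been fixed
root.refresh()
print([child.label for child in root.children])
```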

This needs a lot of experimentation, though.

* Re: Exploring a code base?
  2020-11-07 13:26   ` Yuri Khan
  2020-11-07 13:56     ` Eli Zaretskii
@ 2020-11-07 19:40     ` Dmitry Gutov
  1 sibling, 0 replies; 19+ messages in thread
From: Dmitry Gutov @ 2020-11-07 19:40 UTC (permalink / raw)
  To: Yuri Khan; +Cc: help-gnu-emacs

On 07.11.2020 15:26, Yuri Khan wrote:
> On Wed, 28 Oct 2020 at 03:56, Dmitry Gutov <dgutov@yandex.ru> wrote:
> 
>> Speaking of Xref, we could add some new commands: to remove items from
>> the list, to undo removals. And a stack for searches, so you could go
>> back to the previous search result. Not sure how much that will help.
> 
> A stack/history would be very nice, yeah. To the point that I actually
> caught myself trying to press ‘l’ in xref and grep result buffers.

This shouldn't be too difficult to implement either. You can start with
a bug report, and then we'll see who has the time to work on it first.

>> I don't have a solution, personally, and I usually work in a dynamic
>> language where this isn't a very feasible thing to do.
> 
> “We know that it has no solution, too. But we wish to learn how to solve it.”

;-)

> I’m working in Python, which is a dynamic language. Theoretically, in
> Python one does not have to declare argument and return types, and the
> call graph can change its structure at run time because functions are
> first-class values and because of class-based polymorphism. In
> practice, my project is mostly type-annotated (with mypy
> sanity-checking the annotations), and the use of polymorphism and
> dynamic binding is limited.

Python still has it easier than Ruby, because at least you have to 
explicitly import all packages and/or used functions in a given file. 
That means you can generally be sure where any function and class 
definition is from.

Of course, heavily dynamic code throws a wrench into this, but it's 
usually in the minority.

>> * https://github.com/beacoder/call-graph uses GNU Global.
>> * Here's a recipe for a graphical call graph:
> 
> Yeah, there exist many tools that attempt to take in the whole project
> and generate a complete call graph. In my experience, most of the
> time, for any project more complex than Hello World, the resulting
> graph is too messy to be helpful.

The call-graph features that looked relevant to me are how (looking at 
the list of commands and the gifs) you manually choose which nodes in 
the tree to expand further (which is necessary, since otherwise the tree 
can become very wide/unmanageable), as well as remove elements from it 
interactively.

But yeah, in the current state it's definitely not ideal.

> Eric S. Raymond once wrote up[1] the difference between automatons
> (programs that attempt to solve the problem fully without human
> involvement) and judgment amplifiers (programs that help the human
> solve the problem by automating parts of the process). I think I’m
> looking for a tool in the latter category.

It's a more difficult category, too, I think.

One has to balance the necessary features for the task at hand, and 
flexibility (to enable some set of related tasks), and efficiency as 
well (to minimize the manual work anyway).

* Re: Exploring a code base?
  2020-10-27 11:38 Exploring a code base? Yuri Khan
                   ` (3 preceding siblings ...)
  2020-10-27 20:56 ` Dmitry Gutov
@ 2020-10-27 20:59 ` Perry Smith
  2020-10-27 22:53 ` Daniel Martín
  2020-10-28  0:59 ` Skip Montanaro
  6 siblings, 0 replies; 19+ messages in thread
From: Perry Smith @ 2020-10-27 20:59 UTC (permalink / raw)
  To: Yuri Khan; +Cc: help-gnu-emacs

For C, cscope, together with a package that interfaces it to Emacs, is great.

* Re: Exploring a code base?
  2020-10-27 11:38 Exploring a code base? Yuri Khan
                   ` (4 preceding siblings ...)
  2020-10-27 20:59 ` Perry Smith
@ 2020-10-27 22:53 ` Daniel Martín
  2020-10-27 23:15   ` Stefan Monnier
  2020-10-28  0:59 ` Skip Montanaro
  6 siblings, 1 reply; 19+ messages in thread
From: Daniel Martín @ 2020-10-27 22:53 UTC (permalink / raw)
  To: Yuri Khan; +Cc: help-gnu-emacs

Yuri Khan <yuri.v.khan@gmail.com> writes:
>
> When the change I’m doing is not very invasive, the affected subgraph
> fits completely in my head. However, when it doesn’t, I find myself
> having to record my traversal state. I create an Org buffer and
> manually maintain a queue of nodes, marking those I haven’t yet
> visited with TODO and those I have with DONE. Then I pick the first
> TODO, grep or xref-find-references on it, add any relevant nodes to
> the queue, make the necessary changes in the code, and mark the node
> DONE. Repeat until no TODO.
>
> This is rather tedious. It feels like there should exist a better way,
> maybe with a visualization of the graph structure.
>
> What do you use to explore and map a code base and perform extensive
> changes on it?

It depends on the programming language, but I usually rely on a compiler
to generate an index for my project in the background.  An index is
similar to a TAGS file, except that it was generated by an actual
compiler and therefore all code relationships (child of, parent of,
referenced by, aggregate of, etc.) are stored in the index and are
usually very accurate.
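
Conceptually, such an index is a set of (subject, relation, object) facts that can be queried by relation. The Python sketch below is a toy model of that idea only; it does not reflect the on-disk format of any real indexer, and the class name and sample facts are made up:

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple


class CodeIndex:
    """Stores facts such as ("make_foo", "returns", "Foo") and answers reverse queries."""

    def __init__(self) -> None:
        self._edges: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self._edges[(relation, obj)].add(subject)

    def query(self, relation: str, obj: str) -> List[str]:
        """All subjects that stand in `relation` to `obj`, sorted by name."""
        return sorted(self._edges.get((relation, obj), ()))


index = CodeIndex()
index.add("make_foo", "returns", "Foo")
index.add("consume_foo", "accepts", "Foo")
index.add("report", "calls", "make_foo")
index.add("SpecialFoo", "derives from", "Foo")

print(index.query("returns", "Foo"))
print(index.query("calls", "make_foo"))
```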

There are many tools that can "visualize" code indexes.  For C/C++
languages, there is Sourcetrail
(https://github.com/CoatiSoftware/Sourcetrail), for example.  Clangd
(https://clangd.llvm.org), a language server for the C family of
languages, can also generate an index that you can use to perform simple
refactorings, like a rename, or to ask for the call graph of some
function.  Clangd is usable from Emacs via Eglot or lsp-mode.

This code index approach still has limitations, though, like how to make
sure that the index is completely up to date before a query, or how to
scale it to codebases of millions of lines of code, where creating an
index on a single machine is usually not possible.  But I think a
compiler-generated index may be a good trade-off over using grep +
keeping track of things manually.


* Re: Exploring a code base?
  2020-10-27 22:53 ` Daniel Martín
@ 2020-10-27 23:15   ` Stefan Monnier
  0 siblings, 0 replies; 19+ messages in thread
From: Stefan Monnier @ 2020-10-27 23:15 UTC (permalink / raw)
  To: help-gnu-emacs

> It depends on the programming language, but I usually rely on a compiler
> to generate an index for my project in the background.

IIUC Yuri's situation uses such an "index" but needs something more
because the "overall goal" requires changes in various pieces of code.
So while the index helps you find which pieces of code need to be
changed, it doesn't necessarily help you keep track of what still needs
to be done.

E.g.

- I change some function to use `syntax-ppss`.
- now some old function `foo1` is not needed any more, so I go look at
  all the functions called by `foo1` to see if they're still needed.
- along the way I see that `foo2`'s third argument is now always nil,
  so I simplify `foo2`'s code and find that its second argument is not
  used any more, so I go look for all the callers of `foo2` so they
  don't bother passing a second argument.
- while doing that I see that in `foo3` I can now do some other
  simplification which makes `foo4` into a dead function, so I start
  looking at all the functions called by `foo4` to see if they're
  still needed.
- along the way I realize that I'm late for a meeting with a student.
- next day I come back to this code and wonder where I was, which part
  of the fallout from the simplifications from `foo1`, `foo2`, `foo3`,
  and `foo4` has already been done, ...
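
The bookkeeping problem in the steps above can be sketched as a small persistent worklist. This is only an illustration in Python, not an existing Emacs facility: the `Worklist` class and its method names are hypothetical, and the JSON text stands in for whatever file or Org buffer would hold the state between sessions.

```python
import json
from collections import deque


class Worklist:
    """Queue of code locations to revisit, with TODO/DONE state."""

    def __init__(self):
        self.todo = deque()  # nodes still to inspect, in discovery order
        self.done = []       # nodes already handled
        self.reason = {}     # node -> why it was queued

    def add(self, node, reason=""):
        """Queue a node unless it is already queued or finished."""
        if node in self.reason:
            return False
        self.reason[node] = reason
        self.todo.append(node)
        return True

    def peek(self):
        """Oldest unfinished node, or None when nothing is left."""
        return self.todo[0] if self.todo else None

    def finish(self, node):
        """Move a node from TODO to DONE."""
        self.todo.remove(node)
        self.done.append(node)

    def dumps(self):
        """Serialize the whole state so it survives an interruption."""
        return json.dumps(
            {"todo": list(self.todo), "done": self.done, "reason": self.reason}
        )

    @classmethod
    def loads(cls, text):
        """Rebuild a worklist from text produced by dumps()."""
        data = json.loads(text)
        work = cls()
        work.todo = deque(data["todo"])
        work.done = data["done"]
        work.reason = data["reason"]
        return work


work = Worklist()
work.add("foo1", "no longer needed; check the functions it calls")
work.add("foo2", "second argument unused; update all callers")
work.add("foo4", "dead after simplifying foo3; check the functions it calls")
work.finish("foo1")

saved = work.dumps()              # state written out before the meeting
resumed = Worklist.loads(saved)   # state read back the next day
print(resumed.peek(), "-", resumed.reason[resumed.peek()])
```

Recording the reason next to each node is the part an index alone does not give you: it answers "why was I looking at this?" as well as "what is left?".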


        Stefan





* Re: Exploring a code base?
  2020-10-27 11:38 Exploring a code base? Yuri Khan
                   ` (5 preceding siblings ...)
  2020-10-27 22:53 ` Daniel Martín
@ 2020-10-28  0:59 ` Skip Montanaro
  6 siblings, 0 replies; 19+ messages in thread
From: Skip Montanaro @ 2020-10-28  0:59 UTC (permalink / raw)
  To: Yuri Khan; +Cc: help-gnu-emacs

Yuri,

Have you explored the possible code refactoring tools available within
Emacs? It's quite possible such tools miss the mark you're aiming at, but
perhaps they can be extended.

Skip Montanaro



Thread overview: 19+ messages
2020-10-27 11:38 Exploring a code base? Yuri Khan
2020-10-27 11:58 ` Christopher Dimech
2020-10-27 14:15 ` Stefan Monnier
2020-10-27 15:55 ` Drew Adams
2020-10-27 20:56 ` Dmitry Gutov
2020-11-07 13:26   ` Yuri Khan
2020-11-07 13:56     ` Eli Zaretskii
2020-11-07 14:33       ` Gregory Heytings via Users list for the GNU Emacs text editor
2020-11-07 14:47         ` Eli Zaretskii
2020-11-07 15:32           ` Gregory Heytings via Users list for the GNU Emacs text editor
2020-11-07 15:52             ` Stefan Monnier
2020-11-07 15:58               ` Eli Zaretskii
2020-11-07 17:24                 ` Eric Abrahamsen
2020-11-07 19:23               ` Dmitry Gutov
2020-11-07 19:40     ` Dmitry Gutov
2020-10-27 20:59 ` Perry Smith
2020-10-27 22:53 ` Daniel Martín
2020-10-27 23:15   ` Stefan Monnier
2020-10-28  0:59 ` Skip Montanaro
