visualization of the chemical mechanism #2

Open
RolfSander opened this issue Oct 1, 2021 · 12 comments

@RolfSander
Contributor

Automatic visualization of the chemical mechanism with graphviz.

@RolfSander RolfSander added the feature New feature or request label Oct 1, 2021
@RolfSander RolfSander self-assigned this Oct 1, 2021
@yantosca yantosca added the tools Ancillary tools for KPP (scripts, visualization, etc) label Oct 13, 2021
@RolfSander RolfSander added this to the 3.0.0 milestone Apr 23, 2022
yantosca added a commit that referenced this issue Apr 27, 2022
In the prior commit, we had removed the old code but did not commit
the new code to zero the C array.  Now fixed.

Signed-off-by: Bob Yantosca <[email protected]>
@jimmielin
Member

I just wanted to confirm this is still under the 3.0.0 target - are there any blockers for implementing this within KPP, besides requiring atomic composition to be specified for each species? This could be a nice feature and a demo could be included in the KPP documentation as well. Thanks!

@RolfSander RolfSander modified the milestones: 3.0.0, 4.0.0 Jun 22, 2022
@RolfSander
Contributor Author

The atomic composition is indeed the only reason why I haven't started
with the visualization yet. Plotting the whole mechanism makes no sense:
you wouldn't be able to see anything in that plot. Currently, my
visualization code is very MECCA-specific. It makes plots for several
subsets of the reaction mechanism, for example one plot for bromine
chemistry and one plot for chlorine chemistry. Even if I start now, it
would take some time to make the code independent of MECCA. I have moved
the milestone to 4.0.0 now.

@obin1
Member

obin1 commented Mar 30, 2023

> Automatic visualization of the chemical mechanism with graphviz.

I was inspired by this idea, so here's a step towards graphviz compatibility: obin1@37ef030. This addition creates a DOT language file of the mechanism. Once the automatically generated header is removed, the file can be used with graphviz, e.g. dot -Tpng small_strato_SpeciesReactionGraph.gv > small_strato.png, to visualize the mechanism as a bipartite species-reaction graph (see the example below for KPP's small_strato). In other work I've used the biadjacency matrix to represent the same type of graph, and this fork also writes that format, but I think DOT might be more widely used for network visualization.
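As a rough illustration of what such a DOT file can contain, here is a minimal Python sketch that writes a bipartite species-reaction graph. It is not the code from the fork: the reaction list is a hand-typed toy loosely based on small_strato, and the function name `to_dot`, the node shapes, and the use of edge labels for stoichiometric coefficients are all assumptions made for this example.

```python
# Toy mechanism: (label, reactants, products). The entries loosely follow
# KPP's small_strato example but are illustrative, not parsed from an .eqn file.
REACTIONS = [
    ("R1", ["O2"], ["O", "O"]),
    ("R2", ["O", "O2"], ["O3"]),
    ("R3", ["O3"], ["O", "O2"]),
    ("R4", ["NO", "O3"], ["NO2", "O2"]),
]


def to_dot(reactions, name="mechanism"):
    """Return DOT text for a bipartite species-reaction digraph."""
    species = sorted({s for _, lhs, rhs in reactions for s in lhs + rhs})
    lines = [f"digraph {name} {{"]
    # Species are drawn as ellipses and reactions as boxes, so the two
    # node sets of the bipartite graph stay visually distinct.
    for s in species:
        lines.append(f'  "{s}" [shape=ellipse];')
    for label, _, _ in reactions:
        lines.append(f'  "{label}" [shape=box];')
    # Edges run reactant -> reaction -> product. A repeated species (e.g. 2 O)
    # becomes one edge whose label carries the stoichiometric coefficient.
    for label, lhs, rhs in reactions:
        for s in sorted(set(lhs)):
            lines.append(f'  "{s}" -> "{label}" [label="{lhs.count(s)}"];')
        for s in sorted(set(rhs)):
            lines.append(f'  "{label}" -> "{s}" [label="{rhs.count(s)}"];')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(REACTIONS)
print(dot_text)
```

Saving `dot_text` to a .gv file and passing it to dot -Tpng, as in the command above, should render the graph.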

Explicitly including atomic composition could be useful, but I'm not sure it'd be essential for this: isn't stoichiometry already (hopefully) implied in the .eqn file? If we only want to plot a subset (an induced subgraph) of the reaction mechanism, maybe the submechanism could be chosen by some sort of mask when writing the files. Could existing tools, like families, be expanded to make this possible? I know that @emyli19 is working on Python tools to visualize subsets of mechanisms, but this might also be a useful tool directly in KPP. Would people still be interested in something like this?

[Figure: bipartite species-reaction graph of KPP's small_strato mechanism]
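The mask idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reaction list is a hypothetical toy, `submechanism` and its `require_all` switch are names invented here, and KPP has no such option at present.

```python
# Hypothetical toy mechanism: (label, reactants, products).
REACTIONS = [
    ("R1", ["O3"], ["O1D", "O2"]),
    ("R2", ["NO", "O3"], ["NO2", "O2"]),
    ("R3", ["NO2", "O"], ["NO", "O2"]),
    ("R4", ["Br", "O3"], ["BrO", "O2"]),
]


def submechanism(reactions, mask, require_all=False):
    """Keep the reactions selected by a set of species names (the mask).

    With require_all=False a reaction is kept if any reactant or product is
    in the mask. With require_all=True every species of the reaction must be
    in the mask, which gives the subgraph induced by the masked species.
    """
    test = all if require_all else any
    return [r for r in reactions if test(s in mask for s in r[1] + r[2])]


nitrogen = {"NO", "NO2"}
print([label for label, _, _ in submechanism(REACTIONS, nitrogen)])
```

The mask could come from a family definition or from elemental composition; which of the two selection rules is more useful probably depends on the plot.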

@RolfSander
Contributor Author

Hello @obin1, it's great to see your interest in graphs and mechanism
visualization! I think there are many aspects that we can discuss, so
I've tried to sort them somewhat...

  1. Graph type:

In my graphs, reactions are always represented as edges, whereas you are
generating bipartite species-reaction graphs (with reactions as nodes).
I think that both types have their own pros and cons for our purposes.
It's good to have code for both!

  2. Technical approach:

You have implemented your additions directly into the KPP C code. My
code is independent of the KPP program but it reads the KPP *.spc and
*.eqn input files. I think I will stick to my approach because it allows
me to code in Python (which I'm more familiar with) instead of C.

  3. Software:

Indeed, graphviz (dot) is probably the most widely used format for
network visualizations. I used to create my own dotfiles with awk, but
now I have switched to a python module called graph-tool
(https://graph-tool.skewed.de). It can use graphviz under the hood, and
in addition it has a large number of graph-theory related tools. For
example, it can find the most important chemical pathways from A to B
via an Edmonds-Karp algorithm.

  4. Creation of submechanisms (induced subgraphs):

With ever-increasing complexity of atmospheric chemical mechanisms, I
think that generating submechanisms will become more and more important.
In my code, I can choose between different criteria to define a
submechanism, e.g., based on elements, number of carbon atoms, or
picking all species involved in the reaction sequence from A to B.
Plotting a family as one node instead of showing all family members
individually is also a good idea which could be worth implementing.
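To make the Edmonds-Karp pathway search mentioned above more concrete, here is a self-contained Python sketch of the algorithm on a small species graph. It is not graph-tool's implementation and not the code described in this comment. The species names A, X, Y, B and the edge capacities (standing in for reaction rates in arbitrary units) are hypothetical, and the sketch assumes there are no antiparallel edge pairs.

```python
from collections import defaultdict, deque


def edmonds_karp(capacity, source, sink):
    """Return (max_flow, flow) for a dict-of-dicts capacity graph."""
    # Residual capacities, including zero-capacity reverse edges.
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual[u][v] += c
            residual[v][u] += 0
    total = 0
    while True:
        # Breadth-first search finds a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        # Walk back from the sink to collect the path edges.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        # Push the bottleneck capacity along the path.
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        total += bottleneck
    # Flow on each original edge is capacity minus remaining residual.
    flow = {(u, v): c - residual[u][v]
            for u, nbrs in capacity.items() for v, c in nbrs.items()
            if c - residual[u][v] > 0}
    return total, flow


# Hypothetical species graph; capacities stand in for reaction rates.
capacity = {"A": {"X": 3, "Y": 2}, "X": {"B": 2, "Y": 1}, "Y": {"B": 3}}
max_flow, flow = edmonds_karp(capacity, "A", "B")
print(max_flow, flow)
```

The edges that carry flow give one way to pick the species involved in the reaction sequence from A to B.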

@obin1
Member

obin1 commented Apr 7, 2023

Thanks for binning these topics, @RolfSander. I took a while to gather some thoughts; here they are:

  1. Graph type
    I agree there are uses for both bipartite and unipartite graphs. However, a bipartite graph can always be projected onto a unipartite graph, but the reverse is not always possible, so it might be best to start with a bipartite graph. I've found in some recent work that unipartite graphs are good for analyzing overall species-species relationships, but they lose insight on reactions unless a multigraph with an edge for each reaction is used. This can get messy, as unipartite graphs can also split the same reaction into multiple edges between different pairs of species (unless using some sort of hypergraph approach for edges that connect all involved species: at that point, why not just work in the bipartite space?)

  2. Technical approach
    Not just you -- I know a good number of people who have written their own parsers for .eqn and .spc files, but it seems like reinventing the wheel, especially as .eqn exists as input for an existing parsing tool. I thought it might be useful for future users to have this graph compatibility (for mechanism visualization but also other graph applications) directly built into a future version of KPP. What are your thoughts? I'd be happy to contribute to this if you think this would be a worthwhile feature.

  3. Software
    Thanks for the recommendation. We recently moved from igraph to networkx, but will check out graph-tool.

  4. I like these ideas! Tagging @emyli19 who has been developing some submechanism visualization tools in Python, some of these input options might be good to include as keyword arguments at some point
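The projection discussed under point 1 can be illustrated with a short Python sketch. The three reactions are a hypothetical toy, and `project` and `merge_parallel` are names made up for this example. The sketch shows both effects: one reaction is split into several species-species edges, and several reactions can land on the same pair of species.

```python
# Hypothetical bipartite mechanism: reaction label -> (reactants, products).
BIPARTITE = {
    "R1": (["NO", "O3"], ["NO2", "O2"]),
    "R2": (["NO2", "O"], ["NO", "O2"]),
    "R3": (["NO", "HO2"], ["NO2", "OH"]),
}


def project(bipartite):
    """Project onto species: one edge per (reactant, product, reaction)."""
    edges = []
    for label, (lhs, rhs) in bipartite.items():
        for reactant in lhs:
            for product in rhs:
                edges.append((reactant, product, label))
    return edges


def merge_parallel(edges):
    """Collapse the multigraph: (reactant, product) -> list of reactions."""
    merged = {}
    for reactant, product, label in edges:
        merged.setdefault((reactant, product), []).append(label)
    return merged


edges = project(BIPARTITE)
merged = merge_parallel(edges)
print(len(edges), merged[("NO", "NO2")])
```

Each reaction with two reactants and two products turns into four edges, and the reverse mapping from the merged graph back to reactions is no longer unique.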

@RolfSander
Contributor Author

Thanks for your comments. A few replies:

1. Graph type

I agree that the bipartite graph is a cleaner way to store all the
important information. However, for the visualization I prefer
unipartite graphs. Chemists expect to see reactions as arrows (i.e.,
directed edges) and not as nodes. A suitable approach for us could be to
create the bipartite graph as the master file, and then convert to
unipartite whenever needed.

2. Technical approach

I don't think that KPP is the right tool to perform any complex
graph-related operations. However, it would indeed be a very useful new
feature if KPP is able to create a graph that contains the full reaction
mechanism from the *.eqn file. A suitable way to save the graph could be
the XML-based GraphML format:

https://en.wikipedia.org/wiki/GraphML

http://graphml.graphdrawing.org/

Note that both networkx and graph-tool are able to read and write in
GraphML format.

3. Software

I quickly checked the Wikipedia page of networkx. It looks like a very
nice tool. However, with growing chemical mechanisms, the speed of
graph-tool compared to networkx could become important:

https://graph-tool.skewed.de/performance

@obin1
Member

obin1 commented Apr 11, 2023

To follow up on these:

  1. That sounds good, visualization is more intuitive (and potentially less messy) as a unipartite graph, which can be generated from the bipartite graph.
  2. I agree that it doesn't make sense to make KPP a network analysis library; there are already several good tools out there. But it is quite straightforward to build into KPP some graph preprocessing functionality while we parse the chemical mechanism. I wrote this initial example for the DOT format, which I found easier to code up in C, but I think both DOT and GraphML are readable and writable by both networkx and graph-tool. My next step is to include reciprocal reactions in the generated DOT format, which are currently left out of the biadjacency matrix but essential for other applications.
  3. Good tip -- I might move to graph-tool for some applications that need higher performance.

@RolfSander
Contributor Author

  1. OK. I think we can tick off this point. Let's create a bipartite
    master file.

  2. It seems that GraphML is more powerful than DOT. Apparently, edge and
    node properties can only be strings in DOT:

    https://graph-tool.skewed.de/static/doc/quickstart.html#graph-i-o

    This would be a severe limitation when I want to add the elemental
    composition of the species as python dictionaries. Therefore, I
    prefer GraphML. However, it would be a waste of code not to use the
    DOT output that you have already written. The solution for us could
    be to create a new KPP command that eventually will offer both
    options, e.g.:

    #GRAPH OFF (default)
    #GRAPH DOT
    #GRAPH GRAPHML
    #GRAPH ALL

    I found a nice document that describes the different formats for graphs:

    https://intranet.icar.cnr.it/wp-content/uploads/2018/12/RT-ICAR-PA-2018-06.pdf
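As an illustration of typed properties in GraphML, here is a minimal Python sketch that writes a bipartite master file with the standard library only. Everything specific in it is an assumption: the species compositions and the single reaction are toy input, and the key names `kind` and `n_<element>` are invented for this example. It stores one integer count per element rather than a Python dictionary, since int is one of the attribute types defined by GraphML itself.

```python
import xml.etree.ElementTree as ET

NS = "http://graphml.graphdrawing.org/xmlns"

# Hypothetical inputs: elemental composition per species and a reaction list.
SPECIES = {"NO": {"N": 1, "O": 1}, "O3": {"O": 3},
           "NO2": {"N": 1, "O": 2}, "O2": {"O": 2}}
REACTIONS = [("R1", ["NO", "O3"], ["NO2", "O2"])]


def to_graphml(species, reactions):
    """Serialize a bipartite species-reaction graph as GraphML text."""
    root = ET.Element("graphml", {"xmlns": NS})
    # Declare typed node properties: a string "kind" and one int per element.
    ET.SubElement(root, "key", {"id": "kind", "for": "node",
                                "attr.name": "kind", "attr.type": "string"})
    elements = sorted({e for comp in species.values() for e in comp})
    for e in elements:
        ET.SubElement(root, "key", {"id": f"n_{e}", "for": "node",
                                    "attr.name": f"n_{e}", "attr.type": "int"})
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "directed"})

    def add_node(node_id, props):
        node = ET.SubElement(graph, "node", {"id": node_id})
        for key, value in props.items():
            ET.SubElement(node, "data", {"key": key}).text = str(value)

    # Species nodes carry their element counts; reaction nodes only a kind.
    for name, comp in species.items():
        add_node(name, {"kind": "species",
                        **{f"n_{e}": comp.get(e, 0) for e in elements}})
    for label, lhs, rhs in reactions:
        add_node(label, {"kind": "reaction"})
        for s in lhs:
            ET.SubElement(graph, "edge", {"source": s, "target": label})
        for s in rhs:
            ET.SubElement(graph, "edge", {"source": label, "target": s})
    return ET.tostring(root, encoding="unicode")


xml_text = to_graphml(SPECIES, REACTIONS)
print(xml_text)
```

How networkx and graph-tool map these typed keys onto their own property types would need to be checked against each library before relying on it.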

@RolfSander
Contributor Author

Hello @obin1 and @emyli19,

A manuscript describing my mechanism explorer software is now open for
discussion:

https://doi.org/10.5194/egusphere-2023-1577

If you have any comments or suggestions, feel free to post a public
comment there.

My code takes KPP *.spc and *.eqn files as input and generates a
unipartite graph of the mechanism. If you are still interested in
creating a bipartite graph directly via KPP, I'd be more than happy to
add a new function that can read your graph into my MEXPLORER software.

@obin1
Member

obin1 commented Aug 9, 2023

Hi @RolfSander, we were actually just looking at this last week! Looks like a really useful tool, especially the interactive visualization.

Our projects are still using bipartite graphs, either in DOT format for reciprocal reactions or the biadjacency matrix for mass balancing in ML applications. I am using KPP to generate these formulations, so we are definitely interested in use of MEXPLORER. I would first like to clean up the way this is done, including adding the #GRAPH toggle you mentioned. I consider this motivation to clean up the features I've added to KPP :)

Section 2.3.2 is quite relevant to what we are working on: @emyli is quantifying cycles in GEOS-Chem in a bipartite context. I am curious how this is done in a unipartite context in MEXPLORER -- are parallel edges merged for calculation of the net reaction between species?
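For readers unfamiliar with the matrix form, the following Python sketch builds a signed species-by-reaction matrix and uses it for an element balance check. It is an illustration only and may differ from the format written by the KPP fork: the sign convention (products minus reactants), the toy reactions, the composition table, and the function names are all assumptions.

```python
# Hypothetical toy input.
SPECIES = ["NO", "O3", "NO2", "O2", "O"]
REACTIONS = [
    (["NO", "O3"], ["NO2", "O2"]),
    (["NO2", "O"], ["NO", "O2"]),
]
COMPOSITION = {"NO": {"N": 1, "O": 1}, "O3": {"O": 3},
               "NO2": {"N": 1, "O": 2}, "O2": {"O": 2}, "O": {"O": 1}}


def biadjacency(species, reactions):
    """Rows are species, columns are reactions, entries are net
    stoichiometric coefficients (products minus reactants)."""
    index = {s: i for i, s in enumerate(species)}
    matrix = [[0] * len(reactions) for _ in species]
    for j, (lhs, rhs) in enumerate(reactions):
        for s in lhs:
            matrix[index[s]][j] -= 1
        for s in rhs:
            matrix[index[s]][j] += 1
    return matrix


def element_balance(matrix, species, composition, element):
    """Net change of one element per reaction; all zeros means balanced."""
    return [sum(matrix[i][j] * composition[s].get(element, 0)
                for i, s in enumerate(species))
            for j in range(len(matrix[0]))]


matrix = biadjacency(SPECIES, REACTIONS)
print(matrix)
print(element_balance(matrix, SPECIES, COMPOSITION, "O"))
```

A species that appears on both sides of a reaction cancels out in this signed form, so an unsigned or two-matrix representation may be needed where that matters.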

@RolfSander
Contributor Author

Hello @obin1 and @emyli19,

> we were actually just looking at this last week! Looks like a really
> useful tool, especially the interactive visualization.

Thanks :-)

Let me know if you want to try it and have any questions about the
installation or usage!

> I would first like to clean up the way this is done, including adding
> the #GRAPH toggle you mentioned. I consider this motivation to clean
> up the features I've added to KPP :)

Great! I think the best way to proceed would be to create a separate
branch for you in the KPP repo where you can develop and test your code.
If the default setting (#GRAPH OFF) has no side effects for other KPP
users, we'd be happy to include the new feature in the main KPP
distribution.

> Section 2.3.2 is quite relevant to what we are working on: @emyli is
> quantifying cycles in GEOS-Chem in a bipartite context. I am curious how
> this is done in a unipartite context in MEXPLORER -- are parallel edges
> merged for calculation of the net reaction between species?

For the visualization, I merge parallel edges that go in the same
direction. For the analysis, I'm using a different approach: I create a
temporary copy of the graph in which I delete all edges except those for
the very fast reactions. This causes the graph to fall apart into
several strongly connected components which are detected by graph-tool:

https://graph-tool.skewed.de/static/doc/autosummary/graph_tool.topology.label_components.html

The subgraphs with two or more vertices indicate which species belong to
the fast chemical cycles.
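The filtering step described above can be sketched in pure Python as follows. This is a stand-in for illustration, not the MEXPLORER code: it uses a small Kosaraju implementation instead of graph-tool's label_components, and the edge list, the rates (arbitrary units), and the threshold are hypothetical.

```python
def fast_cycles(edges, threshold):
    """edges: (source, target, rate). Return the strongly connected
    components with two or more species after dropping slow edges."""
    graph = {}
    for u, v, rate in edges:
        graph.setdefault(u, [])
        graph.setdefault(v, [])
        if rate >= threshold:  # keep only the very fast reactions
            graph[u].append(v)

    # Kosaraju, pass 1: record vertices in order of DFS completion.
    order, seen = [], set()

    def visit(u):
        seen.add(u)
        for v in graph[u]:
            if v not in seen:
                visit(v)
        order.append(u)

    for u in graph:
        if u not in seen:
            visit(u)

    # Pass 2: DFS on the reversed graph in decreasing completion order.
    reverse = {u: [] for u in graph}
    for u in graph:
        for v in graph[u]:
            reverse[v].append(u)
    components, assigned = [], set()

    def collect(u, comp):
        assigned.add(u)
        comp.add(u)
        for v in reverse[u]:
            if v not in assigned:
                collect(v, comp)

    for u in reversed(order):
        if u not in assigned:
            comp = set()
            collect(u, comp)
            components.append(comp)
    return [c for c in components if len(c) >= 2]


# Hypothetical unipartite edges with made-up rates.
EDGES = [("NO", "NO2", 1e3), ("NO2", "NO", 5e2), ("O3", "O", 2e2),
         ("O", "O3", 8e2), ("NO2", "HNO3", 1e-2), ("O3", "O2", 1e-1)]
cycles = fast_cycles(EDGES, threshold=1.0)
print(cycles)
```

The recursive traversal is fine for a toy graph; a large mechanism would call for an iterative version or a library routine such as the one linked above.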

@obin1
Member

obin1 commented Aug 15, 2023

Great, I started a new branch! I was working in one of the flux/family branches, but will move the code to the new branch to isolate the graph generation.

@yantosca yantosca added the future development Items that will be worked on in the future label Jun 14, 2024