[Feature Request] Add capability to export item flow digraph adjacency matrix to CSV #90

Open
wizard1073 opened this issue Dec 2, 2021 · 7 comments

Comments

@wizard1073

With very large digraphs, the structure can be too complex to read and analyze visually through the web interface. Exporting the item flow digraph to CSV would allow it to be imported into tools that automate the analysis. For example, the item flow digraph can be treated as a Design Structure Matrix (DSM), a systems engineering tool that uses topological sorting and grouping of related graph nodes. One tool in particular, from MIT, is a macro-enabled Excel spreadsheet that automates the analysis and partitioning of the DSM, and it is an ideal candidate for importing the CSV version of the digraph from this tool.

[Thank you for developing such a great tool!]

@greeny
Owner

greeny commented Dec 2, 2021

hi, how would you imagine such an export? 🤔 I have pretty much no experience in saving graph structure in CSV, so if you have any suggestions, I'm open to those.

Anyway, import/export for various formats is a planned feature, so this can very much be part of that.

@wizard1073
Author

I expect you have the adjacency data internally in an array, unless you are using a more compact adjacency list (linked list form). A CSV export of the adjacency matrix would have one row for each item in the graph. Making fuel from crude oil would look like this:

Oil,0,0,0
Fuel,1,0,0
Polymer Resin,1,0,0

The column names are unneeded because they are identical to the row names, in the same order. There is a "1" in the Oil column of every row that Oil "causes", here Fuel and Polymer Resin. I have suppressed the machinery, because it is the "physical architecture", while the item flow by itself is the "logical" or "behavioral" architecture, which is the focus of analysis. A similar approach could be taken for the physical architecture (machines and splitters/mergers), but there would need to be multiple "layers" of data to fully represent that complexity.
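To make the format concrete, here is a minimal sketch of how such a file could be written. The `ItemEdge` shape and the `adjacencyCsv` and `csvField` names are made up for illustration and are not taken from the tool's code.

```typescript
// Hypothetical edge shape: `from` is consumed to make `to` (e.g. Oil -> Fuel).
interface ItemEdge {
  from: string;
  to: string;
}

// Quote a CSV field only when it contains a comma, a quote, or a newline.
function csvField(value: string): string {
  return /[",\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// One row per item; the column order equals the row order, so no header row
// is written. A 1 in row r, column c means item c is consumed to make item r.
function adjacencyCsv(items: string[], edges: ItemEdge[]): string {
  const index = new Map<string, number>();
  items.forEach((name, i) => index.set(name, i));

  const matrix = items.map(() => items.map(() => 0));
  for (const { from, to } of edges) {
    const row = index.get(to);
    const col = index.get(from);
    if (row === undefined || col === undefined) {
      throw new Error(`Unknown item in edge ${from} -> ${to}`);
    }
    matrix[row][col] = 1;
  }

  return items
    .map((name, r) => [csvField(name), ...matrix[r]].join(","))
    .join("\n");
}

const csv = adjacencyCsv(
  ["Oil", "Fuel", "Polymer Resin"],
  [
    { from: "Oil", to: "Fuel" },
    { from: "Oil", to: "Polymer Resin" },
  ],
);
console.log(csv);
```

For the three items above, this prints the same three rows as the example: `Oil,0,0,0`, `Fuel,1,0,0`, and `Polymer Resin,1,0,0`.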

Since we have at most ~90 items, a full adjacency matrix would have ~8100 entries, most of which would be zeros. A sparse representation is smaller, because it eliminates all of the zeros, but must be converted to get into Excel. At most, I estimate a full CSV file with 90 items and ~5% non-zero entries should be less than 200kB.
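As a rough check on those sizes, the arithmetic can be sketched like this. The 15-character average name length is an assumption of mine, not a measured value, and both function names are made up.

```typescript
// Estimated size in bytes of a dense 0/1 adjacency CSV with no header row.
// Each row holds a name, then n single-digit cells each preceded by a comma,
// then a newline. avgNameLength is an assumed average, not a measured one.
function denseCsvBytes(n: number, avgNameLength: number): number {
  return n * (avgNameLength + 2 * n + 1);
}

// Estimated size of a sparse "from,to" edge list that keeps only the
// non-zero entries: two names, one comma, and one newline per edge.
function sparseCsvBytes(n: number, density: number, avgNameLength: number): number {
  const edges = Math.round(n * n * density);
  return edges * (2 * avgNameLength + 2);
}

console.log(denseCsvBytes(90, 15));        // 17640
console.log(sparseCsvBytes(90, 0.05, 15)); // 12960
```

Under these assumptions the dense file for 90 items comes to roughly 18 kB, well inside the 200kB bound, so the full matrix is affordable even though the sparse form is smaller.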

@greeny
Owner

greeny commented Dec 2, 2021

I currently have an array of nodes with IDs and an array of edges with a [from - to] definition. Keep in mind that an edge can go from A to B and from B to A at the same time (e.g. the recycled plastic/rubber loop). I'm not sure if that can be put into a matrix.
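For what it's worth, a pair of opposite edges does fit in a matrix: A to B and B to A land in two different cells. A small sketch, where the `GraphNode` and `GraphEdge` shapes are guesses at the structure described above rather than the tool's real types:

```typescript
// Hypothetical shapes; the real node and edge types in the tool may differ.
interface GraphNode {
  id: number;
  label: string;
}
interface GraphEdge {
  from: number;
  to: number;
}

// matrix[r][c] is 1 when there is an edge from node c to node r.
function toMatrix(nodes: GraphNode[], edges: GraphEdge[]): number[][] {
  const position = new Map<number, number>();
  nodes.forEach((node, i) => position.set(node.id, i));

  const matrix = nodes.map(() => nodes.map(() => 0));
  for (const edge of edges) {
    const r = position.get(edge.to);
    const c = position.get(edge.from);
    if (r === undefined || c === undefined) continue; // skip dangling edges
    matrix[r][c] = 1;
  }
  return matrix;
}

const loop = toMatrix(
  [
    { id: 10, label: "Recycled Plastic" },
    { id: 20, label: "Recycled Rubber" },
  ],
  [
    { from: 10, to: 20 },
    { from: 20, to: 10 },
  ],
);
console.log(loop); // [[0, 1], [1, 0]]
```

The two edges occupy the cells on either side of the diagonal, so neither overwrites the other.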

Also, you shouldn't count items, but recipes. Each node is one recipe. We have ~150 recipes, so technically you can go over ~22k entries, which is quite a lot imo.

@wizard1073
Author

If I understand correctly, you have the adjacency matrix of recipes, and you have a definition for [from - to], which can be exported as is or transposed (both forms are valid). You can have entries in both the upper and lower triangle. That is an indication of a feedback loop, which can occur in production lines.
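Both points can be sketched in a few lines. The function names are made up, and `feedbackPairs` only reports direct two-node loops; longer cycles would need a proper cycle or partitioning analysis such as the DSM tool performs.

```typescript
// Swap rows and columns of a square matrix: result[c][r] = matrix[r][c].
function transpose(matrix: number[][]): number[][] {
  return matrix.map((_, c) => matrix.map((row) => row[c]));
}

// Report index pairs (r, c) with r < c where both matrix[r][c] and
// matrix[c][r] are non-zero, i.e. a direct two-node feedback loop.
function feedbackPairs(matrix: number[][]): Array<[number, number]> {
  const pairs: Array<[number, number]> = [];
  for (let r = 0; r < matrix.length; r++) {
    for (let c = r + 1; c < matrix.length; c++) {
      if (matrix[r][c] !== 0 && matrix[c][r] !== 0) {
        pairs.push([r, c]);
      }
    }
  }
  return pairs;
}

const m = [
  [0, 1, 0],
  [1, 0, 0],
  [0, 1, 0],
];
console.log(transpose(m));     // [[0, 1, 0], [1, 0, 1], [0, 0, 0]]
console.log(feedbackPairs(m)); // [[0, 1]]
```

Transposing does not change which pairs are reported, which is one way to see that either orientation carries the same information.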

I would like to understand how to get the item flow data from your internal data representation. What gets drawn onscreen has machines as vertexes and flow rates of each item as edges. The vertexes have additional information: number of machines, machine type, and efficiency. When we suppress calculating mergers and splitters, the graph resolves to just calculated item flow. We only need to export the edge data from this simplified form of the graph.

By saying "each node is one recipe", does that mean you are storing the input rates and output rates of the recipe? Do you store the numbers after the desired production rates are calculated? It sounds like there would need to be a mapping process that converts from recipe-based vertexes to item vertexes. Each recipe has a different name, but some output names are just variants and map to the base name. Once the name is reduced to the base name, the adjacency matrix rows can be created for each recipe output using the same inputs for each row.
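The mapping I have in mind could look something like the sketch below. The `Recipe` shape, the `baseName` alias table, and the recipe names in the example are all invented for illustration; I do not know how the tool actually stores variants.

```typescript
// Hypothetical recipe shape: names of consumed and produced items only.
interface Recipe {
  name: string;
  inputs: string[];
  outputs: string[];
}

// Turn recipe vertexes into item-to-item edges. `baseName` maps a variant
// name to its base name; names without an entry are used unchanged.
function itemEdges(
  recipes: Recipe[],
  baseName: Map<string, string>,
): Array<[string, string]> {
  const canonical = (name: string): string => baseName.get(name) ?? name;
  const seen = new Set<string>();
  const edges: Array<[string, string]> = [];

  for (const recipe of recipes) {
    for (const output of recipe.outputs) {
      for (const input of recipe.inputs) {
        const from = canonical(input);
        const to = canonical(output);
        const key = JSON.stringify([from, to]);
        if (!seen.has(key)) {
          seen.add(key); // keep each item pair only once
          edges.push([from, to]);
        }
      }
    }
  }
  return edges;
}

const edges = itemEdges(
  [
    { name: "Fuel", inputs: ["Oil"], outputs: ["Fuel", "Polymer Resin"] },
    { name: "Fuel (variant)", inputs: ["Oil"], outputs: ["Fuel (variant)"] },
  ],
  new Map([["Fuel (variant)", "Fuel"]]),
);
console.log(edges); // [["Oil", "Fuel"], ["Oil", "Polymer Resin"]]
```

Every output of a recipe gets an edge from every input of that recipe, and the variant collapses onto the base item instead of adding a second row.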

If I understand this correctly, if all recipes were utilized in one graph, then the adjacency matrix for the item flow would have ~22,500 entries, most of which are zero, so now the filesize is at most ~1MB. The MIT tool can handle up to 250 items (62,500 entries), so this is still feasible. This is also why we look at automation tools to handle such complex production setups!

@greeny
Owner

greeny commented Dec 3, 2021

How exactly the tool produces the final visualisation is a bit complex. If you have discord, you can add me there and we can talk about it (greeny#4945). However, I'll try to put the basics here as well:

  • the tool uses an API to calculate the result. It sends all the data to the API, and the API returns a result that's just the nodes that you see on screen. For each node it returns which recipe it uses and how many buildings (or, if it's a product/byproduct node, which item and how many are produced).
  • the tool then connects those nodes, basically by calculating how much each recipe produces and consumes based on the given number of buildings, and then distributes the items between all viable targets.
  • in the end we end up with nodes and edges that both have metadata attached to them. For nodes it's the number of buildings or the number of items produced; for edges it's how many items they carry.
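As a rough illustration of the second step only, here is one way items of a single type could be spread over their consumers. This is a simple greedy fill that I am assuming for the sake of the example; the actual distribution logic in the tool may work quite differently, and all names here are hypothetical.

```typescript
// Hypothetical shapes for one item type: what each node offers or needs.
interface Supply {
  node: string;
  amount: number;
}
interface Demand {
  node: string;
  amount: number;
}
interface Flow {
  from: string;
  to: string;
  amount: number;
}

// Greedy fill (an assumed strategy): serve each consumer in order, drawing
// from producers in order until the demand is met or supply runs out.
function distribute(supplies: Supply[], demands: Demand[]): Flow[] {
  const remaining = supplies.map((s) => ({ ...s })); // do not mutate input
  const flows: Flow[] = [];

  for (const demand of demands) {
    let needed = demand.amount;
    for (const supply of remaining) {
      if (needed <= 0) break;
      const amount = Math.min(needed, supply.amount);
      if (amount <= 0) continue;
      flows.push({ from: supply.node, to: demand.node, amount });
      supply.amount -= amount;
      needed -= amount;
    }
  }
  return flows;
}

const flows = distribute(
  [{ node: "A", amount: 60 }],
  [
    { node: "B", amount: 40 },
    { node: "C", amount: 20 },
  ],
);
console.log(flows);
```

Each resulting `Flow` corresponds to one edge with its "how many items" metadata, which is exactly the data an export would need.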

@wizard1073
Author

I will see if you are online on Friday evening (US east coast time) and chat more then. Thank you!

@wizard1073
Author

wizard1073 commented Dec 4, 2021

I went through the code extensively. Very clean code, practically self-documenting!

It looks like the information stored in RecipeNode (src/Tools/Production/Result/Nodes/RecipeNode.ts) needs to be saved to a central data store so that a separate process (export adjacency matrix CSV) can parse the array, create the adjacency matrix from the ingredients field in each RecipeNode, and export it. If I understand correctly, this is the subset of recipes the solver selected from all possible/allowed recipes sent to the solver, multiplied by scaling factors (per recipe) to make the production chain meet the desired rates of the user-selected products. It also looks like this data currently only goes to the graph processor.
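Here is a sketch of what I mean. `StoredRecipeNode` is a simplified stand-in that I made up; it is not the real `RecipeNode` class, and the field names, the store, and the rates in the example are assumptions.

```typescript
// Hypothetical, simplified stand-in for the data a RecipeNode holds.
interface StoredRecipeNode {
  product: string;
  scale: number; // per-recipe scaling factor chosen by the solver
  ingredients: { item: string; ratePerUnit: number }[];
}

// Minimal central store that a separate export step can read from.
class RecipeNodeStore {
  private readonly nodes: StoredRecipeNode[] = [];
  add(node: StoredRecipeNode): void {
    this.nodes.push(node);
  }
  all(): readonly StoredRecipeNode[] {
    return this.nodes;
  }
}

// Build CSV rows whose cells hold the scaled ingredient rate instead of 0/1.
function weightedRows(store: RecipeNodeStore): string[] {
  const items: string[] = [];
  const remember = (name: string): void => {
    if (items.indexOf(name) === -1) items.push(name);
  };
  for (const node of store.all()) {
    node.ingredients.forEach((ingredient) => remember(ingredient.item));
    remember(node.product);
  }

  const matrix = items.map(() => items.map(() => 0));
  for (const node of store.all()) {
    const row = items.indexOf(node.product);
    for (const ingredient of node.ingredients) {
      const col = items.indexOf(ingredient.item);
      matrix[row][col] += ingredient.ratePerUnit * node.scale;
    }
  }
  return items.map((name, r) => [name, ...matrix[r]].join(","));
}

const store = new RecipeNodeStore();
store.add({ product: "Fuel", scale: 2, ingredients: [{ item: "Oil", ratePerUnit: 60 }] });
store.add({ product: "Polymer Resin", scale: 1, ingredients: [{ item: "Oil", ratePerUnit: 30 }] });
console.log(weightedRows(store).join("\n"));
```

With these made-up rates the rows are `Oil,0,0,0`, `Fuel,120,0,0`, and `Polymer Resin,30,0,0`. Replacing every non-zero cell with 1 gives the plain adjacency matrix.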

I'm up on discord now if you want to chat outside these comments.

2 participants