Reproducibility of RMG #275

keceli · 2012-11-06T05:03:07Z

Below is a sample RMG input file which you can observe that RMG produces different outputs (Final_Model.txt) even on the same machine.

Database: RMG_database

MaxCarbonNumberPerSpecies: 10
MaxOxygenNumberPerSpecies: 0
MaxRadicalNumberPerSpecies: 2

PrimaryThermoLibrary:
END

PrimaryTransportLibrary:
END

ForbiddenStructures:
END

ReadRestart: no
WriteRestart: yes

TemperatureModel: Constant (K) 1250
PressureModel: Constant (atm) 1

InitialStatus:

JP10 (mol/cm3) 1.0
1 C 0 {2,S} {9,S} {10,S}
2 C 0 {1,S} {3,S} {6,S}
3 C 0 {2,S} {4,S}
4 C 0 {3,S} {5,S}
5 C 0 {4,S} {6,S}
6 C 0 {5,S} {7,S} {2,S}
7 C 0 {6,S} {8,S} {10,S}
8 C 0 {7,S} {9,S}
9 C 0 {8,S} {1,S}
10 C 0 {1,S} {7,S}

END

InertGas:
N2 (mol/cm3) 52.67
END

SpectroscopicDataEstimator: off
PressureDependence: off

FinishController:
(1) Goal Conversion: JP10 0.5
(2) Error Tolerance: 0.5

DynamicSimulator: DASSL
Conversions: AUTO
Atol: 1e-18
Rtol: 1e-8

PrimaryKineticLibrary:
END

ReactionLibrary:
END

SeedMechanism:
END

ChemkinUnits:
A: moles
Ea: kcal/mol

This is a very simple input file (no Pdep, QMTP etc) and runs under a few minutes. The final model differs in the kinetics for the first reaction, you can get one of the below:

1. SPC(10)=JP10(1)      7.601e+11         0.07    1.96
   SPC(10)=JP10(1) 7.760e+09         0.31    1.70

1. SPC(10)=JP10(1)      2.328e+10         0.31    1.70
   SPC(10)=JP10(1) 2.534e+11         0.07    1.96

1. SPC(10)=JP10(1)      1.552e+10         0.31    1.70
   SPC(10)=JP10(1) 5.068e+11         0.07    1.96

Tracing back to the root shows that species created by reverse Birad_recombination (Ring_Open) can differ in different runs. Discussions with Josh, Nick, Nathan and Beat all ended up that this is probably related with HashMap's or HashSet's in Java which causes the orders to differ in each run depending on memory allocation.

Replacing all HashMap and HashSet collections with LinkedHashMap and LinkedHashSet in all Java source files seems to be the easy fix for the problem. I have checked some test cases and didn't see any side effect. While Linked collections are known to be slow when you add or remove an element, they are faster in iterations, so I hope that performance of RMG will not be affected much. (http://www.coderanch.com/t/426744/java-programmer-SCJP/certification/Speed-LinkedHashSet-vs-HashSet)

I added a branch named Linked with those changes.

The text was updated successfully, but these errors were encountered:

rwest · 2012-11-06T05:38:06Z

It looks like these aren't just reactions appearing in a different order - they actually have different A factors. Can you explain that?

One of them must be more erroneous than the other, which makes me nervous about arbitarily picking one based on a linked hash map's order. What if we're consistently wrong from one run to the next?

mrharper · 2012-11-06T10:35:25Z

SPC(10)=JP10(1) 7.601e+11 0.07 1.96
SPC(10)=JP10(1) 7.760e+09 0.31 1.70
SPC(10)=JP10(1) 2.328e+10 0.31 1.70
SPC(10)=JP10(1) 2.534e+11 0.07 1.96
SPC(10)=JP10(1) 1.552e+10 0.31 1.70
SPC(10)=JP10(1) 5.068e+11 0.07 1.96

The A factors are multiples of each other, so it appears the degeneracy is different for each run (?)
Looking at the kinetics with n=0.31 and Ea=1.70
The third example is 2x the first; the second is 3x the first
For the kinetics with n=0.07 and Ea=1.96
The third example is 2x the second; the first is 3x the second

rwest · 2012-11-06T19:20:05Z

Thanks Mike. So in all three cases the total is 4 pathways: split either 1 and 3, 3 and 1, or 2 and 2.

@keceli, do you have the verbose comments from these rates, to see if that gives further clues as to what's going on?

This gets various corrections and improvements to the QM Library work. I think this includes all the changes by Nate up to nyee/RMG-Java@c10379a excluding the HashMap / LinkedHashMap change (see ReactionMechanismGenerator#275) Also worth noting - now using the augmented (modified) InChI in the QM library. I suggest you delete any QM libraries you have made thus far, unless you're sure there were no radicals in them.

keceli · 2012-11-26T22:02:04Z

I understand that it is hard to merge the global change of HashMap vs LinkedHashMap before other changes, but so far it is the only way to force RMG to have reproducible output. Is there a plan to merge it at the end, or is there any reason to avoid it?

rwest · 2012-11-26T22:07:30Z

It is the only way we have found so far to force reproducible output, because we do not yet understand why the different pathways above (with different rates) have different degeneracies depending on the whim of a HashSet.
They can't all be right, so I feel there must be a bug. Converting everything to LinkedHashSets will hide, but not fix, the bug. That is my reason to avoid it. Can you help us understand and fix the bug? What do others think?

keceli · 2012-11-27T05:47:36Z

I wish I had a better understanding of this bug, however, I still think leaving RMG with HashMaps is more dangerous since it is the source of the inconsistent outputs of RMG. The linked version picks the 3rd set of the rate constants listed here and I don't know how to judge if it is the right value, or not. I think with a reproducible output, it would be easier to debug the code.

jwallen · 2012-12-07T16:45:29Z

Okay, so it looks like the reaction in question is the homolytic bond fission of one of the bridging C-C bonds. The Birad_recombination family is defined in the bond forming reaction, and in that direction there is only one equivalent bond being formed, so I would expect the degeneracy to be one, not four.

The first problem is that, as a Birad_recombination, the two atoms in the forming/breaking bond can be labeled as (_1, *2) or as (_2, *1). The ensuing degeneracy should be divided by two to account for this (as it is in the R_recombination family), but I don't see that happening anywhere in the code. Thus I think we are double-counting all of our Birad_recombination reactions. A test of RMG with cyclohexane seems to confirm this. (A degeneracy of one or three would then have to be an error in this case.)

The second issue is one of multiple transition states. Since JP-10 is a fused ring system, there are both five-membered ring and six-membered ring transition states that are matched by the template, hence the duplicate reactions in the Chemkin output. Physically there is only one transition state, so I think we probably should not keep both kinetics in this case, but which of the two is the one to keep is not at all obvious.

I am still not entirely sure why RMG sometimes arrives at degeneracies of one and three (or three and one), instead of two and two. Although using linked hash maps seems to give the desired result in this case, I think that might just be luck.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reproducibility of RMG #275

Reproducibility of RMG #275

keceli commented Nov 6, 2012

rwest commented Nov 6, 2012

mrharper commented Nov 6, 2012

rwest commented Nov 6, 2012

keceli commented Nov 26, 2012

rwest commented Nov 26, 2012

keceli commented Nov 27, 2012

jwallen commented Dec 7, 2012

Reproducibility of RMG #275

Reproducibility of RMG #275

Comments

keceli commented Nov 6, 2012

rwest commented Nov 6, 2012

mrharper commented Nov 6, 2012

rwest commented Nov 6, 2012

keceli commented Nov 26, 2012

rwest commented Nov 26, 2012

keceli commented Nov 27, 2012

jwallen commented Dec 7, 2012