Skip to content
This repository has been archived by the owner on Jun 10, 2024. It is now read-only.

Only a single XTDB directory per platform? #471

Open
dwolfson opened this issue Apr 10, 2023 · 14 comments
Open

Only a single XTDB directory per platform? #471

dwolfson opened this issue Apr 10, 2023 · 14 comments

Comments

@dwolfson
Copy link
Member

As I was investigating some locking errors that I received on startup of the lab jupyter notebooks, I noticed that there seems to be only one XTDB directory per platform. In the lab environment there are multiple metadata servers per platform. I can see in the console log that the first server comes up ok - but that the subsequent servers on the platform get a locking error from XTDB...I suspect that we need change this so that there is a separate XTDB directory per server per platform?

@dwolfson
Copy link
Member Author

I guess what it really means is that we have to create a separate xtdb config per server rather than try to share an xtdb config per platform - so the labs need to change.

@mandy-chessell
Copy link
Contributor

The labs have multiple metadata servers starting up on the same platform.

I had a quick look at the code and for an EDN config, the startup creates a temporary file called ./xtdb. It then stores the EDN config into that file and the file (actually a directory) is passed to XTDB.

This is the source file ...
https://github.com/odpi/egeria-connector-xtdb/blob/main/connector/src/main/java/org/odpi/egeria/connectors/juxt/xtdb/repositoryconnector/XtdbOMRSRepositoryConnector.java

This is the code that builds the config file...

if (configProperties.containsKey(XtdbOMRSRepositoryConnectorProvider.XTDB_CONFIG_EDN)) {
                // EDN-style configuration
                try {
                    configFile = File.createTempFile("xtdb", ".edn", new File("./"));
                    String xtdbCfg = (String) configProperties.get(XtdbOMRSRepositoryConnectorProvider.XTDB_CONFIG_EDN);
                    luceneConfigured = xtdbCfg.contains(Constants.XTDB_LUCENE);
                    BufferedWriter writer = new BufferedWriter(new FileWriter(configFile));
                    writer.write(xtdbCfg);
                    writer.close();
                } catch (IOException e) {
                    auditLog.logException(methodName, XtdbOMRSAuditCode.CANNOT_READ_CONFIGURATION.getMessageDefinition(e.getClass().getName()), e);
                    throw new ConnectorCheckedException(XtdbOMRSErrorCode.CANNOT_READ_CONFIGURATION.getMessageDefinition(repositoryName),
                            this.getClass().getName(), methodName, e);
                }
            }

I wonder if we change this line from:

configFile = File.createTempFile("xtdb", ".edn", new File("./"));

to

configFile = File.createTempFile("xtdb", ".edn", new File("./" + serverName + "/"));

then the config files for each server will be segregated into their own directory.

@cmgrote
Copy link
Member

cmgrote commented Apr 11, 2023

the startup creates a temporary file called ./xtdb. It then stores the EDN config into that file and the file (actually a directory) is passed to XTDB.

File.createTempFile() creates a random file using the input parameters as a prefix and suffix — it's not a directory. According to the docs, it is also guaranteed that "Neither this method nor any of its variants will return the same abstract pathname again in the current invocation of the virtual machine." (See: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/io/File.html#createTempFile(java.lang.String,java.lang.String,java.io.File))

I'm interpreting that to mean that the only way to get a conflict would be to have multiple JVMs that happen to create the same random temporary filename (though my guess would be that even there the likelihood of this is probably relatively small?)

Nonetheless, perhaps it makes sense to use the server name as the prefix, rather than the literal xtdb.

@mandy-chessell
Copy link
Contributor

My bad - "random" should be "temporary"

The problem is that we have two servers, each requiring their own repository, running in the same JVM. They try to use the same config and they intefere with one another.

@dwolfson
Copy link
Member Author

Would it also then be true that each metadata server needs its own Postgres database?

@refset
Copy link

refset commented Apr 11, 2023

@dwolfson in case it helps: the XTDB JDBC module implementation currently assumes a single hard-coded table name called tx_events, so for separate XTDB instances (i.e. different tx-log & doc-store) you would at least have to use separate schemas (e.g. as per this thread), if not separate logical databases or physical servers.

@dwolfson
Copy link
Member Author

@refset thanks for the suggestion. Need to think about what might make the most sense once we have a firm notion of how many metadata servers there would be. It probably does make sense to use different Postgres schemas for different metadata servers on the same Egeria platform..but need to think about it.

@planetf1
Copy link
Member

It appears to me that each egeria repository server should have some level of isolation in terms of it's repo - ie it should have it's 'own' xtdb repository (mostly think tx log & doc store). That could be unique postgres deployments, or schemas.

So this mostly seems to relate to configuration?

Where we have multiple replicas of the same server (ie 3 replicas of cocoMDS2 for load sharing/redundancy) then they should of course SHARE the repositories

On the temp file - it seems as if the code should be fine, though I'd be inclined to still put it in a server specific directory.. but actually there's more..... in a k8s environment we are trying to minimize persistence storage. So I do also assume that this file is only a temporary file - doesn't matter if it goes away when server is restarted

@dwolfson
Copy link
Member Author

@planetf1 with regards to an HA configuration, if you have multiple replicas of the same server for HA reasons then you most likely want to attach each server to a different database to enable database redundancy as well?

@planetf1
Copy link
Member

Each egeria server replica (ie instances of cocoMDS2 etc) has a common configuration, and will use the same back end persistent stores, indeed this is required as all should behave the same

One way of achieving this is simply to have a common 'service' for the postgres backend - and let it's implementation worry about failover/scaling -- for example with CrunchData's postgres operator it will automatically use the current master.

cmgrote added a commit to cmgrote/egeria-connector-xtdb that referenced this issue Apr 20, 2023
@cmgrote
Copy link
Member

cmgrote commented Apr 20, 2023

@dwolfson can you please re-test with the latest snapshot, once #474 is merged? This changes the temporary EDN-based config file to be server-specific.

I primarily want to be certain it doesn't break anything (I don't have a local EDN-based config and broader setup to quickly test against); but if you can also confirm whether it fixes the multi-server config conflict even better 😁

cmgrote added a commit that referenced this issue Apr 20, 2023
#471 creates separate temporary config file per server
@dwolfson
Copy link
Member Author

Thanks @cmgrote - I'll try to test tomorrow..

@dwolfson
Copy link
Member Author

@cmgrote - I configured the labs with the in-memory xtdb and there were no errors (tested against 4 notebooks). I looked in the pods to see where the XTDB files are - they don't seem to be in /deployments/data/servers/.. anymore? Where did they go?

I tried using Postgres for the xtdb library - but it gives a Lucene error - I think the same one that @planetf1 has been chasing..

@planetf1
Copy link
Member

@dwolfson You might want to retry the postgres/xtdb today. CTS test was successful last night.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants