Troubleshooting exercise - Consul #66
Add logging to
After adding this, I tried to run consul via the following command:
But I received an insufficient permission error for
To resolve this, I did the following:
After these commands, consul started and I could see logs being created.
After that, we decided to validate the consul configuration file:
After some research I set it to 5, which is the default. This is a timeout multiplier, so it doesn't matter that much (it doesn't relate to the number of peers in any way).
Run consul validate again
Here we updated bootstrap_expect to 3, since we will have 3 consul servers. There's also a deprecation warning, but we decided to leave it for now.
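The validate-and-fix loop above can be sketched roughly as follows. This is only an illustration, not the exercise's actual config: the temporary directory, the `data_dir` value, and the guess that the timeout multiplier is `performance.raft_multiplier` are all assumptions.

```shell
# Sketch only: write a throwaway server config and validate it if consul is installed.
# All paths and values are illustrative assumptions, not the real exercise files.
CFG_DIR="$(mktemp -d)"
CFG_FILE="$CFG_DIR/server.json"

cat > "$CFG_FILE" <<'EOF'
{
  "server": true,
  "data_dir": "/tmp/consul-data",
  "bootstrap_expect": 3,
  "performance": {
    "raft_multiplier": 5
  }
}
EOF

# Validate the whole config directory; skip quietly when consul is not available.
if command -v consul >/dev/null 2>&1; then
  consul validate "$CFG_DIR" || echo "validation reported problems"
else
  echo "consul not on PATH; skipping validation"
fi
```

Re-running `consul validate` after each change gives quick feedback before restarting any agent.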
When comparing consul-config.json files we noticed that one of them had default_policy = allow.
After making these changes, we ran consul agent on all instances
Consul started successfully and we could monitor logs via
At this point, all of the services were started but when we ran
We tried to troubleshoot this via different methods:

Mistake 1: Change to use public IPs.
The IPs looked weird, so we (or I, to be correct) decided to try the public IPs for all instances. This led us into a zone where we were trying different things and combinations and researching ACLs, but this was all pointless. We even went as far as checking the security group in AWS in the hope of finding what was causing the issue. After a few hours, we realised that we should switch back to private IPs, which we did. This didn't resolve anything at first, but we found out that
After this we updated the
We also added the following 2 options to make sure
After setting all of these and restarting instances (
Now that the servers were sorted, we started working on clients.
Adding logs as before and then trying to run the clients. The
Added it: "token": "a15f4b82-4d0f-1a91-4a5b-8b27285dc13d"
On one of the servers, now check
The problem we had is that consul members showed 2 clients, but Client 1 appeared with the status failed and Client 2 with the status alive. If we ran
@berkeli do you have anything else to add to what we tried before registering services?
@RitaGlushkova this was the weird part, even now
I think the main problem was that when we ran
Unfortunately due to rebooting and switching terminals, we lost some of the exact logs. Looking at the documentation we have the following:
We decided to run this command paired with the service.json file, as this is our service destination:
After multiple reboots and finally adding
@berkeli please correct me if I am forgetting something.
Yes, this is correct. Once we had clients listed as members and alive on
To register clients as services, we used the command
This was successfully registering services, but when we ran
I thought it might be due to the ACL token again and tried running the following
@RitaGlushkova I have looked into the ACL tokens this morning and there's a better way of doing it so we don't need to provide it on every command:
After running these 2 commands, we can run consul catalog services (and other commands) without supplying the token flag.
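One mechanism the Consul CLI offers for this is the `CONSUL_HTTP_TOKEN` environment variable, which the CLI reads when no token flag is given. The sketch below may not be the same pair of commands that was run above, and the token value is a placeholder, not a real token.

```shell
# Placeholder value (assumption): substitute a real ACL token here.
export CONSUL_HTTP_TOKEN="00000000-0000-0000-0000-000000000000"

# With the variable exported, CLI commands pick up the token without a token flag.
if command -v consul >/dev/null 2>&1; then
  consul catalog services || echo "could not query the local agent"
  consul members || echo "could not query the local agent"
else
  echo "consul not on PATH; nothing to query"
fi
```

The export only lasts for the current shell session, so it has to be repeated (or added to a shell profile) after a reboot or in a new terminal.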
SUMMARY
Servers:
Clients:
Logging on all nodes:
Thank you @berkeli. I also read about tokens last night, but because my IP address changed, I couldn't test it. Thank you for trying it, and I'm happy it works! The summary looks good.
Is there a reason you always started consul from the command line with
Yes, when I ran it with systemctl it was giving an error
Also a similar error when running
I think it might have been because we didn't run it with sudo. If it had worked, we wouldn't have had to reboot as often either.
You don't need sudo, but it should be consul.service instead of consul, as mentioned in the error message.
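As a rough sketch of what that looks like, assuming the unit is installed under the name `consul.service` as the error message suggests:

```shell
# Unit name taken from the comment above; adjust if the unit file is named differently.
UNIT="consul.service"

if command -v systemctl >/dev/null 2>&1; then
  # Show whether the unit is loaded and running.
  systemctl status "$UNIT" --no-pager || true
  # Show the last 50 log lines the unit wrote to the journal.
  journalctl -u "$UNIT" -n 50 --no-pager || true
  # After a config change, a restart would be:
  # systemctl restart "$UNIT"
else
  echo "systemctl not available on this machine"
fi
```

Managing the agent through the unit means a config change only needs a restart of the service rather than a reboot of the instance.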
Ok, overall great work on getting the expected result :) Couple of things to know and think about:
This way you didn't really need to start the consul agent by hand. You edit the config and just run
The only reason this exercise worked is that you started the agent by hand. See the 2 processes?
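A quick way to check for a duplicate agent is to count the running consul processes. This is a minimal sketch that assumes the binary's process name is exactly `consul`.

```shell
# Count processes whose name is exactly "consul" (assumed process name).
if command -v pgrep >/dev/null 2>&1; then
  count="$(pgrep -x consul | wc -l | tr -d ' ')"
else
  count=0
  echo "pgrep not available; cannot count processes"
fi
echo "consul processes running: $count"

# More than one usually means a hand-started agent is running next to the service.
if [ "$count" -gt 1 ]; then
  ps -ef | grep '[c]onsul'
fi
```

If two agents show up, stopping the hand-started one leaves the systemd-managed agent as the only one using the config.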
I would call this a
This seems like a config option that
https://docs.google.com/document/d/1V6HEu_OcJ3MHH-aHzUfANf06VJa1rPcGHcpBwql7QLA/edit#
Troubleshoot consul as per the doc