Add 'lago ovirt stop/start' #485

Merged
merged 1 commit into from
Mar 20, 2017

nvgoldin
Contributor

Fixes: #463

It seems to do what it is supposed to do.
I tested it against the OST 'basic_suite_master'. After triggering lago ovirt start, the iSCSI storage domain failed to start. On a second round (lago ovirt stop && lago ovirt start), one of the hosts became non-operational and was unable to connect to the storage domain (the other host was fine, and the iSCSI domain came back to life too).

@@ -300,18 +321,12 @@ def _host_is_up():
)
elif host_obj.status == sdk4.types.HostStatus.INSTALL_FAILED:
raise RuntimeError('Host %s installation failed' % h.name)
elif host_obj.status == sdk4.types.HostStatus.NON_RESPONSIVE:
Member

Why did you remove this check?

Contributor Author

Because it failed the run, but after a few seconds the host reported 'UP' in the engine.

Member

So maybe add "allowed_exception" in "testlib.assert_true_within"?

Contributor Author

It's not an exception :/ For better or worse, it is an enum member:

In [3]: from ovirtsdk4 import types

In [4]: type(types.HostStatus)
Out[4]: enum.EnumMeta

In [5]: type(types.HostStatus.NON_RESPONSIVE)
Out[5]: <enum 'HostStatus'>

There is some refactoring we can do here, with #483 or without, but I think we can do it later.
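
To make the point above concrete, here is a minimal sketch of a polling predicate that treats NON_RESPONSIVE as a transient status instead of failing the run. It is not the code merged in this PR: the `HostStatus` class below is a stand-in for `ovirtsdk4.types.HostStatus`, and `host_is_up` is a hypothetical helper name.

```python
import enum


class HostStatus(enum.Enum):
    # Stand-in for ovirtsdk4.types.HostStatus, reduced to the members used here.
    UP = 'up'
    INSTALLING = 'installing'
    INSTALL_FAILED = 'install_failed'
    NON_RESPONSIVE = 'non_responsive'


def host_is_up(name, status):
    """Predicate for a polling loop: True when UP, False to keep waiting."""
    if status == HostStatus.UP:
        return True
    if status == HostStatus.INSTALL_FAILED:
        # Terminal state: polling again will not help.
        raise RuntimeError('Host %s installation failed' % name)
    # NON_RESPONSIVE is a status value, not an exception, so it cannot be
    # whitelisted as one. Treat it as transient and let the caller retry.
    return False
```

A caller that polls with a timeout would then only fail on NON_RESPONSIVE if the host never reaches UP within that timeout.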

@nvgoldin
Contributor Author

Erhm, while testing this I stumbled upon this exception from the SDK:

> lago ovirt stop-vms
Error occured, aborting
Traceback (most recent call last):
  File "/home/ngoldin/src/nvgoldin.github.com/lago/ovirtlago/cmd.py", line 310, in do_run
    self.cli_plugins[args.ovirtverb].do_run(args)
  File "/home/ngoldin/src/nvgoldin.github.com/lago/lago/plugins/cli.py", line 184, in do_run
    self._do_run(**vars(args))
  File "/home/ngoldin/src/nvgoldin.github.com/lago/lago/utils.py", line 495, in wrapper
    return func(*args, **kwargs)
  File "/home/ngoldin/src/nvgoldin.github.com/lago/lago/utils.py", line 506, in wrapper
    return func(*args, prefix=prefix, **kwargs)
  File "/home/ngoldin/src/nvgoldin.github.com/lago/ovirtlago/cmd.py", line 197, in do_ovirt_stop_vms
    prefix.virt_env.engine_vm().stop_all_vms()
  File "/home/ngoldin/src/nvgoldin.github.com/lago/ovirtlago/virt.py", line 277, in stop_all_vms
    [vms_service.vm_service(id).stop() for id in ids]
  File "/home/ngoldin/virtualenv/lago-venv/lib/python2.7/site-packages/ovirtsdk4/services.py", line 30209, in stop
    self._check_action(response)
  File "/home/ngoldin/virtualenv/lago-venv/lib/python2.7/site-packages/ovirtsdk4/service.py", line 129, in _check_action
    Service._raise_error(response, result.fault)
  File "/home/ngoldin/virtualenv/lago-venv/lib/python2.7/site-packages/ovirtsdk4/service.py", line 71, in _raise_error
    raise Error(msg)
Error: Fault reason is "Operation Failed". Fault detail is "[Virtual machine destroy error]". HTTP response code is 400.

But the VM was actually stopped successfully on the engine side.

While looking at the Ansible module earlier, I saw:
https://github.com/ansible/ansible/blob/devel/lib/ansible/module_utils/ovirt.py#L271

def get_entity(service):
    """
    Ignore SDK Error in case of getting an entity from service.
    """
    entity = None
    try:
        entity = service.get()
    except sdk.Error:
        # We can get here 404, we should ignore it, in case
        # of removing entity for example.
        pass
    return entity

@machacekondra - Any idea if this is an expected behaviour?

Maybe the nested call is the issue? (vms_service.vm_service(id).stop())
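
If the error turns out to be spurious, one possible workaround in the spirit of the Ansible helper above is to catch the SDK error per VM and re-raise only when the VM is still running. This is a rough sketch, not what the PR does: `Error` stands in for the SDK's error class, and `is_down` is a hypothetical status check supplied by the caller.

```python
class Error(Exception):
    """Stand-in for the SDK's Error class."""


def stop_all_vms(vm_services, is_down):
    """Stop every VM; fail only for VMs that are still running afterwards.

    vm_services: mapping of VM id -> object with a stop() method.
    is_down: callable(vm_id) -> bool, a hypothetical status check.
    """
    failed = []
    for vm_id, service in vm_services.items():
        try:
            service.stop()
        except Error:
            # The engine may report a fault even though the VM went down,
            # so confirm the actual state before treating it as a failure.
            if not is_down(vm_id):
                failed.append(vm_id)
    if failed:
        raise Error('Failed to stop VMs: %s' % ', '.join(sorted(failed)))
```

Unlike a bare `pass`, this still surfaces VMs that really failed to stop.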

@machacekondra
Member

This is a backend error, not a Python SDK one. Feel free to open a bug on it.

1. Add 'start_all_vms' method.
2. Add 'lago ovirt start/stop'
@nvgoldin
Contributor Author

@machacekondra - I see, thanks.

I removed starting the L2 VMs from lago ovirt start, as this is not fully baked yet: I suspect we first need to assert that all storage domains are active, and only then start the VMs.
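
The ordering suspected above could look roughly like the sketch below: poll until every storage domain reports active, then start the VMs. All names here are hypothetical (`wait_until`, `start_vms_when_storage_active`, the `'active'` status string and the two callables); nothing in it is taken from lago or the SDK.

```python
import time


def wait_until(predicate, timeout=600, interval=3,
               sleep=time.sleep, clock=time.time):
    """Poll predicate() until it returns True or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return
        if clock() >= deadline:
            raise RuntimeError('Timed out after %s seconds' % timeout)
        sleep(interval)


def start_vms_when_storage_active(get_statuses, start_all_vms, **wait_kwargs):
    """Start VMs only once every storage domain reports 'active'.

    get_statuses: callable returning an iterable of status strings.
    start_all_vms: callable that starts the VMs.
    """
    def all_active():
        statuses = list(get_statuses())
        # An empty list means nothing was reported yet, so keep waiting.
        return bool(statuses) and all(s == 'active' for s in statuses)

    wait_until(all_active, **wait_kwargs)
    start_all_vms()
```

If the domains never become active, the VMs are not started and the timeout error is raised instead.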

@gbenhaim - this is ready on my side.

@gbenhaim
Member

ci merge please

@ovirt-infra ovirt-infra merged commit 14f179f into lago-project:master Mar 20, 2017