Skip to content
This repository has been archived by the owner on Feb 10, 2023. It is now read-only.

Enhancements to athena-cli #36

Open
wants to merge 21 commits into
base: master
Choose a base branch
from
Open

Conversation

bryanck
Copy link

@bryanck bryanck commented Sep 2, 2018

Hi, we are using athena-cli across our company, and made a few enhancements based on feedback from users. We'd love to get your feedback on these changes and hopefully merge them in.

Some changes include:

  • Support different output formats in shell mode
  • Fixes to character set handling
  • Iterate over results where possible to reduce memory use
  • allow lower case commands to be multiline
  • added an option to truncate the width of the aligned format and default to line wrap
  • a couple of minor bug fixes

athena_cli.py Outdated
row_count = len(results['ResultSet']['Rows'])
if headers and len(results['ResultSet']['Rows']) and results['ResultSet']['Rows'][0]['Data'][0].get('VarCharValue', None) == headers[0]:
row_count -= 1 # don't count header
#tabulate() only supports ascii
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this true for Python 3?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to not be an issue in Python 3, I refactored this a little though and removed the comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to suggest athena CLI completely drop support for Python 2.7 because cmd2, it's major dependency, has dropped support as of the end of last month. https://github.com/python-cmd2/cmd2#python-27-support-is-eol

So the charset problem might be resolved by just reverting the earlier hack. See #34

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using Python 3, so it's fine by us. I'll update the PR with some encoding cleanup and put in a python 3 shebang

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and made the changes, let me know what you think.

athena_cli.py Outdated
@@ -77,7 +93,7 @@ def execute(self, statement):
time.sleep(0.2) # 200ms

if status == 'SUCCEEDED':
output_results(self.athena, self.format, execution_id, sys.stdout)
output_results(self.athena, self.format, execution_id, sys.stdout, False)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For boolean parameters I would include the paramater name eg ..., is_shell=False).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed. Done.

athena_cli.py Outdated
@@ -223,7 +242,7 @@ def default(self, line):

if status == 'SUCCEEDED':
process = subprocess.Popen(self.pager, stdin=subprocess.PIPE)
row_count = output_results(self.athena, self.format, self.execution_id, process.stdin)
row_count = output_results(self.athena, self.format, self.execution_id, process.stdin, True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ibid.

Copy link
Contributor

@satterly satterly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I don't work for the Guardian anymore but I'll see if I can get someone to merge this (with a few minor changes).

@bryanck
Copy link
Author

bryanck commented Sep 4, 2018

Cool, thanks for the feedback and consideration

athena_cli.py Outdated
@@ -1,3 +1,4 @@
#!/usr/bin/env python3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of adding a shebang here you should add "python_requires" option to setup.py. See https://packaging.python.org/guides/distributing-packages-using-setuptools/#python-requires

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I'll check that in in a few

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set the python version requirement to 3.5 and also set a cmd2 version requirement

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line can be removed. It's ignored anyway as this is a python console script not a shell script.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I removed it

athena_cli.py Outdated
self.athena = athena
self.dbname = db
self.format = format
self.format = 'CSV' if format is None else format
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more idiomatic way of writing this is ...

self.format = format or 'CSV'

However, why did you remove the default format from the method signature?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default wasn't being used anymore, it is always set, so it felt redundant having the default

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the code as you suggested...

athena_cli.py Outdated

class AthenaShell(cmd.Cmd, object):

multilineCommands = ['WITH', 'SELECT', 'ALTER', 'CREATE', 'DESCRIBE', 'DROP', 'MSCK', 'SHOW', 'USE', 'VALUES']
multiline_commands = ['WITH', 'SELECT', 'ALTER', 'CREATE', 'DESCRIBE', 'DROP', 'MSCK', 'SHOW', 'USE', 'VALUES', 'with', 'select', 'alter', 'create', 'describe', 'drop', 'msck', 'show', 'use', 'values']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it necessary to repeat the list of valid commands as both upper- and lowercase?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cmd2 seems to be case-sensitive with this feature, i.e. without the lower case commands in the list, then lower case statements aren't multiline

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be rewritten as ...

    multiline_commands = ['WITH', 'SELECT', 'ALTER', 'CREATE', 'DESCRIBE', 'DROP', 'MSCK', 'SHOW', 'USE', 'VALUES']
    multiline_commands.extend([c.lower() for c in multiline_commands])

... to avoid the possibility of missing the fact you need to add the command twice if new commands are added.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's better. Checked in.

@@ -134,6 +158,7 @@ def do_help(self, arg):
help_output = """
Supported commands:
QUIT
EXIT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do EXIT and QUIT do exactly the same thing?

Copy link
Author

@bryanck bryanck Sep 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, "exit" is supported by other CLIs (like the presto-cli) so some users liked to have it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

athena_cli.py Outdated
if first_row == headers:
next(rows)

return (headers, rows)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This return has made the code after this try/exception block unreachable.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. This should be fixed now

@@ -30,5 +30,6 @@
keywords='aws athena presto cli',
classifiers=[
'Topic :: Utilities'
]
],
python_requires='>=3.5'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are failing because they are being run against Python 2.7 and 3.4 but this python_requires restricts installation to Python 3.5+.

You need to update the .travis.yml file like so ...

python:
  - "3.5"
  - "3.6"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked in a fix

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants