Refactor API for data upload #44

Open
snekiam opened this issue Feb 13, 2020 · 3 comments
Labels
enhancement New feature or request

Comments

@snekiam
Member

snekiam commented Feb 13, 2020

Right now we hit the /new_data/wakeword endpoint for both wake word and non wake word recordings - this is not ideal, since the URL is not intuitive. It should either be broken up into multiple endpoints (probably better) or renamed to something like audiodata.

@snekiam snekiam added the enhancement New feature or request label Feb 13, 2020
@mfekadu
Member

mfekadu commented Feb 15, 2020

Thanks for the input @snekiam !

So far, the plan has been to have these two endpoints for uploading new data to the nimbus database:

  1. nimbus.com/new_data/wakeword - for the metadata of the wake word audio sample
  2. nimbus.com/new_data/phrases - for the question/answer data collection

I am open to learning and discussing better ways to organize the endpoints because, I agree, it is important for our API to feel intuitive.

Let's consider how other software engineers in our industry have solved similar problems.

For example, SoundCloud has an API reference for uploading tracks (very similar to what we are doing).

The Python client for their API would look like this (the body of the request could include a lot more metadata):

import soundcloud

# create a client object with an access token
client = soundcloud.Client(access_token='YOUR_ACCESS_TOKEN')

# upload an audio file; the context manager closes it after the request
with open('file.mp3', 'rb') as audio_file:
    track = client.post('/tracks', track={
        'title': 'This is my sound',
        'asset_data': audio_file
    })

# print the track link (the print() call form works on Python 2 and 3)
print(track.permalink_url)

We can probably assume a Track is an object in their database.

By this logic, perhaps you are right that we should create an endpoint to match the database entity:

class AudioSampleMetaData(Base):
    __tablename__ = 'AudioSampleMetaData'
    id = Column(Integer, primary_key=True)
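
As a rough illustration of "an endpoint that matches the database entity", here is a minimal sketch that derives a URL path from an entity class name. The helper name `endpoint_for`, the `/new_data` prefix as a constant, and the snake_case convention are all assumptions made for this example, not something the project has decided.

```python
import re

# Assumed URL prefix, taken from the endpoints discussed in this thread.
NEW_DATA_PREFIX = "/new_data"


def endpoint_for(entity_name: str) -> str:
    """Build an upload path from an entity class name (hypothetical helper).

    The class name is converted from CamelCase to snake_case by inserting
    an underscore before every capital letter except the first one.
    """
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", entity_name).lower()
    return f"{NEW_DATA_PREFIX}/{snake}"


if __name__ == "__main__":
    print(endpoint_for("AudioSampleMetaData"))
```

Under this convention the entity above would be served at /new_data/audio_sample_meta_data. A shorter hand-picked name would work just as well; the point is only that the URL names the thing being stored.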

What are your thoughts on the evidence above @snekiam ?

@snekiam
Member Author

snekiam commented Feb 16, 2020

Thanks for walking me through your thought process - it makes more sense when I think about it alongside uploading phrase data.

My thought process was that we are hitting an endpoint called 'wakeword' with recordings that are not wake words. I'm not against having just one endpoint (maybe called something like /new_data/audiosample) or even just using wakeword, since it is data for the wake word model. I guess the existing name made me think of having two endpoints, but one would work (and really, wakeword isn't the worst name for it). I'd be okay with leaving it as /new_data/wakeword if other people feel strongly about it. We can talk about this more on Monday if you'd like - my vote would be to rename it to something more generic like /new_data/audiodata or /new_data/audiosample.
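
To make the single-endpoint option concrete, here is a minimal, framework-free sketch. Everything in it is an assumption for illustration: the `ROUTES` table, the `dispatch` and `save_audio_sample` functions, and the `is_wake_word` and `filename` payload fields are hypothetical and not taken from the nimbus code. It shows one generic path handling both kinds of recording, with the old path kept as an alias so existing clients keep working.

```python
from typing import Callable, Dict


def save_audio_sample(payload: dict) -> dict:
    """Stand-in for storing one recording's metadata (hypothetical).

    Whether the recording is a wake word is carried in the payload,
    so the URL no longer has to say it.
    """
    return {
        "stored": True,
        "is_wake_word": bool(payload.get("is_wake_word", False)),
        "filename": payload.get("filename"),
    }


# Assumed route table: the generic name plus the legacy name as an alias.
ROUTES: Dict[str, Callable[[dict], dict]] = {
    "/new_data/audiosample": save_audio_sample,
    "/new_data/wakeword": save_audio_sample,  # legacy alias
}


def dispatch(path: str, payload: dict) -> dict:
    """Look up the handler for a path and call it with the payload."""
    handler = ROUTES.get(path)
    if handler is None:
        return {"stored": False, "error": f"unknown endpoint: {path}"}
    return handler(payload)


if __name__ == "__main__":
    print(dispatch("/new_data/audiosample",
                   {"is_wake_word": False, "filename": "noise.wav"}))
```

Keeping the alias would let the rename land without breaking whatever already posts to /new_data/wakeword; it could be dropped once clients have moved over.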

@mfekadu
Member

mfekadu commented Feb 17, 2020

Agreed. Let's do it.
