Proposal: Layered Model of Singer #19
Comments
@dmosorast thanks for putting this together. This would be worth putting into a blog post or formal document somewhere! I really like this framing and this articulates how we've thought about Singer overall. We've always taken the approach when working on the Meltano SDK that any tap and target written with it should always be able to fall back to the purest form of the spec and should work on the command line with the pipe operator sending data. We've also had a lot of discussions that anything that can flow "downward" to a lower layer absolutely should. So instead of keeping something specific to Meltano or an orchestration framework, if it's appropriate to put in the spec, then that's where it should go. I'd love for this to exist as a doc on either singer.io or some other neutral website!
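To make "the purest form of the spec" concrete, here is a minimal sketch of a tap writing Singer messages as one JSON object per line, and a target reading them back. The SCHEMA, RECORD, and STATE message types come from the Singer spec; the `users` stream, its fields, and the function names `emit`, `run_tap`, and `run_target` are made up for illustration and are not part of any library.

```python
import io
import json
import sys


def emit(message, out=sys.stdout):
    """Write one Singer message as a single line of JSON."""
    out.write(json.dumps(message) + "\n")


def run_tap(out=sys.stdout):
    """Hypothetical tap: emits a schema, one record, and a state bookmark."""
    emit(
        {
            "type": "SCHEMA",
            "stream": "users",  # made-up stream name
            "key_properties": ["id"],
            "schema": {
                "type": "object",
                "properties": {
                    "id": {"type": "integer"},
                    "name": {"type": "string"},
                },
            },
        },
        out,
    )
    emit({"type": "RECORD", "stream": "users", "record": {"id": 1, "name": "Ada"}}, out)
    emit({"type": "STATE", "value": {"users": 1}}, out)


def run_target(lines):
    """Hypothetical target: collects records and remembers the last state."""
    records, state = [], None
    for line in lines:
        message = json.loads(line)
        if message["type"] == "RECORD":
            records.append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
    return records, state
```

If the two halves lived in separate scripts (say, a hypothetical `tap.py` writing to stdout and `target.py` reading `sys.stdin`), they could be connected with nothing more than a shell pipe, which is the fallback behavior described above.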
@dmosorast - I wonder what you think of something between "Spec" and "Standards and Best Practices" - where taps and targets can advertise certain capabilities and best practices they have implemented. The easy example is for a tap to advertise whether it can run in discovery mode ( I feel like there's a subset of best practice behaviors we want to promote for the Singer community, and then also importantly we want to let taps and targets declare that they adhere to them so that orchestrators and their paired tap or target can rely on that behavior. These could be declared in repo metadata, in the repo's What do you think of us breaking "Spec" into two tiers - "Required" and "Optional" behaviors - and then discussing further about declaring/detecting the optional behaviors in #8? And perhaps also, we as a Working Group may occasionally make proposals to promote "best practices" (recommended but not part of spec) to "optional capabilities" (part of spec, but not strictly required). Thoughts?
We'll continue to iterate here, but I've added a "which layers" prompt to the SIP template: 4b5e835
I've left off "Libraries and Frameworks" (probably external to working group) as well as "Tooling/Orchestration/UX/Infrastructure". When it applies to community guidance and/or best practices overall, I think general documentation for both of these layers could be bucketed under "Singer documentation (Other)" or "Singer best practices and other guidance".
That makes sense. I have some reservations about the topic of advertising capabilities, since that implies that there will be enough of them that layers need to be built on top to abstract over it and wrangle the whole thing. As I stated up here, simplicity has been a pretty big part of Singer in general, and some of these capabilities definitely feel more related to the orchestration rather than a standard. That said, we have done a lot of work to make That way it'd be like: I wonder about the "other" and/or whether it should just be documentation, but that's a different topic 😅
Working on this in my fork of the repository here: https://github.com/dmosorast/Singer-Working-Group/tree/19-layered-model-of-singer
When we talk about Singer, it's important to be sure that we're talking about the same things. One way to accomplish that is to organize around language. This proposal is to define a known set of "buckets" that we can use to help us as a group organize what is a Spec change, and what is a standard. What follows is my gut feeling of how these divisions could work, and is by no means completely correct or complete. 😄
Without further ado, I'd like to propose...
The Layered Model of Singer
Looking at Singer, there are a lot of design choices baked in around a core value of simplicity. The reasoning for this has always been to give developers the freedom and flexibility to make it what they want, since all data sources are vastly different, and one cannot effectively design for all future cases in the ELT space.
As we discuss evolving Singer as a whole and as a community, it will be important to take care not to lose the core value of simplicity, which has left room for best practices to be invented, like those encoded in the existing frameworks/libraries.
Approaching the stack as a layered model can give us a means of aligning on where an idea fits, and a tool to iterate organically to "upgrade" concepts from a framework feature to a codified standard to a spec change if it makes sense.
Layer 1: Specification
This is the current specification as it stands. Some principles of features here:
Layer 2: Standards and Best Practices
These are being tracked in #10, but as far as the initial design decisions of Singer go, this conceptually includes things like Command-Line Arguments, Catalog, Metadata Keys/Custom Metadata, Standard State Keys, etc.
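As a rough illustration of what a Layer 2 convention looks like in practice, here is a minimal sketch of a tap's command-line arguments. The flag names `--config`, `--state`, `--catalog`, and `--discover` follow the convention commonly used by existing taps as I understand it; treat them, and the `parse_args` helper itself, as assumptions for illustration rather than anything this proposal defines.

```python
import argparse


def parse_args(argv=None):
    """Parse the conventional tap arguments (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Hypothetical Singer tap")
    # Connection settings and credentials for the source.
    parser.add_argument("--config", required=True, help="path to a JSON config file")
    # Bookmarks from a previous run, so the tap can resume incrementally.
    parser.add_argument("--state", help="path to a JSON state file")
    # Stream and field selection, usually produced by discovery and then edited.
    parser.add_argument("--catalog", help="path to a JSON catalog file")
    # Discovery mode: describe available streams instead of syncing data.
    parser.add_argument("--discover", action="store_true", help="print a catalog and exit")
    return parser.parse_args(argv)
```

None of these flags change the messages a tap writes to stdout, which is one way to see why they sit in a layer above the specification itself.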
Some principles here:
Layer 3: Libraries and Frameworks
This is where we get into the language-specific stuff. Libraries like singer-python and/or singer-clojure, or frameworks like the Meltano SDK, take the standards plus best practices and encode them in a way that makes sense for the patterns of each language. This is also a good place to be a test bed for things that might become standards.
Principles:
Layer 4: Tooling/Orchestration/UX/Infrastructure
I'm not quite sure about this one, but these are things that don't seem to fit in the other layers, and kind of make up an analog to the "Application Layer" of the OSI layered model. This layer is included to be a spot to hold things that are in use in a specific industry, use case, deployment method, etc., but not quite ready to be standardized.
[Aside:] This layer could use the most work, but it seemed worth including here. My gut says that it's likely harder to standardize these kinds of things, since it'll be where our orgs' respective product offerings fall a lot of the time, and with that come IP concerns, specifics for our target users (e.g., technical vs. non-technical), a specific slice of the industry, and/or a more narrow set of use cases. That said, tools like singer-discover would also fall here, and fit into a standardization conversation more easily.
That's what I have been kicking around for this idea, and I'm excited to get it out there for feedback. I'm very curious about thoughts on the specific categories, as well as whether this approach is a good idea. I'd like to eventually get this defined enough to make it into a SIP to officially propose a model. Thanks for checking it out! I appreciate all feedback 🚀