Commit cb9c95c

Merge pull request #16 from brevdev/devin/1754698090-add-provider-guide
docs: add 'How to add a provider' guide and link it in README (GPT5)
2 parents 74afed2 + bc2028a commit cb9c95c

2 files changed, +327 -0 lines changed

README.md

Lines changed: 1 addition & 0 deletions
@@ -71,6 +71,7 @@ See [SECURITY.md](docs/SECURITY.md) for complete security specifications and imp
- **[V1 Design Notes](pkg/v1/V1_DESIGN_NOTES.md)**: Design decisions, known quirks, and AWS-inspired patterns in the v1 API
- **[Architecture Overview](docs/ARCHITECTURE.md)**: How the Cloud SDK fits into Brev's overall architecture
- **[Security Requirements](docs/SECURITY.md)**: Security specifications and implementation requirements
+- **[How to Add a Provider](docs/how-to-add-a-provider.md)**: Step-by-step guide to implement a new cloud provider using the Lambda Labs example

---

docs/how-to-add-a-provider.md

Lines changed: 326 additions & 0 deletions
@@ -0,0 +1,326 @@
# How to Add a Cloud Provider

This guide explains how to add a new cloud provider to the Brev Cloud SDK (v1). The Lambda Labs provider is the most complete and best-tested implementation, so use it as your canonical reference.

Goals:
- Implement a provider-specific CloudCredential (factory) and CloudClient (implementation) that satisfy the pkg/v1 interfaces.
- Accurately declare Capabilities based on the provider’s API surface.
- Implement at least instance lifecycle and instance types, adhering to security requirements.
- Add validation tests and (optionally) a GitHub Actions workflow to run them with real credentials.

Helpful background:
- Architecture overview: ../docs/ARCHITECTURE.md
- Security requirements: ../docs/SECURITY.md
- Validation testing framework: ../docs/VALIDATION_TESTING.md
- v1 design notes: ../pkg/v1/V1_DESIGN_NOTES.md

Provider examples:
- Lambda Labs (canonical): ../internal/lambdalabs/v1/README.md
- Nebius (in progress): ../internal/nebius/v1/README.md
- Fluidstack (in progress): ../internal/fluidstack/v1/README.md

---

## Core v1 Interfaces You Must Target

CloudClient is a composed interface of provider capabilities. You don’t need to implement everything, only what your provider supports, but you must advertise Capabilities correctly.

- CloudClient composition: ../pkg/v1/client.go
  - Key aggregation: CloudBase, CloudQuota, CloudRebootInstance, CloudStopStartInstance, CloudResizeInstanceVolume, CloudMachineImage, CloudChangeInstanceType, CloudModifyFirewall, CloudInstanceTags, UpdateHandler
- Capabilities system: ../pkg/v1/capabilities.go
- Instance lifecycle, validation helpers, and types: ../pkg/v1/instance.go
- Instance types and validation helpers: ../pkg/v1/instancetype.go

Patterns to follow:
- Embed v1.NotImplCloudClient in your client so unsupported methods gracefully return ErrNotImplemented (see ../pkg/v1/notimplemented.go).
- Accurately return capability flags that match your provider’s real API.
- Prefer stable, provider-native identifiers; otherwise use MakeGenericInstanceTypeID/MakeGenericInstanceTypeIDFromInstance.

---

## Directory Layout

Create a new provider folder:

- internal/{provider}/
  - SECURITY.md (provider-specific notes; link to top-level security expectations)
  - CONTRIBUTE.md (optional provider integration notes)
  - v1/
    - client.go (credentials and client)
    - instance.go (instance lifecycle + helpers)
    - instancetype.go (instance types)
    - capabilities.go (capability declarations)
    - networking.go, image.go, storage.go, tags.go, quota.go, location.go (as applicable)
    - validation_test.go (validation suite entry point)

Use Lambda Labs as the pattern:
- ../internal/lambdalabs/v1/client.go
- ../internal/lambdalabs/v1/instance.go
- ../internal/lambdalabs/v1/capabilities.go

---

## Minimal Scaffold (Copy/Paste Template)

Place in internal/{provider}/v1/client.go. Adjust names, imports, and fields for your provider.

```go
package v1

import (
	"context"

	v1 "github.com/brevdev/cloud/pkg/v1"
)

type {Provider}Credential struct {
	RefID string
	// Add auth fields (e.g., APIKey, ClientID, Secret, Tenant, etc.)
}

var _ v1.CloudCredential = &{Provider}Credential{}

func New{Provider}Credential(refID string /* auth fields */) *{Provider}Credential {
	return &{Provider}Credential{
		RefID: refID,
		// ...
	}
}

func (c *{Provider}Credential) GetReferenceID() string { return c.RefID }
func (c *{Provider}Credential) GetAPIType() v1.APIType { return v1.APITypeLocational /* or v1.APITypeGlobal */ }
func (c *{Provider}Credential) GetCloudProviderID() v1.CloudProviderID {
	return "{provider-id}" // e.g., "lambdalabs"
}
func (c *{Provider}Credential) GetTenantID() (string, error) {
	// Derive stable tenant ID for quota/account scoping if possible
	return "", nil
}

// GetCapabilities for both the credential and the client lives in
// capabilities.go; defining it here as well would be a duplicate method.

func (c *{Provider}Credential) MakeClient(ctx context.Context, location string) (v1.CloudClient, error) {
	// Create a client configured for a given location if locational API
	return New{Provider}Client(c.RefID /* auth fields */).MakeClient(ctx, location)
}

// ---------------- Client ----------------

type {Provider}Client struct {
	v1.NotImplCloudClient
	refID    string
	location string
	// add http/sdk client fields, base URLs, etc.
}

var _ v1.CloudClient = &{Provider}Client{}

func New{Provider}Client(refID string /* auth fields */) *{Provider}Client {
	return &{Provider}Client{
		refID: refID,
		// init http/sdk clients here
	}
}

func (c *{Provider}Client) GetAPIType() v1.APIType { return v1.APITypeLocational /* or Global */ }
func (c *{Provider}Client) GetCloudProviderID() v1.CloudProviderID { return "{provider-id}" }
func (c *{Provider}Client) GetReferenceID() string { return c.refID }
func (c *{Provider}Client) GetTenantID() (string, error) { return "", nil }

func (c *{Provider}Client) MakeClient(_ context.Context, location string) (v1.CloudClient, error) {
	c.location = location
	return c, nil
}
```

Declare capabilities in internal/{provider}/v1/capabilities.go:

```go
package v1

import (
	"context"

	v1 "github.com/brevdev/cloud/pkg/v1"
)

func get{Provider}Capabilities() v1.Capabilities {
	return v1.Capabilities{
		v1.CapabilityCreateInstance,
		v1.CapabilityTerminateInstance,
		v1.CapabilityCreateTerminateInstance,
		// add others supported by your provider: reboot, stop/start, machine-image, tags, resize-volume, modify-firewall, etc.
	}
}

func (c *{Provider}Client) GetCapabilities(_ context.Context) (v1.Capabilities, error) {
	return get{Provider}Capabilities(), nil
}

func (c *{Provider}Credential) GetCapabilities(_ context.Context) (v1.Capabilities, error) {
	return get{Provider}Capabilities(), nil
}
```

Implement instance lifecycle in internal/{provider}/v1/instance.go (map to provider API):

```go
package v1

import (
	"context"
	"fmt"

	v1 "github.com/brevdev/cloud/pkg/v1"
)

func (c *{Provider}Client) CreateInstance(ctx context.Context, attrs v1.CreateInstanceAttrs) (*v1.Instance, error) {
	// 1) ensure SSH key present (or inject via API) per ../docs/SECURITY.md
	// 2) map attrs to provider request (location, instance type, image, tags, firewall rules if supported)
	// 3) launch and return instance converted to v1.Instance
	return nil, fmt.Errorf("not implemented")
}

func (c *{Provider}Client) GetInstance(ctx context.Context, id v1.CloudProviderInstanceID) (*v1.Instance, error) {
	return nil, fmt.Errorf("not implemented")
}

func (c *{Provider}Client) ListInstances(ctx context.Context, args v1.ListInstancesArgs) ([]v1.Instance, error) {
	return nil, fmt.Errorf("not implemented")
}

func (c *{Provider}Client) TerminateInstance(ctx context.Context, id v1.CloudProviderInstanceID) error {
	return fmt.Errorf("not implemented")
}

// Optional if supported:
func (c *{Provider}Client) RebootInstance(ctx context.Context, id v1.CloudProviderInstanceID) error { return fmt.Errorf("not implemented") }
func (c *{Provider}Client) StopInstance(ctx context.Context, id v1.CloudProviderInstanceID) error { return fmt.Errorf("not implemented") }
func (c *{Provider}Client) StartInstance(ctx context.Context, id v1.CloudProviderInstanceID) error { return fmt.Errorf("not implemented") }

// Merge strategies (pass-through is acceptable baseline).
func (c *{Provider}Client) MergeInstanceForUpdate(_ v1.Instance, newInst v1.Instance) v1.Instance { return newInst }
func (c *{Provider}Client) MergeInstanceTypeForUpdate(_ v1.InstanceType, newIt v1.InstanceType) v1.InstanceType { return newIt }
```

See the canonical mapping and conversion logic in Lambda Labs:
- Create/terminate/list/reboot: ../internal/lambdalabs/v1/instance.go
- Capabilities: ../internal/lambdalabs/v1/capabilities.go
- Client/credential + NotImpl: ../internal/lambdalabs/v1/client.go
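
As a rough illustration of what the mapping and conversion step involves, the sketch below translates a provider API record into an SDK-style instance. Every name in it (`providerInstance`, `instance`, `lifecycleStatus`, and the provider state strings) is hypothetical and chosen only for the example; consult the Lambda Labs files above for the real field names and status values.

```go
package main

import "fmt"

// lifecycleStatus is a hypothetical stand-in for the SDK's instance status type.
type lifecycleStatus string

const (
	statusPending     lifecycleStatus = "pending"
	statusRunning     lifecycleStatus = "running"
	statusTerminating lifecycleStatus = "terminating"
	statusTerminated  lifecycleStatus = "terminated"
	statusUnknown     lifecycleStatus = "unknown"
)

// providerInstance mimics the shape of a provider API response (assumed fields).
type providerInstance struct {
	ID, Name, Region, State, IP string
}

// instance is a reduced, hypothetical version of v1.Instance.
type instance struct {
	CloudID, Name, Location, PublicIP string
	Status                            lifecycleStatus
}

// convertStatus maps provider-specific state strings onto the shared status set.
// Unrecognized states map to statusUnknown rather than being guessed.
func convertStatus(state string) lifecycleStatus {
	switch state {
	case "booting":
		return statusPending
	case "active":
		return statusRunning
	case "terminating":
		return statusTerminating
	case "terminated":
		return statusTerminated
	default:
		return statusUnknown
	}
}

// convertInstance keeps all provider-to-SDK translation in one place.
func convertInstance(p providerInstance) instance {
	return instance{
		CloudID:  p.ID,
		Name:     p.Name,
		Location: p.Region,
		PublicIP: p.IP,
		Status:   convertStatus(p.State),
	}
}

func main() {
	got := convertInstance(providerInstance{ID: "abc", Name: "dev", Region: "us-west-1", State: "active", IP: "203.0.113.7"})
	fmt.Println(got.CloudID, got.Location, got.Status)
}
```

Keeping the conversion in a single function means CreateInstance, GetInstance, and ListInstances can all return consistently shaped results.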

Implement instance types in internal/{provider}/v1/instancetype.go:

- Implement:
  - GetInstanceTypes(ctx, args GetInstanceTypeArgs) ([]InstanceType, error)
  - GetInstanceTypePollTime() time.Duration
- Use stable IDs if the provider offers them. If not, use MakeGenericInstanceTypeID.
- Validate with helpers:
  - ValidateGetInstanceTypes: ../pkg/v1/instancetype.go
  - ValidateLocationalInstanceTypes: ../pkg/v1/instancetype.go
  - ValidateStableInstanceTypeIDs (if you maintain a stable ID list)
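
To make the idea of a stable ID concrete, here is a minimal sketch that derives an ID deterministically from fields that do not change between API calls, then checks that a list of expected IDs is still present. `makeTypeID`, `missingStableIDs`, the `instanceType` struct, and the sample type names are all hypothetical; the SDK's MakeGenericInstanceTypeID and ValidateStableInstanceTypeIDs may use a different scheme and signature.

```go
package main

import (
	"fmt"
	"strings"
)

// instanceType is a hypothetical, reduced stand-in for the SDK's InstanceType.
type instanceType struct {
	ID       string
	Location string
	Type     string
}

// makeTypeID builds a deterministic ID from location and type name.
// Hypothetical helper: the same inputs always yield the same ID.
func makeTypeID(location, typeName string) string {
	return strings.ToLower(location) + "-" + strings.ToLower(typeName)
}

// missingStableIDs returns every expected ID that is absent from got.
func missingStableIDs(got []instanceType, expected []string) []string {
	seen := make(map[string]bool, len(got))
	for _, it := range got {
		seen[it.ID] = true
	}
	var missing []string
	for _, id := range expected {
		if !seen[id] {
			missing = append(missing, id)
		}
	}
	return missing
}

func main() {
	types := []instanceType{
		{ID: makeTypeID("us-west-1", "gpu_1x_a10"), Location: "us-west-1", Type: "gpu_1x_a10"},
	}
	fmt.Println(types[0].ID) // us-west-1-gpu_1x_a10
	// Reports the expected ID that the provider no longer returns.
	fmt.Println(missingStableIDs(types, []string{"us-west-1-gpu_1x_a10", "us-east-1-gpu_8x_h100"}))
}
```

An ID built from random or per-request data would fail this kind of check on the next poll, which is why provider-native or field-derived identifiers are preferred.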

---

## Capabilities: Be Precise

Capability flags live in ../pkg/v1/capabilities.go. Only include capabilities your API actually supports. For example, Lambda Labs:
- Supports create/terminate/reboot instance
- Does not (currently) support stop/start, resize volume, machine image, or tags

Reference:
- Lambda capabilities: ../internal/lambdalabs/v1/capabilities.go

---

## Security Requirements

All providers must conform to ../docs/SECURITY.md:
- Default deny all inbound, allow all outbound
- SSH server must be available with key-based auth
- Firewall rules should be explicitly configured via FirewallRule when supported
- If your provider’s firewall model is global/project-scoped rather than per-instance, document limitations in internal/{provider}/SECURITY.md and reflect that by omitting CapabilityModifyFirewall if applicable.
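
The default-deny inbound rule can be sketched as a simple evaluation function: traffic is rejected unless an explicit rule matches both the port and the source address. The `firewallRule` struct and `inboundAllowed` function are hypothetical and do not reflect the SDK's FirewallRule fields; the addresses come from documentation-reserved ranges.

```go
package main

import (
	"fmt"
	"net"
)

// firewallRule is a hypothetical, reduced allow rule: a port range and a source CIDR.
type firewallRule struct {
	FromPort, ToPort int
	CIDR             string
}

// inboundAllowed implements default deny: it returns true only when some rule
// explicitly matches. Unparseable input is treated as a non-match (fail closed).
func inboundAllowed(rules []firewallRule, srcIP string, port int) bool {
	ip := net.ParseIP(srcIP)
	if ip == nil {
		return false
	}
	for _, r := range rules {
		_, network, err := net.ParseCIDR(r.CIDR)
		if err != nil {
			continue // skip malformed rules instead of allowing traffic
		}
		if port >= r.FromPort && port <= r.ToPort && network.Contains(ip) {
			return true
		}
	}
	return false
}

func main() {
	rules := []firewallRule{{FromPort: 22, ToPort: 22, CIDR: "203.0.113.0/24"}}
	fmt.Println(inboundAllowed(rules, "203.0.113.7", 22))  // true
	fmt.Println(inboundAllowed(rules, "198.51.100.9", 22)) // false
	fmt.Println(inboundAllowed(nil, "203.0.113.7", 22))    // false
}
```

With no rules at all the function denies everything, which is the baseline a newly created instance should start from.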

Provider-specific security doc examples:
- Lambda Labs: ../internal/lambdalabs/SECURITY.md
- Nebius: ../internal/nebius/SECURITY.md
- Fluidstack: ../internal/fluidstack/v1/SECURITY.md

---

## Validation Testing and CI

Use the shared validation suite to test your provider with real credentials.

- Validation framework and instructions: ../docs/VALIDATION_TESTING.md
- Shared package: ../internal/validation/suite.go

Steps:
1) Create internal/{provider}/v1/validation_test.go:

```go
package v1

import (
	"os"
	"testing"

	"github.com/brevdev/cloud/internal/validation"
)

func TestValidationFunctions(t *testing.T) {
	if testing.Short() {
		t.Skip("Skipping validation tests in short mode")
	}

	apiKey := os.Getenv("YOUR_PROVIDER_API_KEY")
	if apiKey == "" {
		t.Skip("YOUR_PROVIDER_API_KEY not set, skipping validation tests")
	}

	cfg := validation.ProviderConfig{
		Credential: New{Provider}Credential("validation-test" /* auth fields from env, e.g., apiKey */),
	}
	validation.RunValidationSuite(t, cfg)
}
```

2) Local runs:
- make test # skips validation (short)
- make test-validation # runs validation (long)
- make test-all # runs everything

3) CI workflow (recommended):
- Add .github/workflows/validation-{provider}.yml (copy Lambda Labs workflow if available or follow VALIDATION_TESTING.md).
- Store secrets in GitHub Actions (e.g., YOUR_PROVIDER_API_KEY).

---

## Checklist

- [ ] Add internal/{provider}/v1 with client.go, instance.go, capabilities.go, instancetype.go
- [ ] Embed v1.NotImplCloudClient in client and only implement supported methods
- [ ] Accurately set Capabilities
- [ ] Implement instance types with stable IDs where possible
- [ ] Conform to security model; document provider-specific nuances
- [ ] Add validation_test.go and (optionally) CI workflow
- [ ] Run make lint and make test locally
- [ ] Add provider docs (README.md under provider folder) describing API mapping and feature coverage

---

## References

- Architecture: ../docs/ARCHITECTURE.md
- Security: ../docs/SECURITY.md
- Validation testing: ../docs/VALIDATION_TESTING.md
- CloudClient and composition: ../pkg/v1/client.go
- Capabilities: ../pkg/v1/capabilities.go
- Instance lifecycle and validations: ../pkg/v1/instance.go
- Instance types and validations: ../pkg/v1/instancetype.go
- Lambda Labs example:
  - Client/Credential: ../internal/lambdalabs/v1/client.go
  - Capabilities: ../internal/lambdalabs/v1/capabilities.go
  - Instance operations: ../internal/lambdalabs/v1/instance.go
  - Provider README: ../internal/lambdalabs/v1/README.md
