API endpoints for the series deletion API using block storage #4370

Open
wants to merge 28 commits into base: master
28 commits
f5deb2e
Add endpoints for blocks series deletion API
ilangofman Jul 13, 2021
220a47e
Add unit tests and address PR comments
ilangofman Jul 14, 2021
6f7d749
change 1 function name
ilangofman Jul 14, 2021
1f89173
Remove the deletion during get requests
ilangofman Jul 15, 2021
2feb3ce
make the hashing consistent no matter the selector order
ilangofman Jul 15, 2021
342d673
fix edge case
ilangofman Jul 18, 2021
361d166
Fix feature flag
ilangofman Jul 18, 2021
f1fc186
refactor minor
ilangofman Jul 19, 2021
f9c16f6
minor refactor
ilangofman Jul 19, 2021
8f0338f
Fix lint errors
ilangofman Jul 19, 2021
a5e62f0
fix lint errors and add changelog
ilangofman Jul 19, 2021
6836cfb
Merge branch 'master' into block_deletion_endpoints
ilangofman Jul 19, 2021
9d090a3
Latest changes from master branch
ilangofman Aug 6, 2021
16729d3
update changelog
ilangofman Aug 6, 2021
4bafb4e
fix Changelog
ilangofman Aug 6, 2021
1dfaf28
Remove extra print statement
ilangofman Aug 6, 2021
08b4c1e
Minor refactor
ilangofman Aug 16, 2021
b811d2e
Add a tombstone manager to simplify the creation/getting tombstones
ilangofman Aug 16, 2021
4aa7dd3
Merge branch 'master' into block_deletion_endpoints
ilangofman Aug 17, 2021
a0524cd
remove unneeded variable
ilangofman Aug 17, 2021
06aec3d
Merge branch 'block_deletion_endpoints' of https://github.com/ilangof…
ilangofman Aug 17, 2021
3b0ec49
fix comment
ilangofman Aug 17, 2021
212f876
Address PR comments
ilangofman Aug 19, 2021
7d6963d
reverse change of combining blocks purger and tenant deletion
ilangofman Sep 20, 2021
fd11ec1
Undo change of combining tenant deletion and series deletion API files.
ilangofman Sep 20, 2021
d9c5b92
Merge branch 'master' into block_deletion_endpoints
Jul 11, 2022
9263ca6
Make changes based on PR comments
Jul 13, 2022
06cb921
fix return content type of get delete requests
Jul 13, 2022
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -1,7 +1,7 @@
# Changelog

## master / unreleased

* [FEATURE] Block Storage: Added Prometheus-style API endpoints for series deletion. They must be enabled first by setting `--purger.enable` to `true`. This change only handles creating, listing, and cancelling delete requests. Actual deletion and query-time filtering will be part of future PRs. #4370

## 1.13.0 in progress
* [CHANGE] Changed default for `-ingester.min-ready-duration` from 1 minute to 15 seconds. #4539
13 changes: 13 additions & 0 deletions pkg/api/api.go
@@ -284,6 +284,19 @@ func (a *API) RegisterTenantDeletion(api *purger.TenantDeletionAPI) {
a.RegisterRoute("/purger/delete_tenant_status", http.HandlerFunc(api.DeleteTenantStatus), true, "GET")
}

func (a *API) RegisterBlocksPurger(blocksPurger *purger.BlocksPurgerAPI) {

a.RegisterRoute(path.Join(a.cfg.PrometheusHTTPPrefix, "/api/v1/admin/tsdb/delete_series"), http.HandlerFunc(blocksPurger.AddDeleteRequestHandler), true, "PUT", "POST")
a.RegisterRoute(path.Join(a.cfg.PrometheusHTTPPrefix, "/api/v1/admin/tsdb/delete_series"), http.HandlerFunc(blocksPurger.GetAllDeleteRequestsHandler), true, "GET")
a.RegisterRoute(path.Join(a.cfg.PrometheusHTTPPrefix, "/api/v1/admin/tsdb/cancel_delete_request"), http.HandlerFunc(blocksPurger.CancelDeleteRequestHandler), true, "PUT", "POST")

// Legacy Routes
a.RegisterRoute(path.Join(a.cfg.LegacyHTTPPrefix, "/api/v1/admin/tsdb/delete_series"), http.HandlerFunc(blocksPurger.AddDeleteRequestHandler), true, "PUT", "POST")
a.RegisterRoute(path.Join(a.cfg.LegacyHTTPPrefix, "/api/v1/admin/tsdb/delete_series"), http.HandlerFunc(blocksPurger.GetAllDeleteRequestsHandler), true, "GET")
a.RegisterRoute(path.Join(a.cfg.LegacyHTTPPrefix, "/api/v1/admin/tsdb/cancel_delete_request"), http.HandlerFunc(blocksPurger.CancelDeleteRequestHandler), true, "PUT", "POST")

}

// RegisterRuler registers routes associated with the Ruler service.
func (a *API) RegisterRuler(r *ruler.Ruler) {
a.indexPage.AddLink(SectionAdminEndpoints, "/ruler/ring", "Ruler Ring Status")
244 changes: 244 additions & 0 deletions pkg/chunk/purger/blocks_purger.go
@@ -0,0 +1,244 @@
package purger

import (
"crypto/md5"
"encoding/binary"
"encoding/hex"
"encoding/json"
"fmt"
"net/http"
"sort"
"time"

"github.com/go-kit/kit/log"
"github.com/go-kit/kit/log/level"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/common/model"
"github.com/prometheus/prometheus/model/labels"
"github.com/thanos-io/thanos/pkg/objstore"

"github.com/cortexproject/cortex/pkg/storage/bucket"
cortex_tsdb "github.com/cortexproject/cortex/pkg/storage/tsdb"
"github.com/cortexproject/cortex/pkg/tenant"
"github.com/cortexproject/cortex/pkg/util"
util_log "github.com/cortexproject/cortex/pkg/util/log"
)

type BlocksPurgerAPI struct {
bucketClient objstore.Bucket
logger log.Logger
cfgProvider bucket.TenantConfigProvider
deleteRequestCancelPeriod time.Duration
}

func NewBlocksPurgerAPI(storageCfg cortex_tsdb.BlocksStorageConfig, cfgProvider bucket.TenantConfigProvider, logger log.Logger, reg prometheus.Registerer, cancellationPeriod time.Duration) (*BlocksPurgerAPI, error) {
bucketClient, err := createBucketClient(storageCfg, logger, "blocks-purger", reg)
if err != nil {
return nil, err
}

return newBlocksPurgerAPI(bucketClient, cfgProvider, logger, cancellationPeriod), nil
}

func newBlocksPurgerAPI(bkt objstore.Bucket, cfgProvider bucket.TenantConfigProvider, logger log.Logger, cancellationPeriod time.Duration) *BlocksPurgerAPI {
return &BlocksPurgerAPI{
bucketClient: bkt,
cfgProvider: cfgProvider,
logger: logger,
deleteRequestCancelPeriod: cancellationPeriod,
}
}

func (api *BlocksPurgerAPI) AddDeleteRequestHandler(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
userID, err := tenant.TenantID(ctx)
if err != nil {
http.Error(w, err.Error(), http.StatusUnauthorized)
return
}

params := r.URL.Query()
match := params["match[]"]
if len(match) == 0 {
http.Error(w, "selectors not set", http.StatusBadRequest)
return
}

matchers, err := cortex_tsdb.ParseMatchers(match)
if err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}

startParam := params.Get("start")
startTime := int64(0)
if startParam != "" {
startTime, err = util.ParseTime(startParam)
if err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
}

endParam := params.Get("end")
endTime := int64(model.Now())

if endParam != "" {
endTime, err = util.ParseTime(endParam)
if err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}

if endTime > int64(model.Now()) {
http.Error(w, "deletes in future not allowed", http.StatusBadRequest)
return
}
}

if startTime > endTime {
http.Error(w, "start time can't be greater than end time", http.StatusBadRequest)
return
}

tManager := cortex_tsdb.NewTombstoneManager(api.bucketClient, userID, api.cfgProvider, api.logger)

requestID := getTombstoneHash(startTime, endTime, matchers)
// The request ID is a hash of the request parameters, so a tombstone may already exist for it.
// If that earlier request was cancelled, remove the cancelled tombstone before adding the pending one.
if err := tManager.RemoveCancelledStateIfExists(ctx, requestID); err != nil {
level.Error(util_log.Logger).Log("msg", "removing cancelled tombstone state if it exists", "err", err)
http.Error(w, "Error checking previous delete requests and removing the past cancelled version of this request if it exists", http.StatusInternalServerError)
return
}

prevT, err := tManager.GetTombstoneByIDForUser(ctx, requestID)
if err != nil {
level.Error(util_log.Logger).Log("msg", "error getting delete request by id", "err", err)
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
if prevT != nil {
http.Error(w, "delete request tombstone with same information already exists", http.StatusBadRequest)
return
}

curTime := time.Now().Unix() * 1000
t := cortex_tsdb.NewTombstone(userID, curTime, curTime, startTime, endTime, match, requestID, cortex_tsdb.StatePending)

if err = tManager.WriteTombstone(ctx, t); err != nil {
level.Error(util_log.Logger).Log("msg", "error adding delete request to the object store", "err", err)
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}

w.WriteHeader(http.StatusNoContent)
}
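The time-range handling in `AddDeleteRequestHandler` can be read in isolation as the following sketch. `resolveRange` is a hypothetical helper, not code from this PR. It assumes timestamps are milliseconds since the Unix epoch and uses a nil pointer to stand for an omitted query parameter. Like the handler, it only applies the "no deletes in the future" check when `end` is given explicitly.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveRange is a hypothetical helper mirroring the handler's checks.
// A nil start defaults to 0 and a nil end defaults to now.
// All values are assumed to be milliseconds since the Unix epoch.
func resolveRange(start, end *int64, now int64) (int64, int64, error) {
	s, e := int64(0), now
	if start != nil {
		s = *start
	}
	if end != nil {
		e = *end
		if e > now {
			return 0, 0, errors.New("deletes in future not allowed")
		}
	}
	if s > e {
		return 0, 0, errors.New("start time can't be greater than end time")
	}
	return s, e, nil
}

func main() {
	now := int64(1626000000000)
	start := now - 3600000 // one hour earlier

	// end omitted, so it defaults to now
	s, e, err := resolveRange(&start, nil, now)
	fmt.Println(s, e, err)

	// an explicit end in the future is rejected
	future := now + 1
	_, _, err = resolveRange(&start, &future, now)
	fmt.Println(err)
}
```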

func (api *BlocksPurgerAPI) GetAllDeleteRequestsHandler(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
userID, err := tenant.TenantID(ctx)
if err != nil {
http.Error(w, err.Error(), http.StatusUnauthorized)
return
}
Comment on lines +140 to +144
Contributor
Pretty common code, I think existing Cortex code uses middleware.AuthenticateUser, any reason why we can't use the same function for delete series APIs?


tManager := cortex_tsdb.NewTombstoneManager(api.bucketClient, userID, api.cfgProvider, api.logger)
deleteRequests, err := tManager.GetAllTombstonesForUser(ctx)
if err != nil {
level.Error(util_log.Logger).Log("msg", "error getting delete requests from the block store", "err", err)
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}

w.Header().Set("Content-Type", "application/json")
if err := json.NewEncoder(w).Encode(deleteRequests); err != nil {
level.Error(util_log.Logger).Log("msg", "error marshalling response", "err", err)
http.Error(w, fmt.Sprintf("Error marshalling response: %v", err), http.StatusInternalServerError)
return
}
}

func (api *BlocksPurgerAPI) CancelDeleteRequestHandler(w http.ResponseWriter, r *http.Request) {
ctx := r.Context()
userID, err := tenant.TenantID(ctx)
if err != nil {
http.Error(w, err.Error(), http.StatusUnauthorized)
return
}

params := r.URL.Query()
requestID := params.Get("request_id")
Contributor
Shouldn't we do input validation here? What if request_id is not given?

if len(requestID) == 0 {
http.Error(w, "request_id not set", http.StatusBadRequest)
return
}

tManager := cortex_tsdb.NewTombstoneManager(api.bucketClient, userID, api.cfgProvider, api.logger)
deleteRequest, err := tManager.GetTombstoneByIDForUser(ctx, requestID)
if err != nil {
level.Error(util_log.Logger).Log("msg", "error getting delete request from the object store", "err", err)
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}

if deleteRequest == nil {
http.Error(w, "could not find delete request with given id", http.StatusBadRequest)
return
}

if deleteRequest.State == cortex_tsdb.StateCancelled {
http.Error(w, "the series deletion request was cancelled previously", http.StatusAccepted)
return
}

if deleteRequest.State == cortex_tsdb.StateProcessed {
Contributor
Do we need a "processing" state? Or we can rely on the deleteRequestCancelPeriod to infer the "processing" state?

Contributor Author
Yes, we can use the deleteRequestCancelPeriod to infer the processing state. Whenever the tombstone with statePending is older than deleteRequestCancelPeriod, then it is in processing state.

http.Error(w, "deletion of request which is already processed is not allowed", http.StatusBadRequest)
return
}

if time.Since(deleteRequest.GetCreateTime()) > api.deleteRequestCancelPeriod {
http.Error(w, fmt.Sprintf("Cancellation of request past the deadline of %s since its creation is not allowed", api.deleteRequestCancelPeriod.String()), http.StatusBadRequest)
return
}

// create file with the cancelled state
_, err = tManager.UpdateTombstoneState(ctx, deleteRequest, cortex_tsdb.StateCancelled)
if err != nil {
level.Error(util_log.Logger).Log("msg", "error cancelling the delete request", "err", err)
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}

w.WriteHeader(http.StatusNoContent)
}

func getTombstoneHash(startTime int64, endTime int64, selectors []*labels.Matcher) string {
// Any delete request with the same start, end time and same selectors should result in the same hash

hash := md5.New()

bufStart := make([]byte, 8)
binary.LittleEndian.PutUint64(bufStart, uint64(startTime))
bufEnd := make([]byte, 8)
binary.LittleEndian.PutUint64(bufEnd, uint64(endTime))

hash.Write(bufStart)
hash.Write(bufEnd)

// First we get the strings of the parsed matchers which
// then are sorted and hashed after. This is done so that logically
// equivalent deletion requests result in the same hash
selectorStrings := make([]string, len(selectors))
for i, s := range selectors {
selectorStrings[i] = s.String()
}

sort.Strings(selectorStrings)
for _, s := range selectorStrings {
hash.Write([]byte(s))
}

md5Bytes := hash.Sum(nil)
return hex.EncodeToString(md5Bytes[:])
}
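The intended property of `getTombstoneHash` is that logically equivalent requests map to the same ID regardless of selector order, while a different time range yields a different ID. The standalone sketch below checks that property using only the standard library. `tombstoneHash` is a hypothetical stand-in that takes selector strings directly instead of `[]*labels.Matcher`, and it writes both the start and the end time into the hash.

```go
package main

import (
	"crypto/md5"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"sort"
)

// tombstoneHash is a hypothetical, dependency-free stand-in for the
// request ID computation: start, end, then the sorted selector strings.
func tombstoneHash(start, end int64, selectors []string) string {
	h := md5.New()

	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, uint64(start))
	h.Write(buf)
	binary.LittleEndian.PutUint64(buf, uint64(end))
	h.Write(buf)

	// Sort a copy so the caller's slice is not reordered.
	sorted := append([]string(nil), selectors...)
	sort.Strings(sorted)
	for _, s := range sorted {
		h.Write([]byte(s))
	}

	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := tombstoneHash(0, 100, []string{`job="api"`, `env="prod"`})
	b := tombstoneHash(0, 100, []string{`env="prod"`, `job="api"`})
	c := tombstoneHash(0, 200, []string{`job="api"`, `env="prod"`})

	fmt.Println("same selectors, different order:", a == b)
	fmt.Println("different end time:", a == c)
}
```

One caveat of concatenating selector strings without a separator is that different selector lists could in principle serialize to the same byte stream. Writing a delimiter between entries would rule that out, at the cost of changing existing IDs.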