-
Notifications
You must be signed in to change notification settings - Fork 4
Add alternative provider retrieval check #132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 16 commits
aedec80
233cc1f
83e7f31
afe30dd
23ee203
4bc1076
63424ff
8a94f4e
c4350b6
edfdef1
dbf0fd7
d33f276
97bee91
4b6d0bc
97fcc28
5121a49
4065784
74f06e9
ea8cce4
f9afe34
9959b50
a2da050
9759d80
820e8a3
5b13287
3c14f84
fe0f1f5
ad8a8e8
31019d0
3710910
c61a196
d1f62fa
05bd1c2
3451eff
8b8db36
59d3d22
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,15 @@ | ||
import { decodeBase64, decodeVarint, pRetry, assertOkResponse } from '../vendor/deno-deps.js' | ||
|
||
/** @typedef {{ address: string; protocol: string; contextId: string; }} Provider */ | ||
|
||
/** | ||
* | ||
* @param {string} cid | ||
* @param {string} providerId | ||
* @returns {Promise<{ | ||
* indexerResult: string; | ||
* provider?: { address: string; protocol: string }; | ||
* provider?: Provider; | ||
* providers?: Provider[]; | ||
* }>} | ||
*/ | ||
export async function queryTheIndex(cid, providerId) { | ||
|
@@ -31,9 +34,8 @@ export async function queryTheIndex(cid, providerId) { | |
} | ||
|
||
let graphsyncProvider | ||
const providers = [] | ||
pyropy marked this conversation as resolved.
Show resolved
Hide resolved
|
||
for (const p of providerResults) { | ||
if (p.Provider.ID !== providerId) continue | ||
|
||
const [protocolCode] = decodeVarint(decodeBase64(p.Metadata)) | ||
const protocol = { | ||
0x900: 'bitswap', | ||
|
@@ -45,22 +47,33 @@ export async function queryTheIndex(cid, providerId) { | |
const address = p.Provider.Addrs[0] | ||
if (!address) continue | ||
|
||
switch (protocol) { | ||
case 'http': | ||
return { | ||
indexerResult: 'OK', | ||
provider: { address, protocol }, | ||
} | ||
const provider = { | ||
address: formatProviderAddress(p.Provider.ID, address, protocol), | ||
contextId: p.ContextID, | ||
protocol, | ||
} | ||
|
||
if (p.Provider.ID === providerId) { | ||
switch (protocol) { | ||
case 'http': | ||
return { | ||
indexerResult: 'OK', | ||
provider, | ||
} | ||
|
||
case 'graphsync': | ||
if (!graphsyncProvider) { | ||
graphsyncProvider = { | ||
address: `${address}/p2p/${p.Provider.ID}`, | ||
protocol, | ||
case 'graphsync': | ||
if (!graphsyncProvider) { | ||
graphsyncProvider = provider | ||
} | ||
} | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's confusing to me that in case of HTTP we return immediately, but in case of graph sync we don't. I don't see a reason for this behavior. Could we also directly return in the case of graphsync? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was the way we use to do it even before introduction of "alternative providers" to the response. As I understood it: we're iterating trough list of providers and in case we find record serving content via graphsync first we'll return it only if there's no HTTP alternative. In case there's a record serving content via HTTP we return it immediately. |
||
|
||
// Skip adding the provider to the list if it's the the one we are looking for | ||
continue | ||
} | ||
|
||
providers.push(provider) | ||
} | ||
|
||
if (graphsyncProvider) { | ||
console.log('HTTP protocol is not advertised, falling back to Graphsync.') | ||
return { | ||
|
@@ -70,7 +83,7 @@ export async function queryTheIndex(cid, providerId) { | |
} | ||
|
||
console.log('All advertisements are from other miners or for unsupported protocols.') | ||
return { indexerResult: 'NO_VALID_ADVERTISEMENT' } | ||
return { indexerResult: 'NO_VALID_ADVERTISEMENT', providers } | ||
} | ||
|
||
async function getRetrievalProviders(cid) { | ||
|
@@ -81,3 +94,7 @@ async function getRetrievalProviders(cid) { | |
const result = await res.json() | ||
return result.MultihashResults.flatMap((r) => r.ProviderResults) | ||
} | ||
|
||
function formatProviderAddress(id, address, protocol) { | ||
pyropy marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return protocol === 'http' ? address : `${address}/p2p/${id}` | ||
} |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -1,5 +1,6 @@ | ||||||
/* global Zinnia */ | ||||||
|
||||||
/** @import { Provider } from './ipni-client.js' */ | ||||||
import { ActivityState } from './activity-state.js' | ||||||
import { | ||||||
SPARK_VERSION, | ||||||
|
@@ -71,11 +72,25 @@ export default class Spark { | |||||
} | ||||||
|
||||||
console.log(`Querying IPNI to find retrieval providers for ${retrieval.cid}`) | ||||||
const { indexerResult, provider } = await queryTheIndex(retrieval.cid, stats.providerId) | ||||||
const { indexerResult, provider, providers } = await queryTheIndex( | ||||||
retrieval.cid, | ||||||
stats.providerId, | ||||||
) | ||||||
stats.indexerResult = indexerResult | ||||||
|
||||||
const providerFound = indexerResult === 'OK' || indexerResult === 'HTTP_NOT_ADVERTISED' | ||||||
if (!providerFound) return | ||||||
const noValidAdvertisement = indexerResult === 'NO_VALID_ADVERTISEMENT' | ||||||
|
||||||
// In case index lookup failed due to network error or CID not found, | ||||||
// we will not perform any retrieval | ||||||
pyropy marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
if (!providerFound && !noValidAdvertisement) return | ||||||
|
||||||
// In case we fail to find a valid advertisement for the provider | ||||||
// we will try to perform network wide retrieval from other providers | ||||||
if (noValidAdvertisement) { | ||||||
console.log('No valid advertisement found. Trying to retrieve from an alternative provider...') | ||||||
return await this.checkRetrievalFromAlternativeProvider(providers, retrieval.cid, stats) | ||||||
pyropy marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
} | ||||||
|
||||||
stats.protocol = provider.protocol | ||||||
stats.providerAddress = provider.address | ||||||
|
@@ -202,6 +217,29 @@ export default class Spark { | |||||
} | ||||||
} | ||||||
|
||||||
async testNetworkRetrieval(providers, cid, stats) { | ||||||
if (!providers.length) { | ||||||
console.info('No alternative providers found for the CID.') | ||||||
pyropy marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
return | ||||||
} | ||||||
|
||||||
stats.networkRetrieval = newNetworkRetrievalStats() | ||||||
const randomProvider = pickRandomProvider(providers) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm surprised by this strategy. I might be out of sync with the current product plan, but wasn't it the goal to see if any provider can fetch the data? Here we're marking a failure if the 2nd provider we try fails, but don't try the other ones. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've posted my opinions on that in this comment and I'm happy to discuss more! Other motivation behind this was not to put too much pressure on the network serving content and on the checkers. I'm afraid of the worst case situation where we'd have to check every provider serving content. |
||||||
if (!randomProvider) { | ||||||
console.warn( | ||||||
'No providers serving the content via HTTP or Graphsync found. Skipping network-wide retrieval check.', | ||||||
) | ||||||
return | ||||||
} | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we prevent this case earlier, when we decide whether to run this function in the first place? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, It's possible to prevent it by filtering alternative providers by their protocol. If we only have alternative providers that are serving content via |
||||||
|
||||||
await this.fetchCAR( | ||||||
randomProvider.protocol, | ||||||
randomProvider.address, | ||||||
cid, | ||||||
stats.networkRetrieval, | ||||||
) | ||||||
} | ||||||
|
||||||
async submitMeasurement(task, stats) { | ||||||
console.log('Submitting measurement...') | ||||||
const payload = { | ||||||
|
@@ -315,6 +353,16 @@ export function newStats() { | |||||
carChecksum: null, | ||||||
statusCode: null, | ||||||
headStatusCode: null, | ||||||
networkRetrieval: null, | ||||||
} | ||||||
} | ||||||
|
||||||
function newNetworkRetrievalStats() { | ||||||
return { | ||||||
statusCode: null, | ||||||
timeout: false, | ||||||
endAt: null, | ||||||
carTooLarge: false, | ||||||
} | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please include the selected There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is a bit problematic as for providerId we need to make sure that provider is a Filecoin storage provider and that we have their miner Id. Are you aware of some way to reverse Peer ID to Miner ID? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's use the retrieval provider peer ID found in the IPNI response instead of the Filecoin miner ID. We are already doing that for the "regular" retrieval checks, see here: Lines 54 to 55 in 9c29967
|
||||||
} | ||||||
|
||||||
|
@@ -395,3 +443,65 @@ function mapErrorToStatusCode(err) { | |||||
// Fallback code for unknown errors | ||||||
return 600 | ||||||
} | ||||||
|
||||||
/** | ||||||
* Assigns weights to providers based on their protocol and context ID and picks one at random. | ||||||
* Providers with higher weights have a higher chance of being selected. | ||||||
* | ||||||
* Providers serving content using Bitswap protocol are filtered out. | ||||||
* | ||||||
* @param {Provider[]} providers | ||||||
* @returns {Provider | undefined} | ||||||
*/ | ||||||
export function pickRandomProvider(providers) { | ||||||
const filteredProviders = providers.filter((provider) => provider.protocol !== 'bitswap') | ||||||
if (!filteredProviders.length) return | ||||||
const weightedProviders = weighProviders(filteredProviders) | ||||||
return pickRandomWeightedItem(weightedProviders) | ||||||
} | ||||||
|
||||||
/** | ||||||
* Assigns weights to providers based on their protocol and context ID. | ||||||
* | ||||||
* HTTP providers and those whose context ID starts with 'gHa' are given higher weights, | ||||||
* hence having a higher chance of being selected. | ||||||
* | ||||||
* @param {Provider[]} providers | ||||||
* @returns {Provider & { weight: number }[]} | ||||||
*/ | ||||||
function weighProviders(providers) { | ||||||
pyropy marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
const protocolWeights = { http: 2, graphsync: 1 } | ||||||
|
||||||
// assign weight to each provider | ||||||
return providers.map((provider) => { | ||||||
let weight = protocolWeights[provider.protocol] | ||||||
if (provider.contextId.startsWith('ghsA')) weight += 1 | ||||||
pyropy marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
|
||||||
return { | ||||||
...provider, | ||||||
weight, | ||||||
} | ||||||
}) | ||||||
} | ||||||
|
||||||
/** | ||||||
* Picks a random item from an array based on their weight. The higher the weight, the higher the chance of being selected. | ||||||
* | ||||||
* @template T The type of the item in the list. | ||||||
* @param {Array<{weight: number}>} items The list of items, where each item has a `weight`property. | ||||||
* @returns {T} The randomly selected item based on its weight. | ||||||
* | ||||||
*/ | ||||||
function pickRandomWeightedItem(items) { | ||||||
const totalWeight = items.reduce((acc, item) => acc + item.weight, 0) | ||||||
let random = Math.random() * totalWeight | ||||||
|
||||||
// Iterate over items, subtracting the item's weight from the random number | ||||||
// until we find the item where the random number is less than the item's weight | ||||||
for (let i = 0; i < items.length; i++) { | ||||||
random -= items[i].weight | ||||||
if (random <= 0) { | ||||||
return items[i] | ||||||
} | ||||||
} | ||||||
} | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I find this very problematic: (1) Please use a DRAND beacon to get deterministic randomness so that all nodes pick the same alternative provider to check. See how we are using DRAND beacon in other parts of this codebase. The important part is to use the DRAND beacon tied to the time when the Spark round started. (2) I would replace weights with the following algorithm:
Alternatively, keep using weight, but pick randomly only from the providers with the same highest weight. Let's discuss what would be the best approach here! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for your input!
I wasn't aware that we also need to form committees for the second (alternative) measurement as commitees were formed to evaluate measurement for a single storage provider. I therefore assumed that we can pick any provider at random as this measurement should not affect existing RSR. In case we're going to use committees to evaluate this measurement I agree with using DRAND.
Alternative solution to your proposal would be to adjust the weights but I think the algorithm you proposed may be easier to understand. I don't mind changing the algorithm. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added both a pseudo-RNG and more a explicit algorithm. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it confusing to have two properties called
provider
andproviders
, I cannot tell what's the difference.How about calling the new property
alternativeProviders
oraltProviders
?