Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better progress API by reporting both done and total values #1

Open
darsain opened this issue Sep 12, 2017 · 13 comments
Open

Better progress API by reporting both done and total values #1

darsain opened this issue Sep 12, 2017 · 13 comments

Comments

@darsain
Copy link

darsain commented Sep 12, 2017

Current implementation of forcing us to report the progress fraction has a lot of limitations:

  1. Hides the done and total values data from callbacks. This data might be useful in various implementations.
  2. Takes away the ability to handle progress of jobs that don't know about the total value. Some operations just don't know the totalSize, but could still report the progress of how much data has been processed thus far.
  3. Inaccurate progress reporting in PProgress.all(). Without p-progress, or callback consumers knowing the totals, each job is equal no matter how much data it is processing.

I propose not enforcing the passing of fraction, but instead keeping track of both done and total values, reporting them, and letting the user decide how to handle it. This will give us more options and data in our progress implementations.

API:

// current
progress(done / total);
// proposal
progress(done, [total]);

The onProgress callback would than report these values in 1st and 2nd argument. If total is undefined, the end is unknown and user can switch handling:

progressPromise.onProgress((done, total) => {
  if (total === undefined) {
    console.log(`${done} bytes processed thus far`);
  } else {
    console.log(`${(done / total) * 100}% done`);
  }
});

In PProgress.all().onProgress callback, the done and total values in onProgress callback will be the total additions of both of the values of all passes progress promisses. And if any of the promisses is missing the total value, the resulting onProgress callback will also receive undefined as total, since we can no longer tell what the total of all promisses is now.

Huge benefit of knowing about the total total in PProgress.all() callback is a more accurate calculation of the whole progress. For example, if you have 2 jobs, one is processing 1KB, another is processing 99KB, and the 1st ends while the 2nd hasn't even started, you now have an accurate progress of 0.01, instead of 0.5. And if you do want it to report that 0.5, well you just do this in each job progress report:

progress(done / total, 1);

And now each job is equal no matter of the amount of data it's processing.

As you see, this kind of API would give us a lot more power, options, data, and ability to handle jobs with unknown totals in our implementations, while still remaining very simple.

@fregante
Copy link

fregante commented Sep 12, 2017

I assume you're talking about PProgress.all since the other two methods allow you to pass the progress directly.

How would p-progress calculate the total? How do you suggest the value of each promise (i.e. 0.01 and 0.99)? I think that's part of your own logic in a PProgress.fn

@darsain
Copy link
Author

darsain commented Sep 12, 2017

I assume you're talking about PProgress.all since the other two methods allow you to pass the progress directly.

No, this would change the progress reporting everywhere. Or maybe I don't understand the comment.

How would p-progress calculate the total?

There is no total calculation in basic PProgress instance. The done and total values received by progress() are simply stored and passed to the callback. These values can be anything, a fraction progress(0.1, 1) or an amount of data processed thus far progress(555, 1024). It is than upon the implementation to decide how to consume them. The simplest way would just be to do done / total and get the progress fraction. This provides more information and thus implementation options to the progress consumer.

The total argument is optional, to allow reporting of a progress even for jobs that don't have the total information available (they don't know how much more data is going to come in). And again, it's upon progress consuming implementation to handle a case where total is undefined.

The code would explain this better I think, so here is a simplified difference between current and proposed solutions:

// CURRENT
const progressFn = progress => {
	this._progress = progress;

	for (const listener of this._listeners) {
		listener(progress);
	}
}

// PROPOSED
const progressFn = (done, total) => {
	this._done = done;
	this._total = total;

	for (const listener of this._listeners) {
		listener(done, total);
	}
}

And here is the Promise.all() consolidation.

Notes:

  • all, including normal Promise() instances, default to done = 0; total = 1 and are set to done = 1; total = 1 after they have been resolved and their progress hasn't changed
  • if any of the promises sets total to undefined, the final total in the main consolidated callback is also going to be undefined
// CURRENT
const sum = iterable => {
	let total = 0;
	for (const value of iterable.values()) total += value;
	return total;
};
const progressMap = new Map();
const reportProgress = () => {
	progress(sum(progressMap) / promises.length);
};

for (const promise of promisses) {
	progressMap.set(promise, 0);

	promise.onProgress(percentage => {
		progressMap.set(promise, percentage);
		reportProgress();
	});

	// please ignore await inside for-of loop, this is a pMapping
	// function in the actual implementation
	await promise;
	progressMap.set(promise, 1);
	reportProgress();
}

// PROPOSED
const sum = (iterable, prop) => {
	let total = 0;
	for (const value of iterable.values()) {
		if (value[prop] === undefined) return undefined;
		total += value[prop];
	}
	return total;
};
const progressMap = new Map();
const reportProgress = () => {
	progress(sum(progressMap, 'done'), sum(progressMap, 'total'));
};

for (const promise of promisses) {
	progressMap.set(promise, {done: 0, total: 1});

	promise.onProgress((done, total) => {
		progressMap.set(promise, {done, total});
		reportProgress();
	});

	// please ignore await inside for-of loop, this is a pMapping
	// function in the actual implementation
	await promise;
	const {done, total} = progressMap.get(promise);
	if (done === 0 && total === 1) {
		progressMap.set(promise, {done: 1, total: 1});
	}
	reportProgress();
}

How do you suggest the value of each promise (i.e. 0.01 and 0.99)? I think that's part of your own logic in a PProgress.fn

Yes, my values in PProgress.all() inaccuracy description were implied to be done / total fractions.

@fregante
Copy link

fregante commented Sep 12, 2017

I may be a bit slow today but if total is not 1, I still don't know how you're gonna tell p-progress what that number is.

What's your suggested API to know what value of each promise and total are? Show the changes on this code:

const allProgressPromise = PProgress.all([
	thisPromiseIs1kb(),
    thisPromiseIs99kb(),
]);

(async () => {
	allProgressPromise.onProgress(console.log);
	//=> 1
	//=> 100

	await allProgressPromise;
})();

@darsain
Copy link
Author

darsain commented Sep 12, 2017

First of all, the console.log would always receive 2 arguments, the done and total values. Also, to simplify the examples, lets change it from kb to just numbers.

If thisPromiseIs1() reports 1 total, thisPromiseIs99() reports 99 total, and they report progress after 1, you'd get:

const allProgressPromise = PProgress.all([
	thisPromiseIs1(),  // always calls => progress(done, total)
	thisPromiseIs99(), // always calls => progress(done, total)
]);
allProgressPromise.onProgress(console.log);
//=> 1, 100
//=> 2, 100
//=> ...
//=> 100, 100

In case any of them don't pass the total in each progress(done, total) call, you'd get:

const allProgressPromise = PProgress.all([
	thisPromiseIs1(),  // always calls => progress(done, total)
	thisPromiseIs99(), // always calls => progress(done)
]);
allProgressPromise.onProgress(console.log);
//=> 1, undefined
//=> 2, undefined
//=> ...
//=> 100, undefined

In case they both have totals, and you add a raw Promise() to the mix, it'll default to 0/1 done/total progress and you'll get:

const allProgressPromise = PProgress.all([
	thisPromiseIs1(),  // always calls => progress(done, total)
	thisPromiseIs99(), // always calls => progress(done, total)
	Promise.resolve(), // doesn't call, defaults to 0/1, and goes 1/1 on resolve
]);
allProgressPromise.onProgress(console.log);
//=> 1, 101
//=> 2, 101
//=> ...
//=> 101, 101

In case any of the PProgress promises report undefined as total:

const allProgressPromise = PProgress.all([
	thisPromiseIs1(),  // always calls => progress(done, total)
	thisPromiseIs99(), // always calls => progress(done)
	Promise.resolve(), // doesn't call, defaults to 0/1, and goes 1/1 on resolve
]);
allProgressPromise.onProgress(console.log);
//=> 1, undefined
//=> 2, undefined
//=> ...
//=> 101, undefined

I think this covers all possible cases for PProgress.all().

@fregante
Copy link

fregante commented Sep 12, 2017

So the value of total for each PPromise depends on each executor to call the progress() function at least once (which might not happen if it's too fast and totally depends on the executor's discipline)

Also this makes the progress meaningless until all the totals are known. Look at your last example. What does onProgress's callback do with that 1, 2, ..., 101 information? The total is missing. The progress is meaningless because it doesn't know when it ends.

@darsain
Copy link
Author

darsain commented Sep 12, 2017

which might not happen if it's too fast and totally depends on the executor's discipline

I'd argue that if you are creating a PProgress instance, you have to call progress at least once before resolving, otherwise what is the point? But even if not, I see that current implementation is already forcing progress(1) call when there was no progress() call before resolving. The same thing happens in my PProgress.all() consolidation, which will just treat it as a raw Promise and simulate 1/1 progress call. So no real issue here.

Also this makes the progress meaningless until all the totals are known

How so? All promises default to 0/1, and if not updated before resolving update manually to 1/1, which is still a valid and usable progress state in all examples above.

What does onProgress's callback do with that 1, 2, ..., 101 information? The total is missing. The progress is meaningless because it doesn't know when it ends.

It's absolutely not meaningless. That is one of the main points of this whole issue, to also provide a progress handling in cases where total is not known. The 2nd and last examples without total values are behaviors that are going to be expected by the implementation, since you know what kind of promises you are throwing into PProgress.all(). In this case, instead of progress percentage, you might display progress done (size of data processed thus far), or a simple indeterminate progress bar, or both... completely in hands of the implementation. Currently it isn't even possible to handle progress jobs with no known total number. Or know precisely how much data has been processed.

And if you just want to use PProgress with these changes as you were until now, all you have to do is just switch slash / character with comma , in your PProgress resolving implementation, convert it into fraction in the progress callback, and absolutely nothing changes from the current behavior:

// before
progress(done / total);
promise.onProgress(progress => console.log(progress));
// after
progress(done, total);
promise.onProgress((done, total) => console.log(done / total));

All this proposal does is introduce a slight API change which gives move power, features, and options to both the progress implementors as well as the progress consumers.

@fregante
Copy link

Sorry I ended up on this issue again 😅

provide a progress handling in cases where total is not known

simple indeterminate progress bar,

It sounds to me that you just need a Promise that calls your callback, some sort of reporter.

The only thing between you and your objective are the limitations that p-progress puts in place:

p-progress/index.js

Lines 70 to 84 in a24163f

if (progress > 1 || progress < 0) {
throw new TypeError('The progress percentage should be a number between 0 and 1');
}
(async () => {
// We wait for the next microtask tick so `super` is called before we use `this`
await Promise.resolve();
if (progress === this._progress) {
return;
}
if (progress < this._progress) {
throw new Error('The progress percentage can\'t be lower than the last progress event');
}

I see two solutions:

  • add an allowAnyProgress option, which would let you pass anything to the callback
  • create a p-report module that does this by default.

The whole module is setup in a way that progress:1 means done. Allowing what you're asking requires a rewrite and probably it isn't even possible to achieve while maintaining the current behavior.

@darsain
Copy link
Author

darsain commented Oct 30, 2019

All the arguments for my case are already written above. In summary, the current API is inaccurate, limited, and regarding indeterminate progress reporting impossible.

I already forked this 2 years ago, so if you really don't see value in my proposal and are keeping this issue open just for me, feel free to close it.

it isn't even possible to achieve while maintaining the current behavior.

That's what major bumps are for.

@sindresorhus
Copy link
Owner

sindresorhus commented Feb 16, 2020

I somehow managed to miss this discussion in 2017... @darsain Your arguments makes total sense. I just made p-progress back then for my narrow use-case without considering more flexibility.

If I had done p-progress today, I think I would have done something closer to what you're proposing:

const PProgress = require('p-progress');

const progressPromise = new PProgress((resolve, reject, progress) => {
	const job = new Job();

	// It's optional to set this.
	progress.totalCount = job.getSize();

	job.on('data', data => {
		progress.completedCount += data.length;
	});

	job.on('finish', resolve);
	job.on('error', reject);
});

(async () => {
	progressPromise.onProgress(progress => {
		console.log(progress);
		/*
		{
			completedCount: 20,
			totalCount: 100,
			fractionCompleted: 0.2
		}
		*/
		/*
		If `totalCount` is not set, it and `fractionCompleted` would just be `0`.

		`totalCount` can be set at any time inside `new PProgress`.
		*/
	});

	await progressPromise;
})();

Just a quick braindump. The benefit of properties over methods is that they can also be read by the code inside new PProgress. Happy to hear anyone's thought.

And if you like the current behavior, you would still be able to do it by setting totalCount to 1 and just set completedCount to what you used to the percentage that you currently do.

@darsain
Copy link
Author

darsain commented Feb 17, 2020

Yes, passing the data on an object is definitely a better idea, but can you please simplify the property names? At least dropping the Count suffix. I'd suffer writing the verbose ones from the example. 😄

@sindresorhus
Copy link
Owner

Hah, sure. I initially had it as totalUnitCount and completedUnitCount 😝

@Richienb
Copy link
Contributor

If totalCount is decremented, would this mean the progress would go backwards?

@sindresorhus
Copy link
Owner

@Richienb Correct

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants