Skip to content

chore: download browser downloads in-memory #35909

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ declare namespace extract {
}

declare function extract(
zipPath: string,
zipPath: Buffer,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have numbers of how much do we save? I guess it'd be the most beneficial on Windows, but by how much? Also this means holding entire buffers in memory which might hit us if we finally decide to download in parallel. So I'm a bit hesitant changing this.

Copy link
Member Author

@mxschmitt mxschmitt May 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a M1 Pro Mac machine its a 9.55 % improvement and on a Windows DevBox its a 11.71 % improvement. Both of these tests were made on a fast machine and using a SSD. I expect much higher improvements when we consider slow (e.g. HDD) or busy disks.

re the comment for parallel downloads. If we assume to hold all our downloads in memory it would be 293 MB on Windows which would be still manageable I'd say. This aligns well of how much Chromium takes when navigating to websites. Its even higher sometimes on busy websites.

Used data

macOS M1 Pro

with file write:

test chromium-win64.zip via file attempt 1: 1.472s
test chromium-win64.zip via file attempt 2: 1.455s
test chromium-win64.zip via file attempt 3: 1.445s
test webkit-ubuntu-24.04.zip via file attempt 1: 968.385ms
test webkit-ubuntu-24.04.zip via file attempt 2: 1.017s
test webkit-ubuntu-24.04.zip via file attempt 3: 1.065s

with in-memory:

test chromium-win64.zip in memory attempt 1: 1.332s
test chromium-win64.zip in memory attempt 2: 1.391s
test chromium-win64.zip in memory attempt 3: 1.273s
test webkit-ubuntu-24.04.zip in memory attempt 1: 966.896ms
test webkit-ubuntu-24.04.zip in memory attempt 2: 906.812ms
test webkit-ubuntu-24.04.zip in memory attempt 3: 912.546ms
((1.472 + 1.455 + 1.445) / (1.332 + 1.391 + 1.273) - 1) * 100 = 9.41 % faster with chromium-win64.zip
((0.968 + 1.017 + 1.065) / (0.966 + 0.906 + 0.912) - 1) * 100 = 9.55 % faster with webkit-ubuntu-24.04.zip

Windows DevBox 64GB RAM

with file write:

PS C:\Users\maxschmitt\Developer\playwright> node test.mjs
test chromium-win64.zip via file attempt 1: 3.575s
test chromium-win64.zip via file attempt 2: 3.529s
test chromium-win64.zip via file attempt 3: 3.499s
test webkit-ubuntu-24.04.zip via file attempt 1: 2.423s
test webkit-ubuntu-24.04.zip via file attempt 2: 2.421s
test webkit-ubuntu-24.04.zip via file attempt 3: 2.310s

with in-memory:

PS C:\Users\maxschmitt\Developer\playwright> node test.mjs
test chromium-win64.zip in memory attempt 1: 3.234s
test chromium-win64.zip in memory attempt 2: 3.042s
test chromium-win64.zip in memory attempt 3: 3.240s
test webkit-ubuntu-24.04.zip in memory attempt 1: 2.051s
test webkit-ubuntu-24.04.zip in memory attempt 2: 2.140s
test webkit-ubuntu-24.04.zip in memory attempt 3: 2.213s
PS C:\Users\maxschmitt\Developer\playwright>

results

((3.575 + 3.529 + 3.499) / (3.234 + 3.042 + 3.240) - 1) * 100 = 11.42 % faster with chromium-win64.zip
((2.423 + 2.421 + 2.310) / (2.051 + 2.140 + 2.213) - 1) * 100 = 11.71 % faster with webkit-ubuntu-24.04.zip
Script
import os from 'node:os';
import fs from 'node:fs';
import path from 'node:path';

import extractZip from './packages/playwright-core/bundles/zip/src/third_party/extract-zip.js';

const testcases = [
  'https://playwright.azureedge.net/dbazure/download/playwright/builds/chromium/1172/chromium-win64.zip',
  'https://playwright.azureedge.net/dbazure/download/playwright/builds/webkit/2169/webkit-ubuntu-24.04.zip',
]
const inMemory = true;

const testDir = path.join(os.tmpdir(), 'playwright-extract-zip-test');

for (const url of testcases) {
  const file = Buffer.from(await fetch(url).then(res => res.arrayBuffer()));
  for (let i = 0; i < 3; i++) {
    if (fs.existsSync(testDir))
      fs.rmSync(testDir, { recursive: true, force: true });
    fs.mkdirSync(testDir);
    const testTitle = `test ${path.basename(url)} ${inMemory ? 'in memory' : 'via file'} attempt ${i + 1}`;
    if (inMemory) {
      console.time(testTitle);
      await extractZip(file, { dir: testDir });
    } else {
      console.time(testTitle);
      const filePath = path.join(testDir, 'chromium.zip');
      // Intentionally having the writeFileSync included in the time measurement
      // to see if it affects the performance.
      fs.writeFileSync(filePath, file);
      await extractZip(filePath, { dir: testDir });
    }
    console.timeEnd(testTitle);
  }
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds plausible! This is on the critical path, so we should place this behind an opt-out flag.

opts: extract.Options,
): Promise<void>;

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,20 +33,21 @@ const { promisify } = require('util')
const stream = require('stream')
const yauzl = require('yauzl')

const openZip = promisify(yauzl.open)
const openZipFromBuffer = promisify(yauzl.fromBuffer)
const pipeline = promisify(stream.pipeline)

class Extractor {
constructor (zipPath, opts) {
this.zipPath = zipPath
constructor (zipFile, opts) {
/** @type {Buffer} */
this.zipFile = zipFile
this.opts = opts
}

async extract () {
debug('opening', this.zipPath, 'with opts', this.opts)
debug('opening', `${this.zipFile.byteLength} bytes`, 'with opts', this.opts);

this.zipfile = await openZip(this.zipPath, { lazyEntries: true })
this.canceled = false
this.zipfile = await openZipFromBuffer(this.zipFile, { lazyEntries: true });
this.canceled = false;

return new Promise((resolve, reject) => {
this.zipfile.on('error', err => {
Expand All @@ -55,7 +56,7 @@ class Extractor {
})
this.zipfile.readEntry()

this.zipfile.on('close', () => {
this.zipfile.on('end', () => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!this.canceled) {
debug('zip extraction complete')
resolve()
Expand Down Expand Up @@ -186,7 +187,7 @@ class Extractor {
}
}

module.exports = async function (zipPath, opts) {
module.exports = async function (zipFile, opts) {
debug('creating target directory', opts.dir)

if (!path.isAbsolute(opts.dir)) {
Expand All @@ -195,5 +196,5 @@ module.exports = async function (zipPath, opts) {

await fs.mkdir(opts.dir, { recursive: true })
opts.dir = await fs.realpath(opts.dir)
return new Extractor(zipPath, opts).extract()
return new Extractor(zipFile, opts).extract()
}
14 changes: 3 additions & 11 deletions packages/playwright-core/src/server/registry/browserFetcher.ts
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@

import * as childProcess from 'child_process';
import fs from 'fs';
import os from 'os';
import path from 'path';

import { debugLogger } from '../utils/debugLogger';
Expand All @@ -37,20 +36,17 @@ export async function downloadBrowserWithProgressBar(title: string, browserDirec
return false;
}

const zipPath = path.join(os.tmpdir(), downloadFileName);
try {
const retryCount = 5;
for (let attempt = 1; attempt <= retryCount; ++attempt) {
debugLogger.log('install', `downloading ${title} - attempt #${attempt}`);
const url = downloadURLs[(attempt - 1) % downloadURLs.length];
logPolitely(`Downloading ${title}` + colors.dim(` from ${url}`));
const { error } = await downloadBrowserWithProgressBarOutOfProcess(title, browserDirectory, url, zipPath, executablePath, downloadSocketTimeout);
const { error } = await downloadBrowserWithProgressBarOutOfProcess(title, browserDirectory, url, executablePath, downloadSocketTimeout);
if (!error) {
debugLogger.log('install', `SUCCESS installing ${title}`);
break;
}
if (await existsAsync(zipPath))
await fs.promises.unlink(zipPath);
if (await existsAsync(browserDirectory))
await fs.promises.rmdir(browserDirectory, { recursive: true });
const errorMessage = error?.message || '';
Expand All @@ -62,9 +58,6 @@ export async function downloadBrowserWithProgressBar(title: string, browserDirec
debugLogger.log('install', `FAILED installation ${title} with error: ${e}`);
process.exitCode = 1;
throw e;
} finally {
if (await existsAsync(zipPath))
await fs.promises.unlink(zipPath);
}
logPolitely(`${title} downloaded to ${browserDirectory}`);
return true;
Expand All @@ -75,7 +68,7 @@ export async function downloadBrowserWithProgressBar(title: string, browserDirec
* Thats why we execute it in a separate process and check manually if the destination file exists.
* https://github.com/microsoft/playwright/issues/17394
*/
function downloadBrowserWithProgressBarOutOfProcess(title: string, browserDirectory: string, url: string, zipPath: string, executablePath: string | undefined, socketTimeout: number): Promise<{ error: Error | null }> {
function downloadBrowserWithProgressBarOutOfProcess(title: string, browserDirectory: string, url: string, executablePath: string | undefined, socketTimeout: number): Promise<{ error: Error | null }> {
const cp = childProcess.fork(path.join(__dirname, 'oopDownloadBrowserMain.js'));
const promise = new ManualPromise<{ error: Error | null }>();
const progress = getDownloadProgress();
Expand All @@ -101,12 +94,11 @@ function downloadBrowserWithProgressBarOutOfProcess(title: string, browserDirect

debugLogger.log('install', `running download:`);
debugLogger.log('install', `-- from url: ${url}`);
debugLogger.log('install', `-- to location: ${zipPath}`);
debugLogger.log('install', `-- to location: ${browserDirectory}`);
const downloadParams: DownloadParams = {
title,
browserDirectory,
url,
zipPath,
executablePath,
socketTimeout,
userAgent: getUserAgent(),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,6 @@ export type DownloadParams = {
title: string;
browserDirectory: string;
url: string;
zipPath: string;
executablePath: string | undefined;
socketTimeout: number;
userAgent: string;
Expand All @@ -43,11 +42,11 @@ function browserDirectoryToMarkerFilePath(browserDirectory: string): string {
return path.join(browserDirectory, 'INSTALLATION_COMPLETE');
}

function downloadFile(options: DownloadParams): Promise<void> {
function downloadFile(options: DownloadParams): Promise<Buffer> {
let downloadedBytes = 0;
let totalBytes = 0;

const promise = new ManualPromise<void>();
const promise = new ManualPromise<Buffer>();

httpRequest({
url: options.url,
Expand All @@ -73,21 +72,22 @@ function downloadFile(options: DownloadParams): Promise<void> {
}
totalBytes = parseInt(response.headers['content-length'] || '0', 10);
log(`-- total bytes: ${totalBytes}`);
const file = fs.createWriteStream(options.zipPath);
file.on('finish', () => {
const chunks: Buffer[] = [];
response.on('end', () => {
if (downloadedBytes !== totalBytes) {
log(`-- download failed, size mismatch: ${downloadedBytes} != ${totalBytes}`);
promise.reject(new Error(`Download failed: size mismatch, file size: ${downloadedBytes}, expected size: ${totalBytes} URL: ${options.url}`));
} else {
log(`-- download complete, size: ${downloadedBytes}`);
promise.resolve();
promise.resolve(Buffer.concat(chunks));
}
});
file.on('error', error => promise.reject(error));
response.pipe(file);
response.on('data', onData);
response.on('data', chunk => {
chunks.push(chunk);
downloadedBytes += chunk.length;
progress(downloadedBytes, totalBytes);
});
response.on('error', (error: any) => {
file.close();
if (error?.code === 'ECONNRESET') {
log(`-- download failed, server closed connection`);
promise.reject(new Error(`Download failed: server closed connection. URL: ${options.url}`));
Expand All @@ -98,18 +98,13 @@ function downloadFile(options: DownloadParams): Promise<void> {
});
}, (error: any) => promise.reject(error));
return promise;

function onData(chunk: string) {
downloadedBytes += chunk.length;
progress(downloadedBytes, totalBytes);
}
}

async function main(options: DownloadParams) {
await downloadFile(options);
const zipFile = await downloadFile(options);
log(`SUCCESS downloading ${options.title}`);
log(`extracting archive`);
await extract(options.zipPath, { dir: options.browserDirectory });
await extract(zipFile, { dir: options.browserDirectory });
if (options.executablePath) {
log(`fixing permissions at ${options.executablePath}`);
await fs.promises.chmod(options.executablePath, 0o755);
Expand Down
2 changes: 1 addition & 1 deletion tests/library/browsercontext-har.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -283,7 +283,7 @@ it('should round-trip extracted har.zip', async ({ contextFactory, server }, tes
await context1.close();

const harDir = testInfo.outputPath('hardir');
await extractZip(harPath, { dir: harDir });
await extractZip(await fs.promises.readFile(harPath), { dir: harDir });

const context2 = await contextFactory();
await context2.routeFromHAR(path.join(harDir, 'har.har'));
Expand Down
2 changes: 1 addition & 1 deletion tests/playwright-test/reporter-blob.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -1438,7 +1438,7 @@ test('blob report should include version', async ({ runInlineTest }) => {
});

async function extractReport(reportZipFile: string, unzippedReportDir: string): Promise<any[]> {
await extractZip(reportZipFile, { dir: unzippedReportDir });
await extractZip(await fs.promises.readFile(reportZipFile), { dir: unzippedReportDir });
const reportFile = path.join(unzippedReportDir, 'report.jsonl');
const data = await fs.promises.readFile(reportFile, 'utf8');
const events = data.split('\n').filter(Boolean).map(line => JSON.parse(line));
Expand Down
Loading