Future exception was never retrieved when closing page waiting for download event #823

@tcrs

I'm using playwright to download PDFs of URLs from RSS feeds. Some of the URLs link directly to PDFs (mixed with links to "normal" web pages), and I'd like to handle those by downloading the PDF itself. I have an implementation that works, and I've included a minimal(ish) version below. It accepts a URL and a filename to write the result to. You can try, for example (where script.py contains the code below):

Convert a web page to PDF: python3 script.py https://arxiv.org/abs/1912.11035 a.pdf
Download a PDF: python3 script.py https://arxiv.org/pdf/1912.11035 b.pdf

The first example (converting a webpage to a PDF) outputs this:

goto success: https://arxiv.org/abs/1912.11035
download exception: https://arxiv.org/abs/1912.11035: Page closed
Future exception was never retrieved
future: <Future finished exception=Error('Target page, context or browser has been closed')>
playwright._impl._api_types.Error: Target page, context or browser has been closed

I can't figure out how to stop the "Future exception was never retrieved" warning from being printed. As you can see, the "Page closed" exception has been caught by the exception handler around await download_task.
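For context, the warning text itself comes from asyncio rather than from playwright: asyncio reports it when a future that finished with an exception is garbage collected before anything has read that exception. The sketch below reproduces this with plain asyncio only. It assumes nothing about which internal future playwright leaves behind (the output above does not say), and `run_demo` is a hypothetical helper name used for illustration.

```python
import asyncio
import gc


def run_demo(retrieve: bool) -> list:
    """Return the messages asyncio reports for a failed future that is dropped.

    `run_demo` is a hypothetical helper for illustration; it does not use playwright.
    """
    messages = []
    loop = asyncio.new_event_loop()
    # Capture what asyncio would otherwise log through its default handler.
    loop.set_exception_handler(lambda _loop, ctx: messages.append(ctx["message"]))
    try:
        fut = loop.create_future()
        fut.set_exception(RuntimeError("Target page, context or browser has been closed"))
        if retrieve:
            # Reading the exception marks it as retrieved, so nothing is reported.
            fut.exception()
        # Drop the last reference; the future's finalizer runs the check.
        del fut
        gc.collect()
    finally:
        loop.close()
    return messages


if __name__ == "__main__":
    print(run_demo(retrieve=False))
    print(run_demo(retrieve=True))
```

If the same mechanism is at work here, then catching the exception from download_task would not be enough, because a different future holding the same error is never awaited.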

Am I doing something wrong? Or is this an issue in the playwright code?

import sys
import asyncio
from playwright.async_api import async_playwright

async def download(url, filename):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        context = await browser.new_context(accept_downloads=True, java_script_enabled=False)
        page = await context.new_page()

        # Initialise the flag so the final check cannot raise NameError
        # when both attempts below fail.
        success = False

        # Start waiting for a download before navigating, then navigate.
        download_task = asyncio.create_task(page.wait_for_event('download'))
        goto_task = asyncio.create_task(page.goto(url, wait_until='networkidle'))

        # Attempt 1: treat the URL as a normal web page and print it to PDF.
        try:
            await goto_task
            await page.pdf(path=filename)
            print('goto success: ' + url)
            # Closing the page makes the pending download wait fail with "Page closed".
            await page.close()
            success = True
        except Exception as e:
            print('goto exception: {}: {}'.format(url, e))

        # Attempt 2: treat the URL as a direct download and save the file.
        try:
            download = await download_task
            await download.save_as(filename)
            print('download success: ' + url)
            await page.close()
            success = True
        except Exception as e:
            print('download exception: {}: {}'.format(url, e))

        # Neither attempt worked, so the page is still open.
        if not success:
            await page.close()

if __name__ == '__main__':
    asyncio.run(download(sys.argv[1], sys.argv[2]))
