Opened 8 years ago

Closed 5 years ago

Last modified 5 years ago

#2833 closed enhancement (fixed)

add support for PDF export

Reported by: godiard
Owned by: sascha_silbe
Priority: Normal
Milestone: Unspecified
Component: Browse
Version: Git as of bugdate
Severity: Unspecified
Keywords:
Cc: sascha_silbe, manuq
Distribution/OS:
Bug Status: Resolved

Description

Paola Bruccoleri from Uruguay has reported a solution using a web site:

...I used: http://www.web2pdfconvert.com/

then, in the file webtoolbar.py, I added:

the button:
       self._savepdf = ToolButton('filesave')
       self._savepdf.set_tooltip(_('Save PDF'))
       self._savepdf.connect('clicked', self._savepdf_clicked)
       self.insert(self._savepdf, -1)
       self._savepdf.show()

the function:
   def _savepdf_clicked(self, button):
       uri = self._browser.web_navigation.currentURI
       # dir = 'http://www.web2pdfconvert.com/?u='+uri  (overwritten by the next line, so unused)
       dir = 'http://html-pdf-converter.com/en/convert?u='+uri
       self._browser.load_uri(dir)
       self._browser.grab_focus()
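
One detail in the snippet above is that the page URI is appended to the query string unescaped, so an address containing &, ? or # could be cut short or misread by the conversion site. A minimal sketch of escaping it first, assuming the service really accepts the address in the u parameter as the snippet implies; build_convert_url is a hypothetical helper, not part of Browse:

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2, as used by Browse at the time


def build_convert_url(page_uri, service='http://www.web2pdfconvert.com/?u='):
    """Return the converter URL with page_uri escaped as a query value."""
    # safe='' also escapes '/' and ':' so the whole address stays one value.
    return service + quote(page_uri, safe='')


print(build_convert_url('http://example.org/a page?x=1&y=2'))
```

This does not address the objections raised in the comments about depending on an external site; it only makes the request well-formed.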

Change History (21)

comment:1 Changed 8 years ago by greenfeld

I would not recommend using a solution which depends on a web site; we will never know if we can access it or not, if the site may go down, start sticking advertisements in files, etc.

If we integrate such functionality using an Internet system, we also probably should get permission from the site's owner.

Firefox in Linux can save natively to a PDF file. Can we access that from libxulrunner?

comment:2 Changed 8 years ago by godiard

I agree. I think the right solution is to use Gecko to print to a PDF file. I have added the teacher's code only for reference.

comment:3 Changed 8 years ago by godiard

This information may be useful:

http://www.geckofx.org/viewtopic.php?id=980

comment:4 Changed 8 years ago by sascha_silbe

  • Bug Status changed from Unconfirmed to New
  • Distribution/OS Unspecified deleted
  • Summary changed from Browse must be save a page to PDF for offline reading. to add support for PDF export
  • Version changed from Unspecified to Git as of bugdate

For reading web pages offline there are other, potentially better solutions (e.g. Lucian's Webified work, wwwoffle). Nevertheless I took a stab at implementing PDF support.

This is the code I added to browser.Browser:

    def export_pdf(self):
        # Assumes os, tempfile, activity, components and interfaces are
        # already imported at module level in browser.py.
        cls = components.classes['@mozilla.org/gfx/printsettings-service;1']
        setting_service = cls.getService(interfaces.nsIPrintSettingsService)
        req = self.get_dom_window().QueryInterface(interfaces.nsIInterfaceRequestor)
        print_iface = req.getInterface(interfaces.nsIWebBrowserPrint)

        temp_dir = os.path.join(activity.get_activity_root(), 'instance')
        pdf_file = tempfile.NamedTemporaryFile(suffix='.pdf', dir=temp_dir,
                                               delete=False)

        settings = setting_service.newPrintSettings
        settings.printSilent = True
        settings.printToFile = True
        settings.toFileName = pdf_file.name
        settings.outputFormat = interfaces.nsIPrintSettings.kOutputFormatPDF

        # print_iface.print(settings, None) is not valid Python 2 syntax,
        # so the method is looked up by name instead.
        getattr(print_iface, 'print')(settings, None)

It fails with the following error:

Traceback (most recent call last):
  File "/home/sascha.silbe/sugar-jhbuild/install/share/sugar/activities/Browse.activity/webtoolbar.py", line 494, in _export_pdf_cb
    browser.export_pdf()
  File "/home/sascha.silbe/sugar-jhbuild/install/share/sugar/activities/Browse.activity/browser.py", line 320, in export_pdf
    logging.debug('print: %r', getattr(print_iface, 'print'))
  File "/usr/lib/pymodules/python2.6/xpcom/client/__init__.py", line 374, in __getattr__
    return getattr(interface, attr)
  File "/usr/lib/pymodules/python2.6/xpcom/client/__init__.py", line 466, in __getattr__
    unbound_method = BuildMethod(method_info, self._iid_)
  File "/usr/lib/pymodules/python2.6/xpcom/client/__init__.py", line 125, in BuildMethod
    codeObject = compile(method_code, "<XPCOMObject method '%s'>" % (name,), "exec")
  File "<XPCOMObject method 'print'>", line 2
    def print(self, Param1, Param2):
            ^
SyntaxError: invalid syntax

The problem is that print is a reserved keyword in Python 2. XPCOM would need to rename print to something else, e.g. print_. I'm not sure how feasible it is to get that fix rolled out to the distributions that still ship a python-xpcom we can use in the first place (Ubuntu doesn't ship it at all and recent versions of Fedora are broken, too).
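
The failure can be reproduced without XPCOM, because it comes from compiling generated source whose function name is a keyword. The sketch below is a stand-in, not python-xpcom's actual code: build_method_source, safe_name and compiles are hypothetical helpers, and the keyword 'class' is used in place of 'print' so the example behaves the same on Python 2 and Python 3 (print is only a keyword on Python 2). It shows the kind of renaming suggested above, appending an underscore:

```python
import keyword


def safe_name(name):
    """Append '_' to names that are reserved keywords."""
    return name + '_' if keyword.iskeyword(name) else name


def build_method_source(name):
    """Generate source for a trivial method, as a binding generator might."""
    return 'def %s(self, Param1, Param2):\n    return (Param1, Param2)\n' % name


def compiles(source):
    """Return True if the generated source is valid Python."""
    try:
        compile(source, '<generated>', 'exec')
        return True
    except SyntaxError:
        return False


print(compiles(build_method_source('class')))             # keyword as a name
print(compiles(build_method_source(safe_name('class'))))  # renamed to class_
```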

comment:5 follow-ups: ↓ 6 ↓ 7 Changed 8 years ago by lucian

The way I see it, converting to PDF isn't ideal: 1) it's a lossy conversion and 2) it'll be opened by a different activity.

As Sascha says, in Webified there's a proof of concept for saving complete web pages (just like Firefox), putting them in a zip and later opening them with Browse:
http://git.sugarlabs.org/browse/webified/commit/25450e1f18701402eb3bdba3a7c3297f61ac752a
http://git.sugarlabs.org/browse/webified/commit/7005fb13a31f26704ec561d8e71bb4f82274fb3c

I don't have time for it now, but it shouldn't be hard to port it to Browse mainline. The main problem with the patch(es) as they are now is that they don't clean up after themselves correctly.
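
The save-as-zip idea and its cleanup problem can be sketched with the standard library alone. This is not the Webified code: unpack_saved_page and discard_saved_page are hypothetical names, the archive is assumed to contain an index.html at its top level, and the sketch targets current Python 3 rather than the Python 2 that Browse used at the time. The point is that whoever unpacks the archive keeps the temporary directory's path so it can be removed when the page is closed.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def unpack_saved_page(zip_path, index_name='index.html'):
    """Extract a zipped page and return (temp_dir, file URI of its index)."""
    temp_dir = tempfile.mkdtemp(prefix='saved-page-')
    try:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(temp_dir)
        index = Path(temp_dir) / index_name
        if not index.is_file():
            raise ValueError('archive has no %s' % index_name)
    except Exception:
        shutil.rmtree(temp_dir, ignore_errors=True)  # never leak on failure
        raise
    return temp_dir, index.as_uri()


def discard_saved_page(temp_dir):
    """Remove the extracted files once the page is no longer displayed."""
    shutil.rmtree(temp_dir, ignore_errors=True)
```

A caller would load the returned URI in the browser view and call discard_saved_page() from its close handler.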

comment:6 in reply to: ↑ 5 Changed 8 years ago by garycmartin

Replying to lucian:

...support for zipped up web pages would make a _very_ useful way to quickly provide/distribute browser compatible content (and keep it in a format that could be edited and re-shared).

comment:7 in reply to: ↑ 5 ; follow-up: ↓ 8 Changed 8 years ago by godiard

Replying to lucian:

Yeah, you are probably right, and I think Firefox can already open zipped media (jar support).

comment:8 in reply to: ↑ 7 ; follow-up: ↓ 9 Changed 8 years ago by lucian

Replying to godiard:

Yeah, probably you are right, and I think firefox already can open zipped media (jar support)

At first I tried jar:, but it turns out it doesn't work as expected. Mozilla didn't intend it to be used for anything other than extensions, and because of certain security features jar: URIs aren't usable for this. Unzipping the files in Browse is a better choice, and it's not a big issue.

Replying to sascha_silbe:

The problem is that print is a reserved keyword in Python 2. XPCOM would need to rename print to something else, e.g. print_. I'm not sure how feasible it is to get that fix rolled out to the distributions that still ship a python-xpcom we can use in the first place (Ubuntu doesn't ship it at all and recent versions of Fedora are broken, too).

For actually printing PDFs, a workaround like setattr(self, 'print', print_) should work.

comment:9 in reply to: ↑ 8 Changed 8 years ago by sascha_silbe

  • Cc sascha_silbe added

Replying to sascha_silbe:

The problem is that print is a reserved keyword in Python 2. XPCOM would need to rename print to something else, e.g. print_. I'm not sure how feasible it is to get that fix rolled out to the distributions that still ship a python-xpcom we can use in the first place (Ubuntu doesn't ship it at all and recent versions of Fedora are broken, too).

For actually printing PDFs, a workaround like setattr(self, 'print', print_) should work.

As expected, this still fails in getattr():

        setattr(print_iface, 'print_', getattr(print_iface, 'print'))
Traceback (most recent call last):
  File "/home/sascha.silbe/sugar-jhbuild/install/share/sugar/activities/Browse.activity/webtoolbar.py", line 506, in _export_pdf_cb
    browser.export_pdf()
  File "/home/sascha.silbe/sugar-jhbuild/install/share/sugar/activities/Browse.activity/browser.py", line 323, in export_pdf
    logging.debug('print: %r', getattr(print_iface, 'print'))
  File "/usr/lib/pymodules/python2.6/xpcom/client/__init__.py", line 374, in __getattr__
    return getattr(interface, attr)
  File "/usr/lib/pymodules/python2.6/xpcom/client/__init__.py", line 466, in __getattr__
    unbound_method = BuildMethod(method_info, self._iid_)
  File "/usr/lib/pymodules/python2.6/xpcom/client/__init__.py", line 125, in BuildMethod
    codeObject = compile(method_code, "<XPCOMObject method '%s'>" % (name,), "exec")
  File "<XPCOMObject method 'print'>", line 2
    def print(self, Param1, Param2):
            ^
SyntaxError: invalid syntax

The problem is that the code generated on-the-fly by python-xpcom uses the reserved keyword print.
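
Aliasing cannot help because the method is generated inside the attribute lookup, so the getattr() on the right-hand side raises before setattr() ever runs. A small stand-in illustrates this; LazyProxy is hypothetical and only mimics the lazy code generation visible in the traceback, using 'class' as the keyword so that it fails on any Python version (print is only reserved on Python 2):

```python
class LazyProxy(object):
    """Builds each method from generated source on first attribute access."""

    def __getattr__(self, name):
        source = 'def %s(self, *args):\n    return args\n' % name
        namespace = {}
        # Raises SyntaxError here if name is a reserved keyword.
        exec(compile(source, '<generated %s>' % name, 'exec'), namespace)
        method = namespace[name].__get__(self, type(self))
        self.__dict__[name] = method  # cache so __getattr__ is not hit again
        return method


proxy = LazyProxy()
print(proxy.load('a', 'b'))  # ordinary names work

try:
    setattr(proxy, 'class_', getattr(proxy, 'class'))
    aliased = True
except SyntaxError:
    aliased = False  # the lookup failed before the alias could be created
print(aliased)
```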

comment:10 Changed 8 years ago by sascha_silbe

  • Owner changed from lucian to sascha_silbe
  • Status changed from new to accepted

I have reported the syntax error issue at Debian as Debian#628484 and found a hack to work around it.

comment:11 Changed 7 years ago by manuq

  • Cc manuq added

This is doable in the current Browse, which uses WebKit.

comment:12 Changed 7 years ago by manuq

  • Milestone changed from Unspecified by Release Team to 0.96

comment:13 Changed 7 years ago by erikos

  • Milestone changed from 0.96 to 0.98

Not a regression, moving out at this point of the cycle.

comment:14 Changed 7 years ago by humitos

  • Priority changed from Unspecified by Maintainer to Normal

comment:15 Changed 7 years ago by manuq

This can be a good reference:

https://raw.github.com/potyl/Webkit/master/screenshot.pl

See the usage of Gtk.OffscreenWindow with the webkit view inside, and the cairo PdfSurface.
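
In Python the same approach might look roughly like the sketch below. It is untested against Browse and rests on assumptions: GTK 3 via PyGObject, pycairo, and a widget (for example a WebKit view packed in a Gtk.OffscreenWindow) that has already been shown, sized and has finished loading. render_widget_to_pdf is a hypothetical helper name. cairo is imported inside the function so the module can still be loaded where pycairo is missing.

```python
def render_widget_to_pdf(widget, pdf_path):
    """Draw an already laid-out GTK 3 widget onto a one-page PDF."""
    import cairo  # pycairo; imported lazily, see the note above

    width = widget.get_allocated_width()
    height = widget.get_allocated_height()
    # PDFSurface sizes are in points; one pixel is mapped to one point here.
    surface = cairo.PDFSurface(pdf_path, width, height)
    try:
        context = cairo.Context(surface)
        widget.draw(context)   # Gtk.Widget.draw renders into the context
        context.show_page()
    finally:
        surface.finish()       # flushes and closes the PDF file
    return pdf_path
```

Because the whole view is drawn onto a single page, a long web page yields one tall page rather than paginated output.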

comment:16 Changed 7 years ago by manuq

  • Milestone changed from 0.98 to 1.0

Does not fit current goals, moving to 1.0.

comment:17 Changed 6 years ago by dnarvaez

  • Milestone 1.0 deleted

Milestone 1.0 deleted

comment:18 Changed 5 years ago by ignacio

  • Milestone set to Unspecified

comment:19 Changed 5 years ago by ignacio

  • Resolution set to fixed
  • Status changed from accepted to closed

comment:20 Changed 5 years ago by ignacio

  • Bug Status changed from New to Resolved

comment:21 Changed 5 years ago by ignacio

  • Bug Status changed from New to Resolved