Ideas for PDF Handling
Summary
- There are 12 posts — by 4 authors — in this topic.
- Latest post made by Michael JasonSmith at 2009 Nov 05 22:54 UTC
Below are a few ideas for PDF handling. They are not completely thought out,
but I wanted to record them for later reference ☺
*Thumbnails*
Rather than just linking to the PDF, we could provide a thumbnail
of the first page, providing a visual cue to document
retrieval. The thumbnails can be used anywhere that the image
thumbnails are used.
*Page* *Fan*
When the user clicks on an arrow to the right of the thumbnail
the image will fan-out to show the first half-dozen pages. This
will allow the user to gather more information about the document.
Maybe each page can be zoomed, so the user can quickly see if
the document is the one being sought.
*Keywords*
Like the keywords for topics, keywords could be extracted from
PDFs and reproduced as metadata in the search results. This would
add information scent to search.
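
A minimal sketch of the keyword idea, assuming the PDF has already been converted to plain text (for example by whatever conversion is used for indexing). The stop-word list, the function name and the ranking by raw frequency are all assumptions for illustration, not how GroupServer extracts topic keywords.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be longer.
STOP_WORDS = frozenset({
    "the", "and", "for", "that", "with", "this", "are", "was", "from",
})


def extract_keywords(text, limit=5):
    """Return up to `limit` of the most frequent non-stop-words in `text`."""
    words = re.findall(r"[a-z]+", text.lower())
    # Ignore very short words and stop words before counting.
    counts = Counter(w for w in words if len(w) > 2 and w not in STOP_WORDS)
    # Sort by descending frequency, then alphabetically so ties are stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


if __name__ == "__main__":
    sample = "The bicycle and the bicycle path. A path for a bicycle."
    print(extract_keywords(sample, limit=2))
```

The returned list could then be shown next to the file in the search results, in the same way as topic keywords.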
*Automatic* *Conversion*
Open Document Format and Microsoft Office documents could be
converted to PDF automatically, using OpenOffice and the CUPS PDF
printer from the command line. This would allow office documents
to have thumbnails too!
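
A rough sketch of how the conversion step might be scripted. Note that it swaps in a different technique from the one proposed above: instead of OpenOffice printing to the CUPS PDF printer, it assumes a LibreOffice `soffice` binary on the path and uses its `--headless --convert-to pdf` export. The function names and directory layout are hypothetical. The original file is left untouched; the PDF is written to a separate directory.

```python
from pathlib import Path


def build_convert_command(source, out_dir):
    """Build the argument list for a headless office-to-PDF conversion.

    Assumes a LibreOffice `soffice` binary; this is not the CUPS PDF
    printer route described in the post.
    """
    return [
        "soffice",
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(source),
    ]


def expected_pdf_path(source, out_dir):
    """Return where the converter is expected to write the PDF."""
    return Path(out_dir) / (Path(source).stem + ".pdf")


if __name__ == "__main__":
    print(build_convert_command("files/minutes.odt", "converted"))
    print(expected_pdf_path("files/minutes.odt", "converted"))
```

The command list can be handed to `subprocess.run(command, check=True)`, and the resulting PDF fed into the existing thumbnail system.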
On Thu, Nov 5, 2009 at 12:12 AM, <email obscured> wrote:
> *Thumbnails*
> Rather than just linking to the PDF, we could provide a thumbnail
> of the first page, providing a visual cue to document
> retrieval. The thumbnails can be used anywhere that the image
> thumbnails are used.

With the danger that someone right-clicks and saves the thumbnail image, rather than the thing it links to? (Probably not a big deal.)

> *Page* *Fan*
> When the user clicks on an arrow to the right of the thumbnail
> the image will fan-out to show the first half-dozen pages. This
> will allow the user to gather more information about the document.
> Maybe each page can be zoomed, so the user can quickly see if
> the document is the one being sought.

Sounds nice, but as with all these things, be careful that the preview doesn't involve transferring as much data to the client as the actual document itself.

> *Automatic* *Conversion*
> Open Document Format and Microsoft Office documents to be
> converted to PDF automatically, using OpenOffice and the CUPS PDF
> printer from the command line. This would allow office documents
> to have thumbnails too!

As long as you aren't *replacing* the original ODF or Word document, yes ... this would be very useful. But it may be 'better' to leave the original in place, and have a link to offer the PDF to the user, rather than letting them think that the original sender sent a PDF.
-jim
On Thu, 2009-11-05 at 17:53 +1300, Jim Cheetham wrote:
> On Thu, Nov 5, 2009 at 12:12 AM, <email obscured> wrote:
> > *Thumbnails*
> > Rather than just linking to the PDF, we could provide a thumbnail
> > of the first page, providing a visual cue to document
> > retrieval. The thumbnails can be used anywhere that the image
> > thumbnails are used.
>
> With the danger that someone right-clicks and saves the thumbnail
> image, rather than the thing it links to?
> (Probably not a big deal)

I'd go so far as to say it is 'extremely unlikely', given that the crossover between the people likely to do that and the people who know that you *can* right-click to save images is probably pretty small ;)

> > *Page* *Fan*
> > When the user clicks on an arrow to the right of the thumbnail
> > the image will fan-out to show the first half-dozen pages. This
> > will allow the user to gather more information about the document.
> > Maybe each page can be zoomed, so the user can quickly see if
> > the document is the one being sought.
>
> Sounds nice, but as with all these things, be careful that the preview
> doesn't involve transferring as much data to the client as the actual
> document itself

That was one of my thoughts too. Could be nice eye candy though ;)

> > *Automatic* *Conversion*
> > Open Document Format and Microsoft Office documents to be
> > converted to PDF automatically, using OpenOffice and the CUPS PDF
> > printer from the command line. This would allow office documents
> > to have thumbnails too!
>
> As long as you aren't *replacing* the original ODF or Word document,
> yes ... this would be very useful. But it may be 'better' to leave the
> original in place, and have a link to offer the PDF to the user,
> rather than letting them think that the original sender sent a PDF

Yes, good idea.

I'm not sure how much value there is in thumbnails of documents ... unless the thumbnails are quite large?
--Richard
Thanks for your advice and positive response, Jim! I am with Richard: people who know that you can right-click to save an image will also know enough not to get mixed up with the link. However, if I present the thumbnail as a background image (like the logotype on <http://onlinegroups.net/s/>) then people really cannot get mixed up ☺

I forgot to mention that my idea for the page-fan is to show half a dozen thumbnails (at 87×124px for an A4 document). Sorry for not being clear. What I would probably do is create an image that contains all of the first six pages. This will allow me to send a single 522×124px image, so fewer GET requests would be made and compression would be better. I would use a variant of CSS sprites to split the image into separate pages. I was going for eye-candy with the page-fan, and I am glad you like it, Richard ☺

Jim, I was going to keep the original Office documents after automatically converting them to PDFs. My original intention was to feed the automatically generated PDF into the thumbnail system; I had not decided to even link the generated PDF, but if you would find it useful I am sold!

In my own experience thumbnails are useful as cues to retrieval, even at small sizes (keeping in mind that I am a designer who does not necessarily reflect the majority). Below is a screenshot of part of my desktop, which has 76×100px thumbnails of PDF documents. Many stand out from the crowd, and can be recognised even at a small size. Documents created by larger organisations, like councils and government departments, often have colourful title pages, even if documents that I produce may not be colourful (such as my thesis). We could allow some zoom function on a thumbnail, maybe showing a 200×282px image, to help the user figure out if the document should be downloaded.
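
To make the sprite arithmetic concrete, here is a small sketch that works out the size of the combined image and the `background-position` offset for each page. The 87×124px page size and the six-page count come from the post; the class names, the `pages.png` file name and the helper functions are hypothetical.

```python
PAGE_WIDTH = 87    # px, thumbnail width for an A4 page (from the post)
PAGE_HEIGHT = 124  # px, thumbnail height (from the post)
PAGE_COUNT = 6     # first half-dozen pages


def sprite_size(page_width=PAGE_WIDTH, page_height=PAGE_HEIGHT, pages=PAGE_COUNT):
    """Return (width, height) of one image holding all pages side by side."""
    return (page_width * pages, page_height)


def page_offsets(page_width=PAGE_WIDTH, pages=PAGE_COUNT):
    """Return the CSS background-position value for each page in the sprite."""
    # Shifting the background left by i page-widths exposes page i.
    return [f"{-i * page_width}px 0" for i in range(pages)]


def css_rules(sprite_url="pages.png"):
    """Return one CSS rule per page; the class names are made up."""
    return [
        f".pdf-page-{number} {{ width: {PAGE_WIDTH}px; height: {PAGE_HEIGHT}px; "
        f"background: url({sprite_url}) {position} no-repeat; }}"
        for number, position in enumerate(page_offsets(), start=1)
    ]


if __name__ == "__main__":
    print(sprite_size())
    print("\n".join(css_rules()))
```

Six pages at 87px wide give the single 522×124px image mentioned above, and each page element shows its own 87px slice of that one download.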
The following file was added to this topic:
> Below is a screenshot of part of my
> desktop, which has 76×100px thumbnails of PDF documents. Many stand out
> from the crowd, and can be recognised even at a small size. Documents
> created by larger organisations, like councils and government
> departments, often have colourful title pages

The one I most want to read has the dullest thumbnail: 'we-had-to-hang-the-bicycle.pdf'. But my interests do not necessarily reflect the majority :).
On Fri, Nov 6, 2009 at 12:14 AM, Michael JasonSmith <email obscured> wrote:
> Jim, I was going to keep the original Office documents after
> automatically converting them to PDFs. My original intention was to feed
> the automatically generated PDF into the thumbnail system; I had not
> decided to even link the generated PDF, but if you would find it useful
> I am sold!

Well, I was just thinking that if you're going to the CPU effort to create a PDF when the document comes in, you might as well offer the final version for people that don't have the original format's software installed.

However, if all you're doing is presenting thumbnails, why PDF? Why not a simple PNG image? There will probably be better client support for that, in general ... and by talking about them as if they are PDFs, you're bringing in an expectation. On your desktop, those thumbnails aren't PDFs, they're PNGs with a link to a PDF original (actually on OS X I think all the screen is PDF or at least PS ... which is interesting).
-jim
On Fri, 2009-11-06 at 10:26 +1300, Jim Cheetham wrote:
> However, if all you're doing is presenting thumbnails, why PDF? Why
> not a simple PNG image? There will probably be better client support
> for that, in general ... and by talking about them as if they are
> PDFs, you're bringing in an expectation. On your desktop, those
> thumbnails aren't PDFs, they're PNGs with a link to a PDF original
> (actually on OS X I think all the screen is PDF or at least PS ...
> which is interesting)

I think what Mike meant was that we have the technology to do thumbnails on PDFs, but not on Office docs, and we do have the technology to translate Office into PDF. Unless I misunderstood.
--Richard
Dan, “We had to Hang the Bicycle” is a three-page article on bicycles used for creating terror; I have no idea where it is from. I have placed the article below, along with the first page rendered as a PNG. The image will give some idea of what a thumbnail would look like on GroupServer ☺
The following files were added to this topic:
Richard, you did not misunderstand me; I communicated poorly. Sorry, Jim.
On Fri, Nov 6, 2009 at 10:54 AM, <email obscured> wrote:
> Richard, you did not misunderstand me; I communicated poorly. Sorry, Jim.

No, I think we're all talking about the same thing. If an ODF comes in, you cannot generate thumbnails directly; you need to create a PDF of it, and to generate thumbnails of that.

This meant two things to me. One was the possibility that you could make that PDF available for people that cannot read ODF natively (hence the extra link, and the concern that the UI might make readers think the original writer had made the PDF). The second was terminology-related; you were talking about the PDFs being thumbnails, which I had probably misinterpreted :-)

I think it's a great idea overall, and helps to remove the problems caused when one member of a group "upgrades" to the latest MS Office version and starts distributing files that the other members cannot read. I'm not a fan of PDF being any form of long-term storage option, but for these purposes it sounds very handy.
-jim
On Fri, 2009-11-06 at 11:30 +1300, Jim Cheetham wrote:
> I think it's a great idea overall, and helps to remove the problems
> caused when one member of a group "upgrades" to the latest MS Office
> version and starts distributing files that the other members cannot
> read. I'm not a fan of PDF being any form of long-term storage option,
> but for these purposes it sounds very handy.

At the expense of me offering to triple our storage requirements, we could convert to HTML at the same time, for a really 'quick read'. We effectively have to do a plain text conversion for indexing anyway.
--Richard
I mused about converting the PDF documents to HTML. However, I have never been happy with the PDF documents that Google has automatically converted to HTML: the resulting HTML does not read like either a Web page or a PDF. I suspect that Google originally converted the PDFs to HTML in order to index them, for much the same reason as you propose converting them, Richard ☺
