Accessible PDF – Revisited

September 17th, 2009 jeb

NOTE: 9/29/08 – I have added a few more resources about accessible PDFs at the end of this blog entry. If you find (or know of) others, I will add them as well. Thanks to everyone who has commented.

I attended the Adobe Acrobat Users webinar a few weeks back and was pleased to find that both what I have been doing, and what I have been urging others to do, are the proper course of action.

This webinar did indeed introduce me to some of the subtler features of Adobe Acrobat Professional that I was not aware of (although I am not sure if they are all part of the older version of Acrobat Professional that I own). But the dominant message – one that came across loud and clear – was the fervent appeal to create documents that are accessible BEFORE converting them to PDF (Portable Document Format).

In nearly all situations, an author considering the use of a PDF file will have created the original document in some other application. The exception might be a PDF “form”, which one might create with Adobe LiveCycle Designer (not exactly Acrobat, but it comes with the Acrobat Pro package). Since MS-Word is the dominant player in this area, it is most likely that the PDF conversion will be from a Word document, but authors may be using MS Publisher, Adobe InDesign or any other document-producing software.

I’ve already written about how to make accessible Word documents and other types of documents, so I won’t repeat that information here. But I should note that there is a new White Paper from Adobe on Creating Accessible PDF with Adobe InDesign CS4 [PDF] that was just published.

The good news is that accessibly-designed document files will generally convert into accessible PDFs with almost no effort. But, the key here is that the original document has to be accessible first. And in most instances, the original document can very easily be made accessible by following some basic rules. Those rules fit into a nice acronym – H.I.T. The “H” stands for Headings, the “I” stands for Images, and the “T” stands for Tables. This is not to say that there aren’t other accessibility issues to be concerned about, but if the author attends to these three, they will be addressing the ones that often cause the most problems with users of Assistive Technology (AT).

Headings

Contrary to popular belief, the purpose of Headings in a document is not to make the font larger and more distinctive; the real reason is to create a semantic framework for understanding the relationship between and among the sections of content. The use of this semantic layout is essential for persons using AT.

When a person with a visual impairment, using a screen reader or some other AT device, reads a document, they most typically use the Headings to scan the document in exactly the same way a sighted person would scan it visually. Printers and typographers learned long ago that by changing the size, shape and spacing of the font, the reader can more easily semantically understand the organization of the document. The person with a visual impairment uses the hierarchical order of the Headings to semantically understand the document. Using a feature built into their screen reader, the user will simply jump from Heading to Heading to peruse the document. The hierarchical order of the headings cues the reader of the importance of the heading and the content that follows.

If you think of a typical textbook, the document starts with a title page that includes the name of the book and other identifying information (the name of the author, publisher, etc.). The most important information on that page is the title itself. For this reason, the title should always be Heading #1 and all other headings below this should be numbered Heading 2, 3, 4 and so on. While some will argue with me on this point, my general recommendation is to have only one Heading 1 in each document. My logic is that documents have only one Title.

In the typical textbook, there are usually a number of chapters and sub-chapters or sections. In our example, each of the chapter numbers and names would use Heading #2. Sub-chapters would then be styled with Heading #3 and sections within each would be Heading 4, 5, 6 as needed (Note: it is rare to see more than three or four sub-headings in most documents). This is illustrated in Figure 1.

Figure 1

Note that different applications may call Headings by different names, but they all operate the same way. In Microsoft Word 2007, Headings can be found in the Styles section of the Ribbon. In Apple’s iWork Pages, the Heading elements are found in the Styles Drawer. And in Open Office Writer, Headings can be found in the Styles drop-down bar.
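The heading recommendations above (a single Heading 1, and no skipped levels on the way down) lend themselves to a simple automated check. What follows is only a minimal sketch in Python: the function name `check_heading_outline` is hypothetical, and it assumes you have already extracted the heading levels from your document, in reading order, by some other means.

```python
def check_heading_outline(levels):
    """Return a list of problems found in a sequence of heading levels.

    `levels` holds the document's heading levels in reading order,
    for example [1, 2, 3, 2] for a title, a chapter, a sub-chapter,
    and a second chapter. (Hypothetical helper, for illustration only.)
    """
    problems = []

    # The article recommends exactly one Heading 1: the document title.
    if levels.count(1) != 1:
        problems.append(
            f"expected exactly one Heading 1, found {levels.count(1)}"
        )
    if levels and levels[0] != 1:
        problems.append("document does not start with Heading 1")

    previous = None
    for position, level in enumerate(levels, start=1):
        # Going deeper by more than one level skips a rung of the hierarchy.
        if previous is not None and level > previous + 1:
            problems.append(
                f"heading {position} jumps from level {previous} to level {level}"
            )
        previous = level
    return problems


# Textbook-style outline: title, chapters, sub-chapters, one section.
textbook = [1, 2, 3, 3, 4, 2, 3]
print(check_heading_outline(textbook))

# A poorly structured outline: no title first, a skipped level, two Heading 1s.
print(check_heading_outline([2, 4, 1, 1]))
```

A check like this only confirms the shape of the outline. It cannot tell you whether the headings actually describe the content that follows them.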

Images

Images, whether they are on a web page, or in a word processed document, can present difficulties to many people using AT. Screen reading software, when encountering an image in a document, will announce the discovery by stating the word “image” followed by the alternative (or ALT) description provided by the author. Without the ALT description, the screen reader simply announces “image” leaving the user to guess what this means. This can be particularly problematic when the image in question is graphic text, that is, text embedded into an image such as in a logo. Even worse is when this image contains a hyperlink to some other resource. In these cases, without an ALT description, the screen reader user has to go to that new link to find out (or try to find out) what that resource is. It all makes for a rather confusing experience.

When creating web pages in HTML, the author is required to provide an ALT description for each image. But the author also has the option of using the “null” attribute – that is, ALT="" – which tells the screen reader to simply skip over the image completely. When creating other documents, whether they be word processed or PDFs, there is an option for adding descriptive text to the image. Unfortunately, there is no capacity to make this a “null ALT”, so all images must have a description.
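For HTML pages, the three cases just described (an image with a description, an image with a null ALT, and an image with no ALT at all) can be told apart mechanically. Here is a rough sketch using Python's standard `html.parser` module; the class name `ImageAltAudit` and the sample file names are made up for illustration, and this is not a substitute for a real accessibility checker.

```python
from html.parser import HTMLParser


class ImageAltAudit(HTMLParser):
    """Sort <img> tags into described, decorative (alt=""), and missing alt."""

    def __init__(self):
        super().__init__()
        self.described = []   # images with a non-empty alt description
        self.decorative = []  # images with a null alt, skipped by screen readers
        self.missing = []     # images with no alt attribute at all

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        source = attributes.get("src", "(no src)")
        if "alt" not in attributes:
            self.missing.append(source)
        elif not (attributes["alt"] or "").strip():
            # alt="" (or a bare alt with no value) marks the image as decorative.
            self.decorative.append(source)
        else:
            self.described.append(source)


# Hypothetical sample markup covering the three cases.
sample = """
<img src="logo.png" alt="Acme Widgets logo">
<img src="spacer.gif" alt="">
<img src="chart.png">
"""

audit = ImageAltAudit()
audit.feed(sample)
print("described: ", audit.described)
print("decorative:", audit.decorative)
print("missing:   ", audit.missing)
```

Only the "missing" list points at a definite problem. Whether a described image has a good description, or whether a decorative image really is decorative, still takes a human reader to judge.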

As I have discussed in previous articles, most images in documents are simply “pretty pictures” designed to “catch one’s eye” and to make the overall document more visually appealing. They may be used as placeholders, to fill in white space, or to simply attenuate the topic of the writing. But in most cases, they add nothing to the understanding of the document. So choosing an ALT description for a PDF document can present some challenges. The general consensus among the designers I know is to try to keep ALT descriptions short and to the point. Here is a more thorough discussion on how to write good ALT descriptions.

Tables

Finally, tables, or tabled data, in a document can present challenges to users of AT if the tables are not constructed correctly. To understand a table, the reader must understand the meaning of the data in each cell, and this is typically accomplished by the use of column and/or row headings. Most tables use the top row of the columns for this heading information, so most word processing software packages, when they create a table, will automatically assign this top row as the heading.

For example in Table 1, the first column contains the list of months; the second column the number of cars sold. A screen reader will read this as: Month, car sales, Jan, 67, Feb, 56, etc. In other words, the screen reader will read each cell starting in the upper left corner and read across the page to the right and then down to the next row.

In a large table with many rows and columns, a person using a screen reader could easily become lost in the data not knowing what row or column they are on. By the use of the “Table Mode” and special commands commonly found in most screen reader software, users are able to navigate around the table in various ways (e.g., reading columns or rows separately). But if the layout of the Table is not correct, the screen reader user can easily get lost in a sea of numbers and disconnected data.

Table 1.

Month    Car Sales
Jan.     67
Feb.     56
Mar.     34
Apr.     67
May      86
Jun.     56
Jul.     44
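
To make the reading order described for Table 1 concrete, here is a small sketch in Python. It is a simplified model only, not how any particular screen reader works internally: `linear_reading` mimics the cell-by-cell, left-to-right pass, while `reading_with_headers` shows what becomes possible when the top row is properly marked as the heading row. Both function names are hypothetical.

```python
# Table 1 from this article: a heading row plus one row per month.
header = ["Month", "Car Sales"]
rows = [
    ("Jan.", 67), ("Feb.", 56), ("Mar.", 34), ("Apr.", 67),
    ("May", 86), ("Jun.", 56), ("Jul.", 44),
]


def linear_reading(header, rows):
    """Read every cell left to right, top to bottom, with no context."""
    cells = list(header)
    for row in rows:
        cells.extend(str(cell) for cell in row)
    return ", ".join(cells)


def reading_with_headers(header, rows):
    """Pair each cell with its column heading, one line per table row."""
    return [
        ", ".join(f"{name}: {cell}" for name, cell in zip(header, row))
        for row in rows
    ]


print(linear_reading(header, rows))
for line in reading_with_headers(header, rows):
    print(line)
```

With two columns the linear reading is easy to follow, but the second form is the one that stays understandable as a table grows, which is why the heading row matters.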

Therefore, tabled information in documents should generally be kept as simple as possible, and the author must ensure that tables are laid out in such a way as to make the information understandable to all users. If a large, complex table is required, it is best practice to publish it on a separate page in the document (or on a separate webpage if it is an HTML document). Ideally, complex data tables should be kept in a spreadsheet application (e.g., MS Excel) and sent along as a separate document.

Converting Documents

Converting documents into PDF format can be done by any number of conversion solutions. Perhaps the most robust converter is the Adobe Acrobat PDFMaker, a plug-in that comes with the Adobe Acrobat Professional suite. However, I have discovered that when using Microsoft Office 2007, the Office Add-in: Microsoft Save as PDF does a much better job of converting Office files with fewer errors and faster results.

If you are using Open Office 3.1, the application has a built-in “save as PDF” feature. However, my experiments with this feature showed mixed results with most converted PDF documents failing to pass the accessibility test.

Note: As of this writing, I have only been able to test Apple iWork08. Regretfully, documents made by this application cannot be made accessible. I have ordered iWork09 and will report on those results on my blog as soon as possible.

Testing Documents

Before making any PDF document available to the public, it should always be tested thoroughly for accessibility using Adobe Acrobat Professional. Apart from actually testing the document with a screen reader like JAWS, Acrobat Professional is the only application I am aware of that tests PDFs for accessibility. Not only will the Acrobat Professional accessibility checker test the document, it will provide detailed instructions on how to remedy any errors that are reported. For details on using this feature in Acrobat Professional, please visit the Adobe website or the Acrobat Users website.

Resources

Previous article about Accessible Documents

Accessibility resources from Adobe

Acrobat Users website

Here are some more web-based articles about accessible PDFs:

  1. July 23rd, 2009 at 08:34 | #1

    It ought to be noted that the only means of converting a Microsoft Word document to a truly accessible PDF is with Microsoft Word & PDFMaker on a PC. On any other platform with other applications, the accessibility has to be built into the PDF ex post facto. I could be wrong, and if so, let me know.

    By the way, thanks for writing about PDF accessibility. Up until this year, most of what one could find has been folklore-like material.

  2. July 23rd, 2009 at 08:35 | #2

    The Adobe Acrobat webinars are great learning experiences. I attended the webinar last month, and picked up some new info on accessible PDFs. I only wish the webinar had gone into more detail on complex tables.

    At the college I work at, we are slowly moving forward in re-publishing more accessible and usable PDFs.

    I’m curious about your comment that the “Office Add-in: Microsoft Save as PDF” does a better job of converting documents than the Acrobat PDFmaker plugin; can you provide more specific details?

    The Maine CITE page is a great compilation of information for creating accessible documents. I had not seen it before, and have bookmarked the page. If I might offer another resource, check out the December 2008 post I wrote, “Ten Tips for Creating Usable and Accessible PDFs” at http://refresh-detroit.org/2008/12/02/ten-tips-for-creating-usable-and-accessible-pdfs/

  3. July 23rd, 2009 at 09:33 | #3

    Thanks for this article. The section on headings is very interesting. Just yesterday I tried to explain to some colleagues why headings are important when ensuring accessibility.

    I agree with William about the availability of articles about PDF accessibility.

    I use the Office Add-in to convert files to PDF, but I always heavily tweak the tags.

    Have you had a chance to test either PAW or CommonLook from NetCentric? I have and found it to be easier to deal with tags myself than use those products, however I understand that folks in my company feel otherwise.

  4. July 23rd, 2009 at 09:51 | #4

    It is not true that a PDF originating from Word is automatically accessible, even with tags and an optimal Word document. If you try to read this PDF with JAWS, pressing Insert+F6 (to list the headings) does nothing… JAWS announces no headings on this page. It is necessary to work on the role map of the tag structure.

  5. G F Mueden
    July 23rd, 2009 at 17:00 | #5

    Oh dear. None of the things mentioned speak to why I hate PDF files. I need to cut the columns short and enlarge them to about half screen width. I have never been able to do it with PDF files, but I can do it with HTML. The other reason is that I can’t convert a three- or two-column format to a single column and scroll down without the see-saw down-up-and-over-and-down-again needed to read a multicolumned report. PDF is great for hard print, but for the web, give me HTML any day.
    BTW, why can’t the font in this text entry box be as legible as the rest of the page? gf mueden@verizon.net ===gm===

  6. July 24th, 2009 at 08:06 | #6

    You’re using “attenuate” incorrectly.

    PDF/UA extensively considered the problem of spacer GIFs and other meaningless graphics in the Web context and determined they almost never come up in the PDF context. If they do, tag them as artifacts.

    G.F. Mueden, tagged PDFs can be reflowed into a single column at will.

  7. jeb
    July 24th, 2009 at 09:50 | #7

    Thank you for your comment. I will admit that I did not test these documents with JAWS. But I can report that when I reviewed the tags in documents made this way, they did show that the Headings made in Word converted to Headings (in the same hierarchical order) in Acrobat. Whether JAWS can accurately read these or not is a different issue.

    Perhaps I did not put enough of a point on it, but my clear intention was to motivate people to test their documents before making them public. As I am not a JAWS user, it is not easy for me to test with this application and I rely on the accessibility checker built into Acrobat Professional. I guess if you are suggesting that the results of that test are incorrect, then Adobe needs to be made aware of this and fix it.

    Thank you again for your contribution to this conversation.

  8. Thomas
    July 24th, 2009 at 09:59 | #8

    Anyone use OxygenOffice or OpenOffice? These office packages convert to pdf well enough, from what I can tell. There is also Scribus…Not sure about accessibility issues yet.

  9. jeb
    July 24th, 2009 at 10:01 | #9

    Thank you Deborah for your thoughtful comments and compliments.

    Perhaps I should have been clearer about the background to my choice – and recommendation – of the Office Add-in as opposed to Acrobat’s PDFmaker. Truth is, I am still running with version 8 of Acrobat Professional. I’ll have to get a few more contracts and build a few more websites before I can afford to upgrade…

    That said, the PDFmaker plugin for ver 8 worked fine with MS Office 2003, but crashed and burned with MS Office 2007 when I upgraded to the new Office. I’ve read the suggestions on making PDFmaker work with Office 2007, but found nothing worked for me. So I switched over to the MS plug-in, which so far seems to be running fine.

    My understanding from the Acrobatusers webinar was that the PDF maker plugin v.9 was more robust (than v.8) and would apparently check for accessibility problems within the conversion process and fix them then. The MS plug-in does not do this testing, so I test the document after the conversion has been made. I guess it doesn’t really matter when you do it, the important part is that you do test the document before distributing it to the public.

    Thanks again for your contribution to this conversation.

  10. July 26th, 2009 at 10:28 | #10

    Sure, but is a file accessible without semantics? This method produces a “technical accessibility”, verified with Adobe’s rules, and that is all.
    The result is the same with InDesign: all paragraph styles carry over very well into the PDF structure. But for screen readers, a PDF like this is the same as a txt file. Is txt accessible? Hmm, for me it is not.
    I have some writings on this problem on my little blog, http://www.biroblu.info, but only in Italian; my English is very poor, sorry :-) .
    I am also working on a big project for an Italian publisher, producing accessible electronic books. For now I have some material on a test site, test.biroblu.info/uno, for an accessible workflow with InDesign and Acrobat.
