PDF - Scanned - Accessibility by Design

Priority Tasks

The Priority Tasks below are based on the emerging steps of the Electronic Accessibility Rubric. Completing the Priority Tasks will help you meet the universal design goal for scanned PDFs. Using the rubric helps to prioritize the easiest steps and those that will have the most impact.

The Advanced Tasks are necessary to make a document fully accessible, but we advise mastering the priority tasks first.

Priority Tasks

Locate Digital Originals if Possible

When possible, avoid photocopying or scanning to PDF. If you can locate higher-quality digital originals, your PDF documents stand a much better chance of being accessible.

Download PDFs directly from the publisher or research database.
Create them by converting your electronic documents directly to the PDF format.
Utilize Colorado State University’s Course Reserve service. PDFs provided through Course Reserve have recognized text.

Make Your Own Good Quality Copy

When creating a scanned PDF file, the first step is to make sure that the scan is as clear and readable as possible. Without a good-quality original, the result will still be inaccessible even if most of the text is recognizable.

Remove any handwriting, underlining, or other marks.
Check for lines that are cut off.
Check for blurry text.
Make sure all pages are in the correct orientation for reading onscreen.
Copy only one page at a time.
Check for partial pages or seams.
Watch for colored text, which can cause color-contrast problems and make text recognition inaccurate.

Here is an example of a poor-quality photocopy from a book. Some of the words near the crease of the spine are cut off, and the text includes handwritten underlining and notes.

Poor quality photocopy of two pages in a book

Assistive technology will have a difficult time interpreting this document, since it can’t guess at the missing words and the underlining could prevent it from recognizing the text accurately.

Good questions to ask yourself are:

Can I read this easily?
Do I have to guess at any of the words?
Do I have to turn my head or rotate the document to read it?
Is more than one page visible?

The following video shows the user experience when a PDF is not made from a clean copy.

Recognize Text with Adobe Cloud OCR Tool

In a web browser, sign in to Adobe Cloud. You can create a free account if necessary. The OCR tool is free to use.

Click the “Select a file” button to upload a scanned PDF.

You may be prompted to sign up for a free trial of an Adobe subscription. You can ignore this – a paid subscription is not necessary to use the OCR Tool.

Upload a scanned PDF using the "Select a file" button

Once the file is uploaded, select the document language, then choose the button to “Recognize text.”

Select the document language in the dropdown menu and then the "Recognize Text" button

The document will display in the browser window. Check that you can select text in the document. Then click the download button in the top menu.

Keep the default choices to download “this PDF” and select “Download PDF.”

Select "This PDF" and choose "Download PDF"
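
The "can you select text" check above can also be approximated outside the browser. The following is a minimal Python sketch, not part of Adobe's tooling. It assumes that a PDF with recognized text references at least one /Font resource, while an image-only scan usually does not. The function name looks_like_it_has_text is hypothetical, and the check is a heuristic: it will not work on encrypted files and does not say anything about how accurate the recognized text is.

```python
import re
import zlib

# Matches the bytes between "stream" and "endstream"
# (the lookbehind avoids starting a match inside the word "endstream").
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)


def looks_like_it_has_text(pdf_bytes: bytes) -> bool:
    """Heuristic: True if a /Font resource is referenced anywhere in the file."""
    if b"/Font" in pdf_bytes:
        return True
    # Newer PDFs may store dictionaries inside Flate-compressed streams,
    # so try to decompress each stream and look again.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example a JPEG image); skip it
        if b"/Font" in data:
            return True
    return False


# Typical use (the file name is a placeholder):
#   with open("scanned-reading.pdf", "rb") as handle:
#       print(looks_like_it_has_text(handle.read()))
```

A result of False suggests the file still needs text recognition. A result of True only means some font is present, so it is still worth selecting text in the document to confirm.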

Recognize Text with Adobe Acrobat Professional DC

Contact your IT department to request Adobe Acrobat Professional. Licensing may already exist.

Written Tutorial

Open the Scan & OCR tool in the toolbar (formerly the Enhance Scans tool).

In the toolbar that opens at the top of the page, select Recognize Text, then In This File.

Acrobat DC Recognize Text toolbar with "In this file" selected

Use the All Pages drop-down menu to choose either All Pages or a range of pages (for longer documents, it is better to process a range of pages at a time than the whole file at once), then click the Recognize Text button. Once the tool finishes running, you should be able to highlight text in the document. Save the document before closing.

Acrobat DC - All pages selected to recognize text
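
For a long document, it can help to plan the page ranges before you start. This is a small Python sketch for doing that. The function name page_ranges and the default of 25 pages per pass are assumptions chosen for illustration; the guide does not recommend a specific number.

```python
def page_ranges(total_pages: int, chunk_size: int = 25) -> list[tuple[int, int]]:
    """Split a document into (first_page, last_page) ranges, numbered from 1."""
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


print(page_ranges(60))  # [(1, 25), (26, 50), (51, 60)]
```

Each pair can be entered as the range of pages for one run of Recognize Text.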

Advanced Tasks - Using Acrobat Professional

Set the Document Title

To set the Document Title, open File, Properties. On the Description tab, type a descriptive document title in the Title field.

Next, set the document title to display instead of the file name when the document is open.

Switch to the Initial View tab in the Properties window.
Look for the Show drop-down menu under Window Options.
Change the selection from File Name to Document Title.

Properties window, Initial View tab with document title selected in the Show dropdown menu

Set the Document Language

To set the primary language of the document, open File, Properties. On the Advanced tab, find the Language drop-down menu under Reading Options. Set the language to the primary language of the document.

Properties window, Advanced tab with the language set to English
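
The title, the Show setting, and the language are stored in the PDF as the /Title, /DisplayDocTitle, and /Lang entries. The following Python sketch looks for those entries in the raw file as a quick spot check after saving. It is a rough heuristic under stated assumptions: the function names are hypothetical, it only sees entries that are stored uncompressed, and /Title or /Lang can also appear on bookmarks or individual tags. A result of False may simply mean the entry is stored in compressed form, so Acrobat's Properties window remains the authoritative place to confirm.

```python
import re


def _has_text_value(key: bytes, pdf_bytes: bytes) -> bool:
    """True if the key is followed by a non-empty (literal) or <hex> string."""
    pattern = re.escape(key) + rb"\s*(\([^)]|<[0-9A-Fa-f])"
    return re.search(pattern, pdf_bytes) is not None


def check_properties(pdf_bytes: bytes) -> dict[str, bool]:
    """Report which of the three document settings appear in the raw bytes."""
    return {
        "title": _has_text_value(b"/Title", pdf_bytes),
        "show_title": re.search(rb"/DisplayDocTitle\s+true", pdf_bytes) is not None,
        "language": _has_text_value(b"/Lang", pdf_bytes),
    }


# Typical use (the file name is a placeholder):
#   with open("scanned-reading.pdf", "rb") as handle:
#       print(check_properties(handle.read()))
```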

Add Tags

Tags are how Adobe designates document structure in a PDF. They provide an additional layer of code to the PDF that can be accessed by assistive technology.

Tags indicate reading order as well as headings and alternative text and must be added to a scanned PDF.

To add tags in Acrobat DC, open the Accessibility Toolbar, and click on Autotag Document.

Autotag Document option in accessibility toolbar

Table Header Row

In order to add a table header row to a table in Acrobat DC, select the table and click on Table Editor in the Reading Order window.

The table cells will be outlined in red and each cell will have a tag, either TH (table header) or TD (table data). The top row should be marked with TH and remaining cells should usually be TD.

This example shows what a table should look like in the table editor:

A good example of a table in Acrobat Table Editor, with TH across the top row and TD in all the remaining cells.

To make the top row a header row, each cell must be changed individually. Right-click on a cell and select Table Cell Properties.

Context menu for a cell with Table Cell Properties selected

Change the Type from Data Cell to Header Cell and set the Scope to Column. Repeat for each of the top row cells.

Table Cell Properties window with Header Cell radio button selected and scope set to Column

If the first column is a header column, set the Type to Header Cell, but the Scope to Row.
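
The rules above can be written down as a small check. This Python sketch models a table as rows of cells and reports any cell whose tag or scope does not match the pattern described here. The Cell class and header_problems function are hypothetical names for illustration; they do not read a PDF, so the cell values would be copied by hand from the Table Editor.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    tag: str         # "TH" (header cell) or "TD" (data cell)
    scope: str = ""  # "Column" or "Row" for header cells, empty for data cells


def header_problems(rows, first_column_is_header=False):
    """Return a message for each cell whose tag or scope looks wrong."""
    problems = []
    for r, row in enumerate(rows, start=1):
        for c, cell in enumerate(row, start=1):
            if r == 1:
                expected = Cell("TH", "Column")   # top row: column headers
            elif c == 1 and first_column_is_header:
                expected = Cell("TH", "Row")      # first column: row headers
            else:
                expected = Cell("TD")             # everything else: data
            if cell != expected:
                wanted = expected.tag
                if expected.scope:
                    wanted += f" with scope {expected.scope}"
                problems.append(f"row {r}, column {c}: expected {wanted}")
    return problems


table = [
    [Cell("TH", "Column"), Cell("TD")],  # second top-row cell was left as data
    [Cell("TD"), Cell("TD")],
]
print(header_problems(table))  # ['row 1, column 2: expected TH with scope Column']
```

Because the guide says the remaining cells should "usually" be TD, treat a reported problem as a prompt to look at the cell, not as proof of an error.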

Alternative Text

If images are present in the scanned PDF, you’ll also need to add alternative text. Use the Accessibility Tool to find and correct all images in the document at once.

Choose the Set Alternate Text option in the Accessibility Toolbar.

Add a description in the text field, then use the arrow key to jump to the next image in the document. Choose Save & Close when finished.

Reading Order Panel - Page Content Order

To check the accuracy of autotags, open the Reading Order Panel. On the Accessibility Toolbar, select Reading Order.

Reading Order on the Accessibility Toolbar

Select Show Order Panel on the Reading Order window:

Show Order Panel button on Reading Order window

This opens a panel on the left side of the document. The numbers in the panel correspond to the numbers highlighted on the page. These numbers are the visual representation of the tags.

A line connects tag number 1 in the document to number 1 in the panel

If the numbers are not in a logical order, drag the tag into the correct position in the panel.

Tag 8 should be tag 3. Drag upwards in the side panel to re-order.
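
Dragging a tag changes its position and shifts the tags after it down by one. The Python sketch below illustrates that effect with a plain list. The function name move_tag and the sample labels are made up for the illustration, and positions are numbered from 1 as in the panel.

```python
def move_tag(order, from_position, to_position):
    """Return a new reading order with one item moved (positions start at 1)."""
    reordered = list(order)                    # leave the original list untouched
    tag = reordered.pop(from_position - 1)     # lift the tag out of its old slot
    reordered.insert(to_position - 1, tag)     # drop it into the new slot
    return reordered


tags = ["title", "intro", "p1", "p2", "p3", "p4", "p5", "heading"]
print(move_tag(tags, 8, 3))
# ['title', 'intro', 'heading', 'p1', 'p2', 'p3', 'p4', 'p5']
```

The former items 3 through 7 become items 4 through 8, which is why only the one misplaced tag needs to be dragged.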

Note: Acrobat has a bug when dragging tags that sometimes causes them to disappear. There is no “undo” when editing tags. Save often so you can recover easily if something goes wrong.

Reading Order Panel - Structure Types

As with any other document type, it’s important to include headings in a PDF. The Reading Order tool allows you to check for document structure.

On the Reading Order window, change the radio button from Page content order to Structure types.

You’ll see that the numbers on the page that showed reading order have changed to structure tags such as P for Text/Paragraph and H1 for Heading 1.

This example has tags, but they are all set to P, leaving the document without a heading structure:

Section title and paragraph text both have P tags

To set the headings, select the area that needs a different tag. There are two ways to do this.

Select an existing tag, if it only includes the relevant text. This will select the entire shaded area that belongs to the tag.
Use the cursor to draw a box around the text. This may be needed if text is grouped together. The selected text will have pink boxes drawn around it so you can be sure which text will be tagged.

In each case, once selected, use the heading buttons on the Reading Order tool to set the appropriate heading level.

Section title is selected with pink boxes around the text. Heading 2 is selected in the reading order window.

The tag will visually change in the document from P to H2, and the shaded area will break into sections.

Section title is labeled with H2 and the shading around the text is separate from the following paragraph.
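
The difference between the two examples can be expressed as a simple test: a document tagged only with P has no heading outline at all. This Python sketch assumes you have noted the structure types shown on the page as a list of strings; the function name heading_tags is hypothetical.

```python
import re

# Heading structure types: H, or H1 through H6.
HEADING_TYPE = re.compile(r"H[1-6]?")


def heading_tags(structure_types):
    """Return only the heading tags, in reading order."""
    return [t for t in structure_types if HEADING_TYPE.fullmatch(t)]


before = ["P", "P", "P"]    # section title tagged as paragraph text
after = ["H2", "P", "P"]    # section title retagged as Heading 2

print(heading_tags(before))  # []
print(heading_tags(after))   # ['H2']
```

An empty result means assistive technology has no headings to navigate by, even though every piece of content is tagged.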

Tag Tree - Reading Order

The final (but most important) place to check the tags is in the Tag Tree. Using the Reading Order Panel first is important so that when you get to the Tag Tree, most of the structure should already be correct. There can be lingering issues with tags, however, and the Tag Tree is where you can manually fix those.

Navigating the Tags Pane

In the View menu, select Show/Hide, Navigation Panes, Tags.

View menu, show/hide navigation panes, tags

This opens a new panel on the left side of the document. The tags look like HTML. For example, <H1> is Heading 1, and <P> is paragraph text.

Tags showing hierarchy of headings and paragraph text

Content is nested inside the tags. If you expand a tag, it will outline the corresponding content on the page in pink, including nested content.

A heading and the paragraph text below it are both included in a section tag

Correcting the Reading Order

Begin by clicking down the tag tree to see if the order corresponds with the visual order on the page. You can drag tags into the correct order in the Tags pane.

In this example, the paragraph describing the chapter is at the bottom of the tags. It should be directly underneath the chapter title.

A line appears as you drag it to indicate where it will move to when you stop dragging. Be careful not to accidentally nest one tag inside another.

Line that indicates position when dragging a tag

If tags are nested when they shouldn’t be, drag them out to the higher level. For example, here we have two sections nested inside another section. They all should be at the same level within the tree.

Drag the nested section out. It should end up at the same level as the section it is currently nested in.

Tag Tree - Page Structure

Sometimes it takes significant manual editing to correct the document structure. These are some of the problems you should look for.

Document Tag

All of the tags in a PDF should be within an overall Document tag. Currently, this is not created by default and must be added manually.

Click on the Tag Options drop-down menu and select New Tag.

Tag Options drop-down menu with New Tag selected

Select Document as the tag type.

New Tag dialog with Document as the tag type

Drag all of the other tags so they’re nested inside the Document tag. Select all of the tags at once so you can drag them into the Document tag in one step. This will maintain the reading order.
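
The result of this step is a single top-level Document tag whose children are the original tags in their original order. The Python sketch below shows that shape using plain dictionaries. It is only a model of the tag tree for illustration, with a hypothetical function name, and does not edit a PDF.

```python
def wrap_in_document(top_level_tags):
    """Nest all top-level tags inside one Document tag, keeping their order."""
    already_wrapped = (
        len(top_level_tags) == 1 and top_level_tags[0]["type"] == "Document"
    )
    if already_wrapped:
        return top_level_tags
    return [{"type": "Document", "children": list(top_level_tags)}]


tags = [{"type": "H1"}, {"type": "P"}, {"type": "Table"}]
tree = wrap_in_document(tags)
print([child["type"] for child in tree[0]["children"]])  # ['H1', 'P', 'Table']
```

Selecting all of the tags before dragging corresponds to passing the whole list at once: the order inside the Document tag is the same as it was before.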

Incorrect Tag Labels

Tags may be labeled incorrectly, as in this example of a heading that’s labeled as paragraph text.

Paragraph tag corresponding with chapter title in content

Right-click on the tag and select Properties.

Use the Type drop-down menu to select the appropriate heading level.

Tag properties window showing type selection as heading level 1

Untagged Content

All content should be tagged. If you find content that is skipped as you go down the tag tree, first see if there’s a tag associated with it that’s simply out of order.

Select the content in the document. Click on the Tag Options drop-down menu and select Find Tag From Selection.

Tag Options menu with Find Tag from Selection

If a tag already exists, drag it into the correct order in the tags pane.
If no tag is found, create a new tag. Select the content in the document and then click Tag Options, Create Tag from Selection. Once the tag is created, drag it into the correct order in the tag pane and make sure it’s labeled correctly.

Tag options menu with Create Tag from Selection

Empty Tags

A tag is empty if there’s no way to expand it because nothing is nested inside it. Empty tags create confusion and they need to either be filled or deleted.

Check first to see if the content that should be in the tag is in another tag. If so, move the content into the empty tag.
Once you’re sure the tag is not needed, right-click on it and select Delete Tag.
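
In a long tag tree, empty tags are easy to miss. This Python sketch walks a model of the tree and lists the path to every tag that has neither content nor nested tags. The Tag class and find_empty_tags function are hypothetical and do not read a PDF; they only illustrate the definition of an empty tag given above.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    type: str
    content: str = ""  # text or image reference held directly by the tag
    children: list["Tag"] = field(default_factory=list)


def find_empty_tags(tag, path=""):
    """Return the path of every tag with no content and nothing nested inside."""
    here = f"{path}/{tag.type}"
    if not tag.content and not tag.children:
        return [here]
    found = []
    for child in tag.children:
        found.extend(find_empty_tags(child, here))
    return found


tree = Tag("Document", children=[
    Tag("H1", "Chapter 1"),
    Tag("P"),                                       # empty
    Tag("Sect", children=[Tag("P", "Body text"), Tag("P")]),  # second P is empty
])
print(find_empty_tags(tree))  # ['/Document/P', '/Document/Sect/P']
```

Each path points to a tag that should either receive its missing content or be deleted.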

Images - Long Description

If your scanned PDF includes complex images, charts, or graphs, these may need more than simple alternative text. Determine which images a reader must understand in order to follow the text, and provide a long description for each.

Since the layout of a scanned PDF isn’t editable, providing an appendix with long descriptions is the best option.

Visit Long Description page

Accessibility Check

Visit PDF Accessibility Checkers page

Use These Concepts in Scanned PDF

Alternative Text

Headings

Long Description

PDF Accessibility Checkers

Reading Order

Table Header Row

Text Recognition - Optical Character Recognition
