Adobe Acrobat 7.0.5 SDK User Manual

Working with PDF Documents
Using Document Logical Structure
As discussed elsewhere in this guide, you can insert logical structure into a PDF document
by creating a tagged PDF document. The PDSEdit API provides the ability to add, modify, and
view this logical structure. For more information, see the Acrobat and PDF Library API
Overview.
Navigating a PDF Document
PDSEdit methods allow you to navigate through a document according to its structure.
Bookmarks made from structure can go to an individual paragraph or a whole section,
rather than just to a point on a page. PDSEdit also allows searching within structure
elements. For example, you can search for a word within elements of a certain type, such as
headings. These methods can be used to move around a document, to analyze its content, and
to traverse its hierarchical structure.
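
The traversal and typed search described above can be illustrated with a minimal,
self-contained sketch. It does not call the Acrobat SDK: StructElem, count_matches, and
demo_search are hypothetical names invented for this illustration, and the tree is a plain
in-memory stand-in for a document's structure tree. The sketch walks the hierarchy
depth-first and counts elements of a given type whose text contains a word.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a structure element; NOT the SDK's own type. */
typedef struct StructElem {
    const char *type;               /* structure type, e.g. "Sect", "H1", "P" */
    const char *text;               /* text content, or NULL for containers */
    const struct StructElem *kids;  /* array of child elements */
    size_t num_kids;                /* number of entries in kids */
} StructElem;

/* Depth-first walk: count elements of `type` whose text contains `word`. */
static size_t count_matches(const StructElem *elem, const char *type,
                            const char *word)
{
    assert(elem != NULL && type != NULL && word != NULL);
    size_t hits = 0;
    if (strcmp(elem->type, type) == 0 && elem->text != NULL &&
        strstr(elem->text, word) != NULL) {
        hits++;
    }
    for (size_t i = 0; i < elem->num_kids; i++) {
        hits += count_matches(&elem->kids[i], type, word);
    }
    return hits;
}

/* Build a small sample tree and search only its H1 headings. */
static size_t demo_search(const char *word)
{
    static const StructElem sect1_kids[] = {
        { "H1", "Navigating a PDF Document", NULL, 0 },
        { "P", "Bookmarks can target a whole section of a PDF.", NULL, 0 },
    };
    static const StructElem sect2_kids[] = {
        { "H1", "Extracting Data", NULL, 0 },
        { "P", "Headings and tables can be extracted.", NULL, 0 },
    };
    static const StructElem doc_kids[] = {
        { "Sect", NULL, sect1_kids, 2 },
        { "Sect", NULL, sect2_kids, 2 },
    };
    static const StructElem doc = { "Document", NULL, doc_kids, 2 };
    return count_matches(&doc, "H1", word);
}
```

Because the search is restricted by type, the word "PDF" in a paragraph is ignored when
only headings are requested. A real plug-in would obtain elements and their types through
PDSEdit methods rather than from a static array.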
Extracting Data From a PDF Document
The PDSEdit API allows you to extract portions of pages according to their context, such
as all of the headings or tables. The extracted data can be used in different ways, such as
summarizing document information, importing the data into another document, or
creating a new PDF document.
Adding Structure Data to a PDF Document
Authoring applications create documents that can be converted to PDF. When the
document is converted to PDF and viewed, Acrobat does not automatically add structure
to the document. You can add structural information to any PDF file with the PDSEdit API.
Once a file has logical structure, you can use PDSEdit to access and modify that structure.
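
As a rough illustration of adding structure and then modifying it, the following sketch
builds a small element tree and later changes one element's type. It is a self-contained
model only and does not call the Acrobat SDK: Elem, elem_create, elem_append_kid,
elem_set_type, elem_count, elem_free, build_sample, and the MAX_KIDS limit are all
hypothetical names and values chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_KIDS 8  /* arbitrary limit chosen for this sketch */

/* Hypothetical structure element; NOT the SDK's own type. */
typedef struct Elem {
    const char *type;             /* structure type, e.g. "Sect", "H1", "P" */
    struct Elem *kids[MAX_KIDS];  /* owned child elements */
    size_t num_kids;
} Elem;

/* Allocate an element with no children; returns NULL on failure. */
static Elem *elem_create(const char *type)
{
    Elem *e = calloc(1, sizeof *e);
    if (e != NULL) {
        e->type = type;
    }
    return e;
}

/* Append kid as the last child; returns 0 on success, -1 otherwise. */
static int elem_append_kid(Elem *parent, Elem *kid)
{
    if (parent == NULL || kid == NULL || parent->num_kids >= MAX_KIDS) {
        return -1;
    }
    parent->kids[parent->num_kids++] = kid;
    return 0;
}

/* Modify an existing element's structure type. */
static void elem_set_type(Elem *e, const char *type)
{
    assert(e != NULL && type != NULL);
    e->type = type;
}

/* Count the element and all of its descendants. */
static size_t elem_count(const Elem *e)
{
    size_t n = 1;
    for (size_t i = 0; i < e->num_kids; i++) {
        n += elem_count(e->kids[i]);
    }
    return n;
}

/* Free the element and all of its descendants. */
static void elem_free(Elem *e)
{
    if (e == NULL) {
        return;
    }
    for (size_t i = 0; i < e->num_kids; i++) {
        elem_free(e->kids[i]);
    }
    free(e);
}

/* Build Document -> Sect -> (H1, P); returns NULL if allocation fails. */
static Elem *build_sample(void)
{
    Elem *doc = elem_create("Document");
    Elem *sect = elem_create("Sect");
    Elem *head = elem_create("H1");
    Elem *para = elem_create("P");
    if (doc == NULL || sect == NULL || head == NULL || para == NULL) {
        free(doc); free(sect); free(head); free(para);
        return NULL;
    }
    int rc = elem_append_kid(doc, sect);
    rc |= elem_append_kid(sect, head);
    rc |= elem_append_kid(sect, para);
    assert(rc == 0);  /* cannot fail: all pointers valid, limits not reached */
    (void)rc;
    return doc;
}
```

The same two phases apply to a real document: structure is first created and attached,
and once it exists it can be read back and changed in place.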
Using pdfmark to Add Structure Data to PDF
Authoring applications may add structure pdfmarks to the PostScript language code
generated when a document is printed. When the Acrobat Distiller application creates a
PDF file from such PostScript code, it generates structure information in the PDF file from
the pdfmarks. This approach requires that either the authoring application adds structure
pdfmarks to the PostScript code it generates, or that some other application generates the
pdfmarks. See the pdfmark Reference for more information.