Document objects

The main Document and related objects.

Document constructor


docx.Document(docx=None)

Return a Document object loaded from docx, where docx can be either a path to a .docx file (a string) or a file-like object. If docx is missing or None, the built-in default document “template” is loaded.

Document objects

class docx.document.Document

WordprocessingML (WML) document. Not intended to be constructed directly. Use docx.Document() to open or create a document.

add_heading(text=u'', level=1)

Return a heading paragraph newly added to the end of the document, containing text and having its paragraph style determined by level. If level is 0, the style is set to Title. If level is 1 (or omitted), Heading 1 is used. Otherwise the style is set to Heading {level}. Raises ValueError if level is outside the range 0-9.


add_page_break()

Return a paragraph newly added to the end of the document and containing only a page break.

add_paragraph(text=u'', style=None)

Return a paragraph newly added to the end of the document, populated with text and having paragraph style style. text can contain tab (\t) characters, which are converted to the appropriate XML form for a tab. text can also include newline (\n) or carriage return (\r) characters, each of which is converted to a line break.

add_picture(image_path_or_stream, width=None, height=None)

Return a new picture shape added in its own paragraph at the end of the document. The picture contains the image at image_path_or_stream, scaled based on width and height. If neither width nor height is specified, the picture appears at its native size. If only one is specified, it is used to compute a scaling factor that is then applied to the unspecified dimension, preserving the aspect ratio of the image. The native size of the picture is calculated using the dots-per-inch (dpi) value specified in the image file, defaulting to 72 dpi if no value is specified, as is often the case.
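The sizing rule can be sketched in plain Python. The helper scaled_size below is hypothetical and not part of python-docx; it works in inches and assumes a single dpi value for both axes, whereas the library works with length objects and reads the dpi from the image file.

```python
def scaled_size(px_width, px_height, dpi=72, width=None, height=None):
    """Return (width, height) in inches, following the rule described above.

    Hypothetical helper for illustration only; not a python-docx function.
    """
    native_w = px_width / dpi
    native_h = px_height / dpi
    if width is None and height is None:
        # Neither given: use the native size.
        return native_w, native_h
    if height is None:
        # Only width given: scale height by the same factor.
        return width, native_h * (width / native_w)
    if width is None:
        # Only height given: scale width by the same factor.
        return native_w * (height / native_h), height
    # Both given: use them as-is, even if the aspect ratio changes.
    return width, height


native = scaled_size(300, 150)                 # no dpi in file: 72 dpi assumed
by_width = scaled_size(300, 150, width=2.0)    # height follows the aspect ratio
by_height = scaled_size(300, 150, dpi=150, height=0.5)
```

In actual use, a call along the lines of doc.add_picture(path, width=Inches(2.0)), with Inches imported from docx.shared, would request the same width-only scaling.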


add_section(start_type=WD_SECTION.NEW_PAGE)

Return a Section object representing a new section added at the end of the document. The optional start_type argument must be a member of the WD_SECTION_START enumeration, and defaults to WD_SECTION.NEW_PAGE if not provided.

add_table(rows, cols, style=None)

Add a table having row and column counts of rows and cols respectively and table style of style. style may be a table style object or a table style name. If style is None, the table inherits the default table style of the document.


core_properties

A CoreProperties object providing read/write access to the core properties of this document.

inline_shapes

An InlineShapes object providing access to the inline shapes in this document. An inline shape is a graphical object, such as a picture, contained in a run of text and behaving like a character glyph, being flowed like other text in a paragraph.

paragraphs

A list of Paragraph instances corresponding to the paragraphs in the document, in document order. Note that paragraphs within revision marks such as <w:ins> or <w:del> do not appear in this list.

part

The DocumentPart object of this document.

save(path_or_stream)

Save this document to path_or_stream, which can be either a path to a filesystem location (a string) or a file-like object.

sections

A Sections object providing access to each section in this document.

settings

A Settings object providing access to the document-level settings for this document.

styles

A Styles object providing access to the styles in this document.

tables

A list of Table instances corresponding to the tables in the document, in document order. Note that only tables appearing at the top level of the document appear in this list; a table nested inside a table cell does not appear. A table within revision marks such as <w:ins> or <w:del> will also not appear in the list.

CoreProperties objects

Each Document object provides access to its CoreProperties object via its core_properties attribute. A CoreProperties object provides read/write access to the so-called core properties for the document. The core properties are author, category, comments, content_status, created, identifier, keywords, language, last_modified_by, last_printed, modified, revision, subject, title, and version.

Each property is one of three types: str, datetime.datetime, or int. String properties are limited in length to 255 characters and return an empty string ('') if not set. Date properties are assigned and returned as datetime.datetime objects without timezone, i.e. in UTC. Any timezone conversions are the responsibility of the client. Date properties return None if not set.

python-docx does not automatically set any of the document core properties other than to add a core properties part to a document that doesn’t have one (very uncommon). If python-docx adds a core properties part, it contains default values for the title, last_modified_by, revision, and modified properties. Client code should update properties like revision and last_modified_by if that behavior is desired.

class docx.opc.coreprops.CoreProperties

author

string – An entity primarily responsible for making the content of the resource.

category

string – A categorization of the content of this package. Example values might include: Resume, Letter, Financial Forecast, Proposal, or Technical Presentation.

comments

string – An account of the content of the resource.

content_status

string – completion status of the document, e.g. ‘draft’

created

datetime – time of initial creation of the document

identifier

string – An unambiguous reference to the resource within a given context, e.g. ISBN.

keywords

string – descriptive words or short phrases likely to be used as search terms for this document

language

string – language the document is written in

last_modified_by

string – name or other identifier (such as email address) of person who last modified the document

last_printed

datetime – time the document was last printed

modified

datetime – time the document was last modified

revision

int – number of this revision, incremented by Word each time the document is saved. Note, however, that python-docx does not automatically increment the revision number when it saves a document.

subject

string – The topic of the content of the resource.

title

string – The name given to the resource.

version

string – free-form version string