Docling
DoclingLoader
Bases: BaseLoader
A document loader that uses the docling library to extract and structure content from various file types
including PDF, DOCX, and HTML.
For more information, see Docling
Attributes:
| Name | Type | Description |
|---|---|---|
| detached_tables | bool | If True, separates extracted tables from the main document text and treats them as individual documents. Default is False. |
| export_table_format | str | Format used when exporting tables. Applicable only if |
load_data
load_data(input_file: str, **kwargs: Any) -> list[Document]
Loads data from the given input file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_file | str | File path to load. | required |
Returns:
| Type | Description |
|---|---|
| list[Document] | list[Document]: A list of |
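
Because `load_data` takes a single file path and returns a list, loading several files means calling it once per path and concatenating the results. The sketch below illustrates that pattern. It relies only on the documented signature `load_data(input_file: str, **kwargs) -> list[Document]`. The helper `load_many` and the `SupportsLoadData` protocol are hypothetical names introduced for illustration and are not part of the library. The import path and constructor of `DoclingLoader` are not shown in this reference, so the loader is passed in as an argument rather than created here.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any, Iterable, Protocol


class SupportsLoadData(Protocol):
    """Hypothetical protocol: any object exposing the documented load_data signature."""

    def load_data(self, input_file: str, **kwargs: Any) -> list: ...


def load_many(loader: SupportsLoadData, paths: Iterable[str | Path]) -> list:
    """Hypothetical helper: call load_data once per path and concatenate the results."""
    documents: list = []
    for path in paths:
        # load_data is documented as taking a str, so convert Path objects first.
        documents.extend(loader.load_data(input_file=str(path)))
    return documents
```

Given an already constructed loader, `load_many(loader, ["report.pdf", "notes.docx"])` would return one flat list holding the documents from both files, in the order the paths were given.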