There are 3 ways of ingesting documents to Amazon Kendra:
The BatchPutDocument API, which accepts inline blobs and S3 locations for documents.
An index can include both unstructured text and frequently asked questions (FAQ):
Unstructured text: Documents containing unstructured text in the following formats can be ingested into Amazon Kendra via connectors or the BatchPutDocument API.
Microsoft PowerPoint presentations
Microsoft Word documents
Plain text documents including JSON
FAQs and answers: Amazon Kendra’s Add FAQ capability can ingest question-answer pairs.
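As a rough illustration of the FAQ path, the sketch below writes question-answer pairs as a headerless CSV and registers the uploaded file through the boto3 `create_faq` call. The helper names (`faq_csv`, `create_faq`), the bucket, key, and role values are hypothetical, and it assumes the CSV has already been uploaded to S3 and that the IAM role can read it.

```python
import csv
import io
from typing import Iterable, Tuple


def faq_csv(pairs: Iterable[Tuple[str, str]]) -> str:
    """Serialize (question, answer) pairs as headerless CSV, one pair per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for question, answer in pairs:
        writer.writerow([question, answer])
    return buffer.getvalue()


def create_faq(index_id: str, role_arn: str, bucket: str, key: str, name: str) -> str:
    """Register an FAQ file that already sits in S3 (all arguments are placeholders)."""
    import boto3  # imported lazily so faq_csv works without the AWS SDK installed

    kendra = boto3.client("kendra")
    response = kendra.create_faq(
        IndexId=index_id,
        Name=name,
        RoleArn=role_arn,  # role that Amazon Kendra assumes to read the file
        S3Path={"Bucket": bucket, "Key": key},
        FileFormat="CSV",  # headerless CSV, matching faq_csv above
    )
    return response["Id"]
```

Keeping the CSV serialization separate from the AWS call makes it possible to inspect the file contents locally before uploading anything.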
You can use the built-in connectors to ingest documents through the Amazon Kendra console. For a proof of concept (POC), if there is no connector for your data source, you can mirror the data into an S3 bucket and use the S3 connector. To ingest documents directly, use the BatchPutDocument operation, which can index inline documents or a set of documents stored in an Amazon S3 bucket, add custom attributes to the documents, and attach an access control list to them. The BatchPutDocument action is documented in the Amazon Kendra API reference.
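A minimal sketch of direct ingestion with boto3 might look like the following. It builds one inline document with a custom attribute and an access control entry, plus one S3-backed document, and submits them with `batch_put_document`. The helper names, the `department` attribute, and the group name are illustrative assumptions; a custom attribute generally has to be defined on the index before documents can use it.

```python
from typing import Any, Dict, List


def inline_document(doc_id: str, title: str, text: str,
                    department: str, allowed_group: str) -> Dict[str, Any]:
    """Build one inline plain-text entry for the Documents list."""
    return {
        "Id": doc_id,
        "Title": title,
        "Blob": text.encode("utf-8"),
        "ContentType": "PLAIN_TEXT",
        # "department" is a hypothetical custom attribute name
        "Attributes": [{"Key": "department", "Value": {"StringValue": department}}],
        # Only members of allowed_group should see this document in results
        "AccessControlList": [
            {"Name": allowed_group, "Type": "GROUP", "Access": "ALLOW"}
        ],
    }


def s3_document(doc_id: str, title: str, bucket: str, key: str) -> Dict[str, Any]:
    """Build one entry that points at a document stored in S3."""
    return {"Id": doc_id, "Title": title, "S3Path": {"Bucket": bucket, "Key": key}}


def ingest(index_id: str, role_arn: str,
           documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Submit the documents and return any that failed validation."""
    import boto3  # imported lazily so the builders work without the AWS SDK

    kendra = boto3.client("kendra")
    response = kendra.batch_put_document(
        IndexId=index_id,
        RoleArn=role_arn,  # needed so Amazon Kendra can read S3-backed documents
        Documents=documents,
    )
    return response.get("FailedDocuments", [])
```

Indexing is asynchronous, so an empty failure list only means the request was accepted; it does not mean the documents are searchable yet.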