* fix: switch to pdftohtml for pdf to html conversions * build: include poppler-utils in dockerfile for pdftohtml