Primary machine: pomelo (local dev/operations, aka home.critchley.biz)
Web server: noodle (production, hosts www.critchley.biz and cv.critchley.biz)
Jobs report API: home.critchley.biz/jobs (reverse proxy to pomelo, only works when pomelo is at home)
Project path: /home/john/py/ask/
Jobs path: /home/john/py/ask/jobs/
OpenAI API Key:
• Stored in: ~/.env_data (GDBM format)
• Accessed via custom gdata library (from gdata-server project)
• Keys tried: 'api_key' then 'OPENAI_API_KEY'
• Also accepted from OPENAI_API_KEY environment variable (checked first)
• Inspect with: gdbm_dump ~/.env_data or gdbmtool
• Refreshed periodically

Note: list_models.py uses a different config method - reads ~/py/popit3/.openai.yaml for OpenAI client params. This is a separate config path from ask_with_files_structured.py.
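The lookup order described above could be sketched as follows. This is a minimal illustration, not the code in ask_with_files_structured.py: resolve_api_key is a hypothetical name, and the GDBM store is stood in for by any dict-like object, because the interface of the custom gdata library is not described here.

```python
import os
from typing import Mapping, Optional

# Key names tried in the store, in the order listed above.
STORE_KEYS = ("api_key", "OPENAI_API_KEY")


def resolve_api_key(store: Mapping[str, str],
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key, checking the environment before the store.

    `store` stands in for the contents of ~/.env_data (assumption: any
    dict-like object; the real gdata interface may differ).
    """
    env = os.environ if env is None else env

    # 1. The environment variable is checked first.
    value = env.get("OPENAI_API_KEY")
    if value:
        return value

    # 2. Fall back to the store, trying each key name in turn.
    for name in STORE_KEYS:
        value = store.get(name)
        if value:
            return value

    # 3. Nothing found: a missing key is reported as ValueError.
    raise ValueError("No OpenAI API key found in environment or store")
```

Passing env explicitly keeps the function easy to exercise without touching the real environment; with env left as None it reads os.environ.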
Core Python packages:
• openai - OpenAI API client
• gdata - Custom GDBM wrapper (from gdata-server project, NOT Google gdata)
• python-docx - DOCX file generation (used by document_generator.py)
• PyYAML - YAML/JSON parsing

Text extraction (optional, graceful fallback):
• pdfplumber - PDF text extraction (preferred)
• pypdf or PyPDF2 - PDF fallback alternatives
• beautifulsoup4 - HTML parsing and tag removal
• markdown - Markdown to HTML conversion
• odfpy - ODT file parsing

Install core: pip install openai python-docx PyYAML pdfplumber beautifulsoup4
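The "optional, graceful fallback" behaviour for the PDF libraries could look roughly like this. It is a sketch under assumptions: first_available and require_pdf_backend are hypothetical helper names, and the actual script may structure its imports differently.

```python
import importlib
from typing import Iterable, Optional

# PDF module names in order of preference, following the list above.
PDF_BACKENDS = ("pdfplumber", "pypdf", "PyPDF2")


def first_available(candidates: Iterable[str]) -> Optional[str]:
    """Return the first module name that imports cleanly, else None."""
    for name in candidates:
        try:
            importlib.import_module(name)
        except ImportError:
            continue  # not installed: try the next candidate
        return name
    return None


def require_pdf_backend() -> str:
    """Pick a PDF backend or raise ImportError with an install hint."""
    backend = first_available(PDF_BACKENDS)
    if backend is None:
        raise ImportError(
            "PDF support needs one of: " + ", ".join(PDF_BACKENDS)
            + " (for example: pip install pdfplumber)"
        )
    return backend
```

Deferring the check until a PDF is actually encountered means a run over plain-text inputs never needs any of the optional packages.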
ask_with_files_structured.py - /home/john/py/ask/
schema.json - /home/john/py/ask/
document_generator.py - /home/john/py/ask/jobs/
apply.sh and variants - /home/john/py/ask/jobs/
CV source - ~/CV/cv_llm_optimized.md
API key - ~/.env_data (GDBM)
Old version - /home/john/py/ask/old/ask_with_files.py (no structured output support)
Basic usage:
python3 ask_with_files_structured.py file1.pdf file2.docx -q 'Your question'

With structured output:
python3 ask_with_files_structured.py doc.txt -q 'Question' --schema-file schema.json

Job application (end-to-end):
cd /home/john/py/ask/jobs && ./apply.sh JS-131214.txt
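For orientation, the command-line shape shown above (one or more files, -q, optional --schema-file) maps onto argparse roughly as below. This is an illustrative sketch only: the real script's parser is not shown here, and the destination name question, the help strings, and -q being required are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the usage lines above (hypothetical)."""
    parser = argparse.ArgumentParser(prog="ask_with_files_structured.py")
    # One or more input files, e.g. file1.pdf file2.docx
    parser.add_argument("files", nargs="+", help="input files")
    # Assumption: -q is required and stored as `question`.
    parser.add_argument("-q", dest="question", required=True,
                        help="question to ask about the files")
    # Optional JSON schema enabling structured output.
    parser.add_argument("--schema-file", default=None,
                        help="schema for structured output")
    return parser


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv (defaults to sys.argv[1:] when None)."""
    return build_parser().parse_args(argv)
```

For example, parse(["doc.txt", "-q", "Question", "--schema-file", "schema.json"]) yields a namespace whose schema_file attribute is "schema.json".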
ask_with_files_structured.py handles:
• FileNotFoundError - Missing input files
• ImportError - Missing optional libraries (helpful error message)
• ValueError - Duplicate filenames, invalid schema, missing API key
• openai.AuthenticationError - Bad API key
• openai.RateLimitError - API rate exceeded
• openai.APIError - General API failures
• KeyboardInterrupt - Clean exit with code 130

document_generator.py handles:
• KeyError - Missing required keys (cv, cv.filename)
• ValueError - Unknown section type (fails loudly for schema drift)
• FileNotFoundError - Missing data file
• OSError - File write failures
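One way to turn the exceptions listed for ask_with_files_structured.py into exit codes is a thin wrapper around the entry point. The sketch below is an assumption about structure, not the script's actual code: run_cli is a hypothetical name, only the 130 exit code comes from the list above, and the exit code 1 used for the other cases is a placeholder.

```python
import sys
from typing import Callable


def run_cli(main: Callable[[], None]) -> int:
    """Run `main` and translate known failures into an exit code."""
    try:
        main()
    except KeyboardInterrupt:
        # Clean exit on Ctrl-C, using the code listed above.
        print("Interrupted", file=sys.stderr)
        return 130
    except FileNotFoundError as exc:
        print(f"Missing input file: {exc}", file=sys.stderr)
        return 1  # assumption: the real exit code is not documented
    except ImportError as exc:
        print(f"Missing optional library: {exc}", file=sys.stderr)
        return 1
    except ValueError as exc:
        print(f"Invalid input: {exc}", file=sys.stderr)
        return 1
    return 0
```

The openai exceptions (openai.AuthenticationError, openai.RateLimitError, openai.APIError) would slot in as further except clauses; they are left out so the block runs without the openai package installed. A script would typically end with sys.exit(run_cli(main)).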