
Core Functionality
This application turns raw PDF files into searchable database records. It handles the whole pipeline: upload, secure storage, AI-driven extraction, and database persistence.
📄 Read Full Report (PDF, Finnish)
How It Works
- Ingestion & Storage: The Flask backend receives a PDF and uploads it to Azure Blob Storage.
- Secure Access: The app generates a Shared Access Signature (SAS) token, which gives Azure Document Intelligence temporary access to the private file.
- AI Extraction: Azure Document Intelligence parses the PDF and returns specific data points (skills, experience, contact info) as JSON.
- Database Storage: Flask receives the JSON and saves it directly into a MongoDB Atlas cluster.
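
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: every function name, the 15-minute expiry, the `source_blob` field, and the choice of the `azure-ai-formrecognizer` SDK (one of the Python clients for Azure Document Intelligence) are assumptions. The SDK imports are done inside each helper so that the orchestration function `process_pdf` can be read and exercised on its own.

```python
from datetime import datetime, timedelta, timezone


def upload_pdf(conn_str, container, blob_name, data):
    """Step 1: upload PDF bytes to a private container and return the blob URL."""
    from azure.storage.blob import BlobServiceClient  # pip install azure-storage-blob
    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=blob_name)
    blob.upload_blob(data, overwrite=True)
    return blob.url


def make_sas_url(blob_url, account_name, account_key, container, blob_name, minutes=15):
    """Step 2: append a short-lived, read-only SAS token to the blob URL."""
    from azure.storage.blob import BlobSasPermissions, generate_blob_sas
    token = generate_blob_sas(
        account_name=account_name,
        container_name=container,
        blob_name=blob_name,
        account_key=account_key,
        permission=BlobSasPermissions(read=True),
        expiry=datetime.now(timezone.utc) + timedelta(minutes=minutes),  # assumed lifetime
    )
    return f"{blob_url}?{token}"


def extract_fields(endpoint, key, model_id, sas_url):
    """Step 3: let Document Intelligence read the PDF via the SAS URL."""
    from azure.ai.formrecognizer import DocumentAnalysisClient  # assumed SDK choice
    from azure.core.credentials import AzureKeyCredential
    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    result = client.begin_analyze_document_from_url(model_id, sas_url).result()
    fields = result.documents[0].fields if result.documents else {}
    # Keep only the text content of each extracted field.
    return {name: field.content for name, field in fields.items()}


def save_record(mongo_uri, db_name, collection, record):
    """Step 4: insert the extracted record into MongoDB Atlas."""
    from pymongo import MongoClient  # pip install pymongo
    return MongoClient(mongo_uri)[db_name][collection].insert_one(record).inserted_id


def process_pdf(blob_name, data, upload, sign, extract, save):
    """Run the four steps in order; each step is passed in as a callable."""
    blob_url = upload(blob_name, data)
    sas_url = sign(blob_url, blob_name)
    record = extract(sas_url)
    record["source_blob"] = blob_name  # hypothetical field linking record to file
    return save(record)
```

In a Flask route, the four callables would typically be the helpers above with credentials bound in (for example via `functools.partial`), and `data` would come from the uploaded file in `request.files`. Passing the steps in as arguments also makes it possible to run `process_pdf` against simple stand-ins without touching Azure or MongoDB.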
Tech Stack
- Backend: Python (Flask)
- Storage: Azure Blob Storage
- AI/ML: Azure Document Intelligence
- Database: MongoDB Atlas