Hi everyone!
Document management and automation remain overarching challenges in AI adoption.
In Japan in particular, the strong cultural reliance on Excel for document management has become a bottleneck for effective AI utilization.
Objectives
- Balancing document conversion for AI with human-readable documentation
- Standardizing into Markdown format for easier AI processing
- Building a knowledge base optimized for human readability
Results
- Automated workflow confirmed: Store files in S3 → Convert to Markdown → Save to Growi
- Supported formats: DOCX, PDF, TXT, CSV, XLSX
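To make the first step of that workflow concrete, here is a minimal sketch of how files could be selected for conversion and mapped to Growi pages. It is not the code used in this project: the function names, the `input/` default prefix handling, and the `/converted` root page are assumptions for illustration, and the list of keys is assumed to come from an S3 listing done elsewhere.

```python
from pathlib import PurePosixPath

# Extensions taken from the supported-formats list above.
SUPPORTED_EXTENSIONS = {".docx", ".pdf", ".txt", ".csv", ".xlsx"}


def is_supported(key: str) -> bool:
    """Return True if the S3 object key ends in a supported extension."""
    return PurePosixPath(key).suffix.lower() in SUPPORTED_EXTENSIONS


def select_input_keys(keys, prefix="input/"):
    """Keep keys under the prefix that can be converted, skipping folder markers."""
    return [
        key
        for key in keys
        if key.startswith(prefix) and not key.endswith("/") and is_supported(key)
    ]


def growi_page_path(key, prefix="input/", root="/converted"):
    """Hypothetical mapping from an S3 key to a Growi page path."""
    relative = PurePosixPath(key[len(prefix):])
    # Drop the file extension so the page is named after the document.
    return f"{root}/{relative.with_suffix('')}"
```

Keeping the selection logic separate from the S3 and Growi calls makes it easy to check which files a batch run would pick up before anything is converted.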
Processing Flow
- Batch processing: Automatically detect and convert all files in the input/ folder in S3
- AI-powered conversion: High-accuracy Markdown transformation using Docling MCP and Gemini AI
- Auto-formatting: Optimize heading hierarchy, tables, and code blocks
- Growi integration: Automatically save converted results into the knowledge base using Growi MCP and Gemini AI
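As an illustration of what the auto-formatting step might involve, the sketch below normalizes a heading hierarchy with plain Python. In this project that step is handled by the AI-powered conversion, so this is only an assumed, rule-based stand-in: `normalize_headings` is a hypothetical helper that renumbers ATX headings so they start at level 1 and never skip a level, while leaving fenced code blocks untouched.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # Markdown code-fence marker


def normalize_headings(markdown: str) -> str:
    """Renumber ATX headings so levels start at 1 and never skip a level."""
    out = []
    in_fence = False
    stack = []  # original levels of the headings currently "open"
    for line in markdown.splitlines():
        # Toggle fence state so '#' comments inside code are not rewritten.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            out.append(line)
            continue
        match = None if in_fence else HEADING.match(line)
        if match is None:
            out.append(line)
            continue
        level = len(match.group(1))
        # Close headings at the same or a deeper original level.
        while stack and stack[-1] >= level:
            stack.pop()
        stack.append(level)
        # The new level is the nesting depth, capped at Markdown's maximum of 6.
        out.append("#" * min(len(stack), 6) + " " + match.group(2).strip())
    return "\n".join(out)
```

For example, a converted document whose headings jump from `###` straight to `#####` would come out with `#` and `##`, which tends to render more predictably in a wiki's table of contents.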
Achievements
- For AI: Structured conversion into Markdown
- For humans: Improved information access through Growi's search and browsing functions
- For efficiency: Zero manual work for document standardization
Current Limitations
- Files containing images are not yet supported (under consideration)