Release Notes

Track new features, improvements, and fixes for the APSIS Document Analysis API.

v1.2.0 (Coming Soon)

Expected: Week commencing 5th January 2026

Drawing Compare Improvements

Major improvements to the Drawing Compare feature are in progress:

Faster Processing - Significant reduction in comparison time with optimized image alignment
Better Accuracy - Improved change detection, with fewer false positives and more precise highlighting

Note: This work is part of December development and will not be billed in January.

v1.1.0

Live Now - January 2026

Highlights

Multiple Authors - Now detects all company names on drawings, not just the first
Confidence Indicators - Know when extractions are high or low confidence
Faster Processing - Parallel batch processing with reduced timeouts
Full Cost Visibility - Every API call tracked with token counts and costs

API Changes (Action Required)

These changes may require updates to your integration:

1. Author Field Type Change

The author field for drawings is now an array to support multiple authors.

Before (v1.0.0)	After (v1.1.0)
`"author": "Smith & Associates"`	`"author": ["Smith & Associates", "Jones Engineering"]`
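Integrations that read `author` as a string under v1.0.0 can accept both shapes while migrating. The following is a minimal sketch, assuming the response has already been parsed into a Python dict. The helper name `normalize_authors` is hypothetical and not part of the API, and the handling of a missing or empty value is a defensive assumption rather than documented behaviour.

```python
from typing import Any, List


def normalize_authors(value: Any) -> List[str]:
    """Return the author field as a list, whichever API version produced it."""
    if value is None:
        # Assumption: treat a missing author as "no authors".
        return []
    if isinstance(value, str):
        # v1.0.0 shape: a single string.
        return [value] if value.strip() else []
    # v1.1.0 shape: a list of strings; empty entries are skipped.
    return [str(item) for item in value if item]


old_response = {"author": "Smith & Associates"}
new_response = {"author": ["Smith & Associates", "Jones Engineering"]}

print(normalize_authors(old_response["author"]))
print(normalize_authors(new_response["author"]))
```

Normalising at the edge of the integration means the rest of the client code can rely on a list in every case.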

2. New Fields Added

Field	Type	Description
`extraction_method`	`string`	`yolo_crop` (high confidence), `heuristic_crop` (low confidence), or `null`

3. Date Format Standardised

Dates are now returned in DD-MM-YYYY format (UK standard).

Before (v1.0.0)	After (v1.1.0)
`"2024-01-15"` or `"January 15, 2024"`	`"15-01-2024"`
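Client code that needs a date object can parse the new format with the standard library. This is a small sketch, and the function name `parse_api_date` is hypothetical:

```python
from datetime import date, datetime


def parse_api_date(text: str) -> date:
    """Parse a v1.1.0 date string in DD-MM-YYYY format."""
    return datetime.strptime(text, "%d-%m-%Y").date()


parsed = parse_api_date("15-01-2024")
print(parsed.isoformat())  # 2024-01-15
```

`strptime` raises `ValueError` on any other layout, which makes it easy to spot code paths that still expect the v1.0.0 formats.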

New Features

Multiple Authors Detection

The system now detects and returns all company names and authors found on a drawing, not just the first one. This is useful for drawings involving multiple firms (e.g., an architect and a structural engineer).

Extraction Confidence Indicator

Each extracted drawing includes a confidence level, reported through the `extraction_method` field:

Confidence	Meaning
High (`yolo_crop`)	Object detection model found and cropped the title block area
Low (`heuristic_crop`)	Title block not detected; used fallback (bottom 50% of page)

Low-confidence extractions may be less accurate but still provide useful data in most cases. Drawings in standard formats, with the title block in the bottom right, typically extract well.
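A client might use `extraction_method` to decide which results to send for manual review. The sketch below assumes each drawing result is a dict. The `filename` key and the function name `confidence_level` are hypothetical. These notes do not say what a `null` value means, so it is treated here as "unknown".

```python
from typing import Optional


def confidence_level(extraction_method: Optional[str]) -> str:
    """Map the extraction_method value to a coarse confidence label."""
    mapping = {"yolo_crop": "high", "heuristic_crop": "low"}
    # Assumption: null or unrecognised values are reported as "unknown".
    return mapping.get(extraction_method, "unknown")


# Illustrative results only; the filenames are made up.
drawings = [
    {"filename": "plan_a.pdf", "extraction_method": "yolo_crop"},
    {"filename": "plan_b.pdf", "extraction_method": "heuristic_crop"},
    {"filename": "plan_c.pdf", "extraction_method": None},
]

needs_review = [
    d["filename"]
    for d in drawings
    if confidence_level(d["extraction_method"]) != "high"
]
print(needs_review)
```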

Smarter Filename Recognition

The system uses PDF filenames as hints for author detection. Current mappings:

Filename Pattern	Author
`271_*`	Convery Prenty Architects
`649*`	Designme
`22007_` or `21014_`	Apsis
`7871_*`	Grossart Associates
`3077-CDP-*`	Clyde Design Partnership
`1477-ABC-*`	Anderson Bell + Christie

We can add more mappings as needed - just let us know.
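To show how such hints behave, the table above can be expressed as glob patterns. This is only a sketch and not the service's actual implementation. It assumes `22007_` and `21014_` are meant as prefixes, so a trailing `*` is added. The function name `author_hint` and the sample filenames are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

# Patterns copied from the table above.
FILENAME_HINTS: List[Tuple[str, str]] = [
    ("271_*", "Convery Prenty Architects"),
    ("649*", "Designme"),
    ("22007_*", "Apsis"),  # assumption: prefix match
    ("21014_*", "Apsis"),  # assumption: prefix match
    ("7871_*", "Grossart Associates"),
    ("3077-CDP-*", "Clyde Design Partnership"),
    ("1477-ABC-*", "Anderson Bell + Christie"),
]


def author_hint(filename: str) -> Optional[str]:
    """Return the hinted author for a filename, or None if nothing matches."""
    for pattern, author in FILENAME_HINTS:
        if fnmatchcase(filename, pattern):
            return author
    return None


print(author_hint("271_site_plan.pdf"))
print(author_hint("unmatched_drawing.pdf"))
```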

Automatic Search Indexing

Documents are automatically indexed for Document Search after extraction - no additional steps required.

Improvements

Performance

Documents processed in parallel batches, significantly reducing wait times
Timeout reduced from 2 minutes to 1 minute per document
Results returned immediately while embedding (indexing) happens in background
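These notes do not describe how the service schedules work internally. Purely as an illustration of the general idea (bounded parallelism with a per-document timeout), here is a sketch using `asyncio`. The names `process_batch` and `fake_extract`, the batch size, and the shortened demo timeout are all assumptions. The 60-second default mirrors the 1-minute timeout mentioned above.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def process_batch(
    paths: List[str],
    extract: Callable[[str], Awaitable[str]],
    timeout: float = 60.0,
    batch_size: int = 4,
) -> Dict[str, str]:
    """Run extract() on each path, at most batch_size at a time."""
    semaphore = asyncio.Semaphore(batch_size)

    async def run_one(path: str) -> str:
        async with semaphore:
            try:
                # The timeout clock starts when this document begins processing.
                return await asyncio.wait_for(extract(path), timeout=timeout)
            except asyncio.TimeoutError:
                return "timeout"

    results = await asyncio.gather(*(run_one(p) for p in paths))
    return dict(zip(paths, results))


async def fake_extract(path: str) -> str:
    """Stand-in for real extraction; 'slow' files take longer."""
    await asyncio.sleep(0.5 if "slow" in path else 0.01)
    return f"extracted:{path}"


demo = asyncio.run(process_batch(["a.pdf", "slow.pdf"], fake_extract, timeout=0.1))
print(demo)
```

A document that exceeds its timeout is reported on its own and does not hold up the rest of the batch.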

Accuracy

Upgraded to GPT-5.2 model for reading title block information
Improved padding around title blocks to avoid cutting off edge fields
Better classification to distinguish Drawings from Specifications
Author names and drawing titles formatted in Title Case
Dates standardised to DD-MM-YYYY format

Reliability

Long text fields handled gracefully instead of causing errors
Better recovery from temporary failures

Bug Fixes

Fixed issue where very long drawing titles could cause extraction to fail
Improved handling of unusual drawing formats

Cost & Transparency

Full Cost Reporting

Every API call is now tracked via LangSmith, enabling:

Token counts and costs per document
Daily/weekly/monthly usage reports
Cost analysis broken down by document or time period

LangSmith Tracing UI

LangSmith Cost Dashboard

Cost Example

Processing 6 construction drawings:

File
0195 BW-SL-010—LANDSCAPE LAYOUT.pdf
0195 BW-SL-011-A-FINISHES & BOUNDARY TREATMENT.pdf
0195 BW-SL-012-A-PLOT WORKS PLOTS 1-7.pdf
0195 BW-SL-013—PLOT WORKS PLOTS 8-17.pdf
0195 BW-SL-014—PLOT WORKS PLOTS 18-34.pdf
0195 Newburgh Issue Sheet-1.0-SB Architects -DET.pdf

Metric	Value
Files processed	6 drawings
Total tokens	30,862
Total cost	$0.08
Cost per drawing	~$0.01
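The per-drawing figures follow from the totals in the table. The total cost is rounded to the nearest cent, so the derived values are approximate:

```python
total_tokens = 30_862
total_cost_usd = 0.08
files_processed = 6

cost_per_drawing = total_cost_usd / files_processed
tokens_per_drawing = total_tokens / files_processed

print(f"Cost per drawing: ${cost_per_drawing:.3f}")    # $0.013
print(f"Tokens per drawing: {tokens_per_drawing:.0f}")  # 5144
```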

Example Upload Cost Breakdown

Cost Optimisations

Classification images compressed, reducing API costs by ~95%
Large drawings automatically resized before processing

Technical Notes

Fine-Tuning Research

We trained a custom APSIS model for title block extraction. While accurate, it was significantly slower than the standard model. After weighing trade-offs, we chose GPT-5.2 for its similar quality and faster processing times.

Current Infrastructure

Model Hosting: The system uses the GPT-5.2 model through the Agency AI OpenAI account, which is on usage Tier 4. This provides high rate limits (2M tokens/min) with pay-as-you-go billing.

Future Options To Consider:

Option	Pros	Cons
APSIS OpenAI Account	Direct billing to APSIS, same API	Requires $250 spend to reach Tier 4 rate limits
APSIS Azure OpenAI	UK data residency, Azure AD integration, enterprise SLA	Lower default rate limits (must request increases from Microsoft)

Key Trade-offs:

OpenAI Direct: Higher rate limits automatically, latest models first, but data processed in US
Azure OpenAI: Better for UK/EU compliance and enterprise integration, but slower rate limit approvals

Rate Limiting: No per-user limits are currently enforced. Options to consider include per-user token quotas, concurrent request limits, or hosting multiple model instances with round-robin distribution. This needs to be discussed with Alex/Paul as soon as possible.
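To make one of these options concrete, a per-user token quota could look roughly like the fixed-window sketch below. Nothing like this is implemented today. The class name `TokenQuota`, the quota size, and the window length are all hypothetical values chosen for illustration.

```python
import time
from typing import Dict, Optional


class TokenQuota:
    """Fixed-window per-user token quota (illustrative only)."""

    def __init__(self, tokens_per_window: int, window_seconds: float = 60.0) -> None:
        self.tokens_per_window = tokens_per_window
        self.window_seconds = window_seconds
        self._used: Dict[str, int] = {}
        self._window_start: Dict[str, float] = {}

    def try_consume(self, user_id: str, tokens: int, now: Optional[float] = None) -> bool:
        """Record the usage and return True if it fits in the user's quota."""
        now = time.monotonic() if now is None else now
        start = self._window_start.get(user_id)
        if start is None or now - start >= self.window_seconds:
            # First request, or the previous window has expired: start a new one.
            self._window_start[user_id] = now
            self._used[user_id] = 0
        if self._used[user_id] + tokens > self.tokens_per_window:
            return False
        self._used[user_id] += tokens
        return True


quota = TokenQuota(tokens_per_window=10_000, window_seconds=60.0)
print(quota.try_consume("alice", 6_000, now=0.0))   # fits in the window
print(quota.try_consume("alice", 6_000, now=1.0))   # would exceed the quota
print(quota.try_consume("alice", 6_000, now=61.0))  # new window
```

A fixed window is the simplest variant to reason about. A sliding window or token bucket would smooth out bursts at window boundaries, at the cost of more bookkeeping.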

v1.0.0

November 2024

Initial release of the APSIS Document Analysis API.

Features:

Extract Data - Automatically extract metadata from drawings and specifications
Detect Revision - Track document versions and predict next revision numbers
Compare Specifications - Visual diff and AI analysis of specification changes
Document Search - Semantic search using natural language
Drawing Compare - Visual comparison of construction drawings