426 Commits

Author SHA1 Message Date
Gunnsteinn Hall
c9439c80ed Merge branch 'v3' into extract.text 2018-11-29 08:51:58 +00:00
Peter Williams
88c9b05dff Merge branch 'v3' of https://github.com/unidoc/unidoc into extract.text 2018-11-29 17:12:40 +11:00
Peter Williams
94dca18b60 removed a comment 2018-11-29 17:09:45 +11:00
Peter Williams
7bbcec65fa Made Matrix and Point structs more general and moved them to their own files in pdf/model. 2018-11-29 17:04:20 +11:00
Denys Smirnov
27efe08a26 cmap: remove global for missing code; should replace the rune afterwards 2018-11-29 04:52:23 +02:00
Denys Smirnov
8a4c4069b7 textencoding: unexport CodeToGlyph field 2018-11-29 04:42:35 +02:00
Denys Smirnov
6fddd80eba textencoding: assert the type of differences map 2018-11-29 04:40:25 +02:00
Denys Smirnov
7c8d88185c fonts: assert type of another map; add some comments 2018-11-29 04:30:37 +02:00
Denys Smirnov
b91c1b8c61 model: remove unnecessary type names in font initialization 2018-11-29 04:19:29 +02:00
Denys Smirnov
46d22eac31 fonts: introduce types for GIDs and char codes; fix shadowing bug 2018-11-29 04:19:29 +02:00
Denys Smirnov
ab62ff5060 fonts: specify rune type as a key for Chars and runeToWidth 2018-11-29 04:19:29 +02:00
Denys Smirnov
6c0fd1e780 cmap: mapped values are runes, not strings 2018-11-29 04:19:29 +02:00
Gunnsteinn Hall
e6b768c06c Remove GetAverageCharWidth 2018-11-29 01:09:34 +00:00
Denys Smirnov
5b0eaf3f3a creator: make output stable when using custom fonts; fixes #232 2018-11-29 02:56:26 +02:00
Gunnsteinn Hall
f04f83b271 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into v3-peterwilliams97-extract.text 2018-11-28 23:33:31 +00:00
Gunnsteinn Hall
d29f9a6a34 Adding Height and Width methods for PdfRectangle 2018-11-28 23:25:31 +00:00
Gunnsteinn Hall
520ab09a72 Addressing review comments 2018-11-28 23:25:17 +00:00
Peter Williams
da8544e68b Moved Matrix code to model/matrix.go 2018-11-28 22:29:35 +11:00
Peter Williams
36a1148962 Combine diacritics in text extraction. 2018-11-28 18:06:03 +11:00
Peter Williams
f373881a48 Removed some unused struct fields. 2018-11-27 13:37:12 +11:00
Peter Williams
c898ce847a Removed non-text-extraction code. 2018-11-26 18:00:15 +11:00
Peter Williams
478c5dfe56 removed debug code 2018-11-26 17:45:41 +11:00
Peter Williams
536c688001 Fixed orientation handling in text extraction. 2018-11-26 17:17:17 +11:00
Peter Williams
a2024b8e29 Use char width 250 for standard 14 font characters without given char metrics. 2018-11-23 11:21:51 +11:00
Peter Williams
92e3e455c2 Merge branch 'v3' of https://github.com/unidoc/unidoc into extract 2018-11-22 22:03:26 +11:00
Peter Williams
6e5e32dd92 Fixed encoding selection for standard 14 fonts. 2018-11-22 22:01:04 +11:00
Peter Williams
8b964f2008 Set font even when Tf operator is not between BT and ET. 2018-11-21 13:14:11 +11:00
Peter Williams
dcb2b14d55 Handle standard 14 TrueType fonts and standard 14 font aliases in text extraction. 2018-11-20 17:49:37 +11:00
Peter Williams
cad144cec3 Handle missing widths in text extraction 2018-11-20 15:49:28 +11:00
Peter Williams
2f8b50af75 Fixed landscape rotation for text extraction.
Also compute metrics for standard 14 fonts when not created from dict.
2018-11-19 16:50:28 +11:00
Peter Williams
a9019a50a3 Fixes for text extraction corpus testing.
- Correct matrix multiplication order in text.go
- Look up standard 14 font widths after applying custom encoding.
2018-11-18 17:21:30 +11:00
Denys Smirnov
2d7d6334bc fonts: add tests for ttf parser 2018-11-17 15:03:38 +01:00
Denys Smirnov
86a30df78c fonts: floats should be signed 2018-11-17 15:03:34 +01:00
Peter Williams
851aa267b1 Added test for position based text extraction 2018-11-12 11:04:09 +11:00
Peter Williams
a1d5e8dc45 Cleaned up some comments. 2018-11-10 21:41:47 +11:00
Peter Williams
75aa370467 Updated font_test.go for treating æ and Æ as letters rather than ligatures. 2018-11-10 08:56:47 +11:00
Peter Williams
a2342ec6c6 First attempt at getting font metrics by character code. 2018-11-08 15:20:12 +11:00
Denys Smirnov
c8c7a03896 fonts: fix glyph id bounds check 2018-11-07 22:09:57 +02:00
Denys Smirnov
08c1fe4ed4 fonts: remove unused field 2018-11-07 22:09:57 +02:00
Peter Williams
a6ce81c001 Merge branch 'render.v3.hungarian' into extract 2018-11-02 15:13:48 +11:00
Peter Williams
3da4ffc5aa Merge 2018-11-01 21:33:51 +11:00
Peter Williams
7217946134 Merge branch 'v3' of https://github.com/unidoc/unidoc into render.v3.hungarian 2018-11-01 15:02:08 +11:00
Peter Williams
b0c440dd00 Fixed text position tracking. 2018-10-30 21:55:30 +11:00
Peter Williams
5e8ca9c18c Fixed code->glyph mapping for TrueType fonts for raw number gid 2018-10-29 09:08:32 +11:00
Gunnsteinn Hall
4e2e3defba Merge branch 'v3' into v3-enhance-forms 2018-10-23 12:09:01 +00:00
Gunnsteinn Hall
d756c17011 Addressing PR 238 review comments 2018-10-23 12:03:47 +00:00
Gunnsteinn Hall
8007138bd3 Addressing PR 238 review comments 2018-10-23 11:43:02 +00:00
Peter Williams
b23600c9f4 Merge branch 'render.v3.hungarian' into extract 2018-10-23 10:59:59 +11:00
Peter Williams
5d15dc97dd Removed code with problematic provenance. 2018-10-23 10:44:58 +11:00
Peter Williams
86108bd2b9 Build font descriptor literals from .afm files 2018-10-23 10:36:38 +11:00