881 Commits

Author SHA1 Message Date
Denys Smirnov
7cdbb0c572 Merge remote-tracking branch 'origin/v3' into extract.text
# Conflicts:
#	pdf/internal/textencoding/truetype.go
#	pdf/model/font.go
#	pdf/model/font_composite.go
#	pdf/model/font_simple.go
#	pdf/model/font_test.go
#	pdf/model/fonts/ttfparser.go
2018-12-07 18:30:37 +02:00
Gunnsteinn Hall
12ea1a5927 Merge branch 'v3' into font_codes_strict 2018-12-07 15:15:27 +00:00
Gunnsteinn Hall
dc263c9820 Merge branch 'v3' into v3-peterwilliams97-extract.text.take2 2018-12-07 12:17:07 +00:00
Gunnsteinn Hall
1f56c18454 Address review comments 2018-12-07 10:32:49 +00:00
Denys Smirnov
4e24c0280a textencoding: rename variables and add relevant notes 2018-12-06 20:22:06 +02:00
Adrian-George Bostan
05b9ddcb2e Fix split text chunks containing link annotations 2018-12-03 20:01:50 +02:00
Peter Williams
8c1c2aa926 left-to-write -> left-to-right 2018-12-02 18:41:48 +11:00
Peter Williams
d2f1728672 Addressed review comments.
- Removed debug code.
- Explained magic constants
- Added file reference to PdfBox map.
2018-12-02 18:13:40 +11:00
Peter Williams
c4a39a1353 Look for CharMetrics for char code 32 when finding space width. 2018-12-02 13:12:10 +11:00
Peter Williams
835f329c28 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into extract.text 2018-12-02 10:02:16 +11:00
Peter Williams
9c258551ad Documented font code. Fall back to StandardEncoding when no encoding is specified for a font. 2018-12-02 09:14:58 +11:00
Adrian-George Bostan
999b403ad4 Merge branch 'v3' into paragraph-link-support 2018-12-01 12:02:12 +02:00
Adrian-George Bostan
080cf29fa0 Further improve code documentation 2018-12-01 11:56:38 +02:00
Adrian-George Bostan
7912d378a9 Improve documentation comments 2018-12-01 11:47:22 +02:00
Gunnsteinn Hall
2b1c796a74 Addressing review comments 2018-11-30 23:01:04 +00:00
Adrian-George Bostan
7e7292dbff Add styled paragraph links test case. 2018-11-30 19:28:47 +02:00
Gunnsteinn Hall
283c9bf778 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into v3-peterwilliams97-extract.text.take2 2018-11-30 17:05:49 +00:00
Gunnsteinn Hall
33843599f2 Another round of addressing review comments 2018-11-30 16:53:48 +00:00
Adrian-George Bostan
4a6d5da26c Add default link style to paragraph 2018-11-30 18:45:48 +02:00
Adrian-George Bostan
9ac309464a Adjust annotation position for all paragraph alignments 2018-11-30 18:21:47 +02:00
Adrian-George Bostan
09e41acdf9 Reverse internal link coordinate system Y axis 2018-11-30 18:09:05 +02:00
Peter Williams
f566fe5f68 Moved point.go and matrix.go back to their original locations. 2018-11-30 12:17:52 +11:00
Peter Williams
785a83e866 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into extract.text
NOTE: Fixed a text_test.go regression by modifying getCharCodeMetrics().
2018-11-30 10:46:33 +11:00
Denys Smirnov
0436f2c974 validate shex length in cmaps; add comments 2018-11-29 23:43:00 +02:00
Denys Smirnov
fb4a087a93 textencoding: introduce GlyphName type 2018-11-29 23:24:40 +02:00
Gunnsteinn Hall
c9439c80ed Merge branch 'v3' into extract.text 2018-11-29 08:51:58 +00:00
Peter Williams
f131af7b5a File missed in previous commit. 2018-11-29 17:50:43 +11:00
Peter Williams
88c9b05dff Merge branch 'v3' of https://github.com/unidoc/unidoc into extract.text 2018-11-29 17:12:40 +11:00
Peter Williams
94dca18b60 removed a comment 2018-11-29 17:09:45 +11:00
Peter Williams
7bbcec65fa Made Matrix and Point structs more general and moved them to their own files in pdf/model. 2018-11-29 17:04:20 +11:00
Denys Smirnov
27efe08a26 cmap: remove global for missing code; should replace the rune afterwards 2018-11-29 04:52:23 +02:00
Denys Smirnov
e79be78aae textencoding: simplify the code of computeTables 2018-11-29 04:45:39 +02:00
Denys Smirnov
8a4c4069b7 textencoding: unexport CodeToGlyph field 2018-11-29 04:42:35 +02:00
Denys Smirnov
6fddd80eba textencoding: assert the type of differences map 2018-11-29 04:40:25 +02:00
Denys Smirnov
7c8d88185c fonts: assert type of another map; add some comments 2018-11-29 04:30:37 +02:00
Denys Smirnov
b91c1b8c61 model: remove unnecessary type names in font initialization 2018-11-29 04:19:29 +02:00
Denys Smirnov
46d22eac31 fonts: introduce types for GIDs and char codes; fix shadowing bug 2018-11-29 04:19:29 +02:00
Denys Smirnov
ab62ff5060 fonts: specify rune type as a key for Chars and runeToWidth 2018-11-29 04:19:29 +02:00
Denys Smirnov
6c0fd1e780 cmap: mapped values are runes, not strings 2018-11-29 04:19:29 +02:00
Gunnsteinn Hall
e6b768c06c Remove GetAverageCharWidth 2018-11-29 01:09:34 +00:00
Denys Smirnov
5b0eaf3f3a creator: make output stable when using custom fonts; fixes #232 2018-11-29 02:56:26 +02:00
Gunnsteinn Hall
f04f83b271 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into v3-peterwilliams97-extract.text 2018-11-28 23:33:31 +00:00
Gunnsteinn Hall
d29f9a6a34 Adding Height and Width methods for PdfRectangle 2018-11-28 23:25:31 +00:00
Gunnsteinn Hall
520ab09a72 Addressing review comments 2018-11-28 23:25:17 +00:00
Adrian-George Bostan
585470eebe Add styled paragraph support for internal link annotations 2018-11-28 22:19:30 +02:00
Adrian-George Bostan
e14d898abf Add styled paragraph support for external link annotations 2018-11-28 21:28:27 +02:00
Adrian-George Bostan
e89519d010 Add Clear method to PdfObjectArray 2018-11-28 21:24:20 +02:00
Peter Williams
da8544e68b Moved Matrix code to model/matrix.go 2018-11-28 22:29:35 +11:00
Peter Williams
ad83b1c948 In text extraction, split lines with tolerance on y coordinate. 2018-11-28 22:13:56 +11:00
Peter Williams
6529b42a70 Remove duplicate code. 2018-11-28 18:22:42 +11:00