33 Commits

Author SHA1 Message Date
Denys Smirnov
53687f854e Merge remote-tracking branch 'origin/v3' into extract.text
# Conflicts:
#	pdf/contentstream/processor.go
#	pdf/extractor/text.go
#	pdf/extractor/utils.go
#	pdf/internal/textencoding/winansi.go
#	pdf/model/font.go
#	pdf/model/font_composite.go
#	pdf/model/font_simple.go
#	pdf/model/font_test.go
#	pdf/model/fontfile.go
#	pdf/model/fonts/ttfparser.go
#	pdf/model/structures.go
2018-12-27 12:17:28 +02:00
Denys Smirnov
3687c83b37 errors should start with a lower case 2018-12-15 18:49:15 +05:00
Denys Smirnov
3f7ad73812 refactor some receiver and method names; fix typos in comments 2018-12-11 04:37:00 +02:00
Denys Smirnov
0a8b46daff don't use generic receiver names; make sure receiver name is consistent 2018-12-09 21:47:15 +02:00
Denys Smirnov
6d2c39043c make sure comments begin with a type/function name 2018-12-09 20:22:33 +02:00
Gunnsteinn Hall
2b1c796a74 Addressing review comments 2018-11-30 23:01:04 +00:00
Gunnsteinn Hall
283c9bf778 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into v3-peterwilliams97-extract.text.take2 2018-11-30 17:05:49 +00:00
Gunnsteinn Hall
33843599f2 Another round of addressing review comments 2018-11-30 16:53:48 +00:00
Peter Williams
f566fe5f68 Moved point.go and matrix.go back to their original locations. 2018-11-30 12:17:52 +11:00
Peter Williams
785a83e866 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into extract.text
NOTE: Fixed a text_test.go regression by modifying getCharCodeMetrics().
2018-11-30 10:46:33 +11:00
Gunnsteinn Hall
520ab09a72 Addressing review comments 2018-11-28 23:25:17 +00:00
Peter Williams
da8544e68b Moved Matrix code to model/matrix.go 2018-11-28 22:29:35 +11:00
Peter Williams
536c688001 Fixed orientation handling in text extraction. 2018-11-26 17:17:17 +11:00
Peter Williams
a815ca7271 Premultiply coordinate transforms to text matrix in text extraction. 2018-11-26 08:09:52 +11:00
Peter Williams
ea8a26a7dc Fixed text matrix multiplication order. 2018-11-19 14:19:50 +11:00
Peter Williams
851aa267b1 Added test for position based text extraction 2018-11-12 11:04:09 +11:00
Peter Williams
85cb1db004 Fixed position sorting for text extraction for landscape text. 2018-11-10 21:19:02 +11:00
Peter Williams
a6ce81c001 Merge branch 'render.v3.hungarian' into extract 2018-11-02 15:13:48 +11:00
Peter Williams
3da4ffc5aa Merge 2018-11-01 21:33:51 +11:00
Peter Williams
ee3e2a45a0 Update CTM 2018-10-29 15:49:15 +11:00
Gunnsteinn Hall
0d331d036f Update receiver name in ContentStreamProcessor 2018-10-15 10:23:25 +00:00
Peter Williams
2c8c8e5c98 Removed debugging code. 2018-10-09 19:05:38 +11:00
Peter Williams
f6dc3e2fc3 First attempt at splitting words in text extraction using a space detection heuristic 2018-10-09 11:49:59 +11:00
Peter Williams
44563f2cae Added fontMetrics to font loader and GetAverageCharWidth to PdfFont 2018-09-19 11:12:59 +10:00
Gunnsteinn Hall
23af4db2b3 Remove dot imports in contentstream pkg 2018-09-18 00:00:48 +00:00
Peter Williams
5bacca3437 formatting changes 2018-09-06 15:17:41 +10:00
Peter Williams
8ff8665149 First attempt at extraction based on a full PDF text parser. 2018-08-22 12:29:34 +10:00
Gunnsteinn Hall
e2bfa9094a Fixing lab colorspace component input ranges. Fix Indexed cs Image to rgb conversion. 2017-08-07 20:21:35 +00:00
Peter Williams
a1e1e31fee Merge remote-tracking branch 'upstream/master' into xmaster 2017-08-05 21:25:07 +10:00
Gunnsteinn Hall
22c2e5eb41 Address go vet issues 2017-08-04 22:50:28 +00:00
Peter Williams
141519790a Fixed some bugs found while getting pdf_descibe.go to work 2017-08-04 17:47:55 +10:00
Gunnsteinn Hall
e3c90b85b7 Cleaning up comments etc. 2017-04-05 18:05:38 +00:00
Gunnsteinn Hall
ce836c6f71 Support for Pattern, Shading objects. Various fixes and enhancements. 2017-04-04 05:51:58 +00:00