1113 Commits

Author SHA1 Message Date
Denys Smirnov
3c5fc18b01 textencoding: refactor encodings; better handling for differences 2019-01-01 17:20:01 +02:00
Denys Smirnov
622ae5668d textencoding: generate table for WinAnsi encoding from CP1252 2019-01-01 17:20:01 +02:00
Denys Smirnov
ac7696693b fonts: describe few issues with the code; remove unused cmap type 2019-01-01 17:19:58 +02:00
Denys Smirnov
83d8086657 model: reformat TODOs 2018-12-28 16:48:38 +02:00
Gunnsteinn Hall
e1f2286f9c Merge pull request #279 from dennwc/runes 2018-12-28 13:09:51 +00:00
    Get metrics by rune instead of a glyph name
Gunnsteinn Hall
99b944b64e Merge branch 'v3' into runes 2018-12-28 12:41:43 +00:00
Gunnsteinn Hall
84607f9914 Merge pull request #278 from unidoc/v3-update-jenkinsfile 2018-12-28 12:41:20 +00:00
    Require extractor private testdata in builds
Denys Smirnov
f6506204d7 fonts: simplify code by getting width of runes in font instead of glyphs 2018-12-28 01:38:48 +02:00
Denys Smirnov
107718c711 fonts: comment about Wy font metric 2018-12-28 01:08:50 +02:00
Denys Smirnov
eb04b2d594 fonts: remove unused name field in char metrics 2018-12-28 01:08:47 +02:00
Denys Smirnov
87ebf6af8f creator: don't use fmt if not needed 2018-12-28 01:03:15 +02:00
Gunnsteinn Hall
12af4cf62a Jenkinsfile: Require extractor tests with private testdata in build 2018-12-27 22:47:39 +00:00
Gunnsteinn Hall
15b9123536 Merge pull request #256 from peterwilliams97/extract.text 2018-12-27 17:55:46 +00:00
    Text extraction
Gunnsteinn Hall
99a19b0b8d remove duplicate log 2018-12-27 17:42:12 +00:00
Gunnsteinn Hall
8f031e7bdb remove panic in extractor 2018-12-27 17:18:52 +00:00
Denys Smirnov
dbbef4fd05 Merge remote-tracking branch 'peterwilliams97/extract.text' into extract.text 2018-12-27 12:40:55 +02:00
    # Conflicts:
    #	pdf/extractor/text.go
Denys Smirnov
8835230856 model: fix tests after the merge 2018-12-27 12:37:32 +02:00
Peter Williams
c70b66a00d Fixed incorrectly named variable. 2018-12-27 21:33:31 +11:00
Denys Smirnov
53687f854e Merge remote-tracking branch 'origin/v3' into extract.text 2018-12-27 12:17:28 +02:00
    # Conflicts:
    #	pdf/contentstream/processor.go
    #	pdf/extractor/text.go
    #	pdf/extractor/utils.go
    #	pdf/internal/textencoding/winansi.go
    #	pdf/model/font.go
    #	pdf/model/font_composite.go
    #	pdf/model/font_simple.go
    #	pdf/model/font_test.go
    #	pdf/model/fontfile.go
    #	pdf/model/fonts/ttfparser.go
    #	pdf/model/structures.go
Peter Williams
2fe54a4269 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into extract.text 2018-12-27 20:53:59 +11:00
Peter Williams
28957d37b8 fixed comment 2018-12-27 20:53:37 +11:00
Peter Williams
af99ee41db Recurse through form XObjects for text extractions. 2018-12-27 20:51:34 +11:00
Denys Smirnov
e729fa618d model: refactor CharcodesToUnicode to return string and remove TODO 2018-12-26 17:11:41 +02:00
Peter Williams
686a6e511e Merge branch 'v3-peterwilliams97-default-fontdescriptors' of https://github.com/unidoc/unidoc into extract.text 2018-12-21 16:32:33 +11:00
Gunnsteinn Hall
650dbf800c Merge pull request #270 from dennwc/std14font 2018-12-20 21:36:34 +00:00
    Replace Standard14Font with fonts.StdFont
Denys Smirnov
db8e50e457 model: fix wording in the comments 2018-12-19 16:59:13 +05:00
Denys Smirnov
217f984033 fonts: make standard font names type-safe 2018-12-19 16:55:27 +05:00
Denys Smirnov
85e1a02ac8 model: define an unexported pdfFont interface and remove error cases 2018-12-19 13:54:45 +05:00
Denys Smirnov
7f667d8fbb model: remove Standard14Font in favor of fonts.StdFont; resolves #269 2018-12-19 13:43:09 +05:00
Gunnsteinn Hall
2b718c9ba6 Merge pull request #260 from dennwc/font_interface 2018-12-18 16:08:01 +00:00
    Preparations for a new font interface
Denys Smirnov
5bf2527b57 creator: clarify use of the default encoding and a way to override it 2018-12-15 19:39:59 +05:00
Denys Smirnov
e3704defc7 rename Typ1 font to StdFont 2018-12-15 19:39:55 +05:00
Denys Smirnov
19f95527b8 creator: remove SetEncoder from top 2018-12-15 18:49:15 +05:00
Denys Smirnov
62420700db fix case typos in errors 2018-12-15 18:49:15 +05:00
Denys Smirnov
3687c83b37 errors should start with a lower case 2018-12-15 18:49:15 +05:00
Denys Smirnov
4abbe49007 remove unnecessary encoder override; add todo to check other code paths 2018-12-15 18:47:39 +05:00
Denys Smirnov
d5a69b817c model: move CID font width array code to function and add a test case 2018-12-15 18:47:39 +05:00
Denys Smirnov
d3664d0f85 fonts: make metric tables for type1 fonts more compact by sharing glyphs 2018-12-15 18:47:39 +05:00
Denys Smirnov
3c8e70256d fonts: reuse metrics tables where possible 2018-12-15 18:47:39 +05:00
Denys Smirnov
0ef989c713 fonts: group similar fonts to a single file 2018-12-15 18:47:39 +05:00
Denys Smirnov
3b1a92701f fonts: remove redundant Type1 font interface implementations 2018-12-15 18:47:39 +05:00
Denys Smirnov
59f694d99f fonts: remove broken SetEncoder method for most fonts 2018-12-15 18:47:39 +05:00
Denys Smirnov
4c99e7a692 textencoding: remove unused error value when making winansi encoding 2018-12-15 18:47:39 +05:00
Denys Smirnov
81bb03763b font: discovered a bug in SetEncoder 2018-12-15 18:47:39 +05:00
Denys Smirnov
7b4564aec5 model: clarify the usage of width map and ttf text encoder 2018-12-15 18:47:39 +05:00
Denys Smirnov
11081b20c5 fonts: clarify cid to gid mapping 2018-12-15 18:47:39 +05:00
Denys Smirnov
e07fa3b2c0 model: add a reference to width table format and simplify the code 2018-12-15 18:47:39 +05:00
Denys Smirnov
7e2a987f8a model: remove unused font width index 2018-12-15 18:47:39 +05:00
Denys Smirnov
2274cbdf8c fonts: add a function to make a text encoder from ttf font 2018-12-15 18:47:39 +05:00
Gunnsteinn Hall
1eed6fa36f Merge pull request #267 from dennwc/linter 2018-12-12 10:30:15 +00:00
    Fix code style issues