Denys Smirnov
ac7696693b
fonts: describe a few issues with the code; remove unused cmap type
2019-01-01 17:19:58 +02:00
Denys Smirnov
53687f854e
Merge remote-tracking branch 'origin/v3' into extract.text
...
# Conflicts:
# pdf/contentstream/processor.go
# pdf/extractor/text.go
# pdf/extractor/utils.go
# pdf/internal/textencoding/winansi.go
# pdf/model/font.go
# pdf/model/font_composite.go
# pdf/model/font_simple.go
# pdf/model/font_test.go
# pdf/model/fontfile.go
# pdf/model/fonts/ttfparser.go
# pdf/model/structures.go
2018-12-27 12:17:28 +02:00
Denys Smirnov
11081b20c5
fonts: clarify cid to gid mapping
2018-12-15 18:47:39 +05:00
Denys Smirnov
2274cbdf8c
fonts: add a function to make a text encoder from ttf font
2018-12-15 18:47:39 +05:00
Denys Smirnov
0a8b46daff
don't use generic receiver names; make sure receiver name is consistent
2018-12-09 21:47:15 +02:00
Denys Smirnov
9f0df8945d
don't use XXX for TODOs
2018-12-09 21:39:11 +02:00
Denys Smirnov
6d2c39043c
make sure comments begin with a type/function name
2018-12-09 20:22:33 +02:00
Denys Smirnov
99f3184879
define slices with a var instead of an empty literal
2018-12-09 19:28:50 +02:00
Denys Smirnov
7cdbb0c572
Merge remote-tracking branch 'origin/v3' into extract.text
...
# Conflicts:
# pdf/internal/textencoding/truetype.go
# pdf/model/font.go
# pdf/model/font_composite.go
# pdf/model/font_simple.go
# pdf/model/font_test.go
# pdf/model/fonts/ttfparser.go
2018-12-07 18:30:37 +02:00
Peter Williams
835f329c28
Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into extract.text
2018-12-02 10:02:16 +11:00
Peter Williams
9c258551ad
Documented font code. Fall back to StandardEncoding when no encoding is specified for a font.
2018-12-02 09:14:58 +11:00
Gunnsteinn Hall
2b1c796a74
Addressing review comments
2018-11-30 23:01:04 +00:00
Gunnsteinn Hall
33843599f2
Another round of addressing review comments
2018-11-30 16:53:48 +00:00
Denys Smirnov
fb4a087a93
textencoding: introduce GlyphName type
2018-11-29 23:24:40 +02:00
Denys Smirnov
7c8d88185c
fonts: assert type of another map; add some comments
2018-11-29 04:30:37 +02:00
Denys Smirnov
46d22eac31
fonts: introduce types for GIDs and char codes; fix shadowing bug
2018-11-29 04:19:29 +02:00
Denys Smirnov
ab62ff5060
fonts: specify rune type as a key for Chars and runeToWidth
2018-11-29 04:19:29 +02:00
Denys Smirnov
6c0fd1e780
cmap: mapped values are runes, not strings
2018-11-29 04:19:29 +02:00
Peter Williams
92e3e455c2
Merge branch 'v3' of https://github.com/unidoc/unidoc into extract
2018-11-22 22:03:26 +11:00
Peter Williams
8b964f2008
Set font even when Tf operator is not between BT and ET.
2018-11-21 13:14:11 +11:00
Peter Williams
cad144cec3
Handle missing widths in text extraction
2018-11-20 15:49:28 +11:00
Denys Smirnov
86a30df78c
fonts: floats should be signed
2018-11-17 15:03:34 +01:00
Denys Smirnov
c8c7a03896
fonts: fix glyph id bounds check
2018-11-07 22:09:57 +02:00
Denys Smirnov
08c1fe4ed4
fonts: remove unused field
2018-11-07 22:09:57 +02:00
Peter Williams
3da4ffc5aa
Merge
2018-11-01 21:33:51 +11:00
Peter Williams
5e8ca9c18c
Fixed code->glyph mapping for TrueType fonts for raw number gid
2018-10-29 09:08:32 +11:00
Gunnsteinn Hall
aea91f1ba9
Merge branch 'v3' into v3-enhance-forms
2018-09-29 16:59:16 +00:00
Peter Williams
f953c11452
Don't return errors for TrueType font file tables with no PostScript entry in their "name" table.
...
This is needed for PDFs created with Tesseract.
2018-09-24 18:02:02 +10:00
Peter Williams
b0f5329425
Allow TrueType font files to not have PostScript entries in their "name" table.
2018-09-24 17:53:12 +10:00
Peter Williams
69be54d501
Cleaned up some comments.
2018-09-21 16:43:10 +10:00
Peter Williams
b18c8ca93d
Add ToUnicode map when embedding Type0 CIDType2 fonts in PDF files.
2018-09-17 17:57:52 +10:00
Peter Williams
b7f1f3e291
Merge branch 'v3' of https://github.com/unidoc/unidoc into render.v3.hungarian
2018-08-22 22:01:00 +10:00
Peter Williams
c2feafdfdc
Fixed some issues in creator code
...
Stopped double converting from Go strings to PDF encoded strings
Added TTF parse table format 12
2018-08-17 08:41:35 +10:00
Peter Williams
d64785a8ca
Added more font tests
2018-08-14 21:28:57 +10:00
Gunnsteinn Hall
7bac3c779c
Merge branch 'v3' into enhance-forms
2018-08-03 21:15:21 +00:00
Gunnsteinn Hall
6c34f32c7f
Updating headers and package descriptions
2018-08-03 10:15:42 +00:00
Peter Williams
08c3211590
Refactored simple textencoding
...
Made GlyphToCode work for all tables
Moved more aliases into glyphAliases rather than leaving the duplicates in the base maps.
Use SimpleEncoder explicitly for simple fonts
2018-07-31 11:52:24 +10:00
Peter Williams
b1cf3494f7
Removed naked returns. Fixed godoc. Reorganized object extractors
2018-07-25 12:00:49 +10:00
Peter Williams
e886846c6a
Changes after pull request review
2018-07-24 21:32:02 +10:00
Peter Williams
879b07df16
Added a test for CharcodeBytesToUnicode for Type0 ToUnicode cmaps
2018-07-19 10:28:23 +10:00
Peter Williams
6582182078
reduced differences with compositefont branch
2018-07-15 16:28:56 +10:00
Peter Williams
ae87dc79f3
keep going when FontFile2 encoding is empty
2018-07-13 21:15:03 +10:00
Peter Williams
bc1e9ae7b5
Refactored font code to improve text extraction
2018-07-13 17:40:27 +10:00
Peter Williams
199a74dbd8
Major changes to font code
...
- Added Type1 font parsing.
- Added Standard 14 font parsing.
- Fixed some bugs in cmap code.
- Started re-structuring of font code. Moved common font fields to `fontSkeleton`
2018-06-27 12:25:59 +10:00
Gunnsteinn Hall
646329ff21
Initial support for composite fonts (Type0 and CIDFontType2).
...
Simplified creator paragraph handling of text encoding.
Character codes expanded to 16bit instead of 8bit.
2017-09-01 13:20:51 +00:00
Gunnsteinn Hall
1a5c3eb4ac
Initial import of PDF creator with text, image adding capabilities
2017-07-05 23:10:57 +00:00