72 Commits

Author SHA1 Message Date
Gunnsteinn Hall
1f56c18454 Address review comments 2018-12-07 10:32:49 +00:00
Peter Williams
835f329c28 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into extract.text 2018-12-02 10:02:16 +11:00
Peter Williams
9c258551ad Documented font code. Fall back to StandardEncoding when no encoding is specified for a font. 2018-12-02 09:14:58 +11:00
Gunnsteinn Hall
2b1c796a74 Addressing review comments 2018-11-30 23:01:04 +00:00
Gunnsteinn Hall
283c9bf778 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into v3-peterwilliams97-extract.text.take2 2018-11-30 17:05:49 +00:00
Gunnsteinn Hall
33843599f2 Another round of addressing review comments 2018-11-30 16:53:48 +00:00
Peter Williams
785a83e866 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into extract.text
NOTE: Fixed a text_test.go regression by modifying getCharCodeMetrics().
2018-11-30 10:46:33 +11:00
Gunnsteinn Hall
f04f83b271 Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into v3-peterwilliams97-extract.text 2018-11-28 23:33:31 +00:00
Gunnsteinn Hall
520ab09a72 Addressing review comments 2018-11-28 23:25:17 +00:00
Peter Williams
f373881a48 Removed some unused struct fields. 2018-11-27 13:37:12 +11:00
Peter Williams
478c5dfe56 removed debug code 2018-11-26 17:45:41 +11:00
Peter Williams
536c688001 Fixed orientation handling in text extraction. 2018-11-26 17:17:17 +11:00
Peter Williams
6e5e32dd92 Fixed encoding selection for standard 14 fonts. 2018-11-22 22:01:04 +11:00
Peter Williams
dcb2b14d55 Handle standard 14 TrueType fonts and standard 14 font aliases in text extraction. 2018-11-20 17:49:37 +11:00
Peter Williams
cad144cec3 Handle missing widths in text extraction 2018-11-20 15:49:28 +11:00
Peter Williams
2f8b50af75 Fixed landscape rotation for text extraction.
Also compute metrics for standard 14 fonts when not created from dict.
2018-11-19 16:50:28 +11:00
Peter Williams
a9019a50a3 Fixes for text extraction corpus testing.
- Correct matrix multiplication order in text.go
- Look up standard 14 font widths after applying custom encoding.
2018-11-18 17:21:30 +11:00
Peter Williams
a1d5e8dc45 Cleaned up some comments. 2018-11-10 21:41:47 +11:00
Peter Williams
a2342ec6c6 First attempt at getting font metrics by character code. 2018-11-08 15:20:12 +11:00
Peter Williams
3da4ffc5aa Merge 2018-11-01 21:33:51 +11:00
Peter Williams
b0c440dd00 Fixed text position tracking. 2018-10-30 21:55:30 +11:00
Peter Williams
45f6c09e39 Merge branch 'render.v3.hungarian' into extract 2018-10-19 10:05:02 +11:00
Peter Williams
b48010c75b Fixed typo 2018-10-18 21:39:16 +11:00
Peter Williams
45228219b5 Added PdfFont.FontDescriptor() which always returns a PdfFontDescriptor, possibly a builtin one for the standard 14 fonts.
2018-10-18 21:12:15 +11:00
Peter Williams
2452973cfe Don't add /Encoding entry to standard 14 font dicts.
Moved the standard 14 font encoders to a separate field pdfFontSimple.std14Encoder.
2018-10-16 14:50:43 +11:00
Gunnsteinn Hall
f4deb858ba Fix for loading standard fonts with Encoding difference maps 2018-10-09 18:14:34 +11:00
Peter Williams
24d522bdb2 Merge branch 'v3' of https://github.com/unidoc/unidoc into extract 2018-09-24 15:25:44 +10:00
Peter Williams
c76fa6985e Moved font cache from global variable to Extractor. 2018-09-22 09:28:18 +10:00
Peter Williams
75dfdb6f1c Use Standard14Font consistently for standard 14 font names. 2018-09-21 16:25:57 +10:00
Peter Williams
dc6f50aa93 Improvements to text extraction. 2018-09-20 11:49:44 +10:00
Peter Williams
44563f2cae Added fontMetrics to font loader and GetAverageCharWidth to PdfFont 2018-09-19 11:12:59 +10:00
Peter Williams
b18c8ca93d Add ToUnicode map when embedding Type0 CIDType2 fonts in PDF files. 2018-09-17 17:57:52 +10:00
Peter Williams
4d5156c4a0 Added NewStandard14FontMustCompile 2018-09-07 19:11:58 +10:00
Peter Williams
f84792531f Fixed bugs introduced into creator_test.go by font changes.
- Use pdfFontType0 Encoding value for encoding name if it is set.
- Use DW entry in CID Type2 fonts
- Encode CID fonts using 2 bytes / character
2018-09-03 10:48:31 +10:00
Peter Williams
660c1b934b added change left out of previous commit 2018-08-22 15:05:34 +10:00
Peter Williams
e5ec5406f6 Changes to get creator working better.
- textencoding.RuneToGlyph always returns a value
- Encode empty /Difference correctly
- NewParagraph sets a custom encoding that matches its text
2018-08-22 14:51:50 +10:00
Peter Williams
cc4f64fa98 small tidy up 2018-08-21 12:43:51 +10:00
Peter Williams
8b6a14a2f9 Added a function to create an encoder for a specified alphabet for simple fonts. 2018-08-20 17:58:01 +10:00
Peter Williams
e886846c6a Changes after pull request review 2018-07-24 21:32:02 +10:00
Peter Williams
e2b4f908bd removed panics 2018-07-23 17:14:42 +10:00
Hiroshi Muramatsu
5257855e29 Define font descriptor flags 2018-07-23 13:21:13 +10:00
Peter Williams
502836666d Merge remote-tracking branch 'upstream/v3' into render.v3 2018-07-21 21:20:39 +10:00
Peter Williams
2468e2b264 Merge branch 'render' into render.clean 2018-07-21 14:18:48 +10:00
Peter Williams
28d2d223c4 Reduced logging noise 2018-07-21 08:53:59 +10:00
Peter Williams
c489d4630c Added some logging 2018-07-21 08:43:03 +10:00
Peter Williams
7f5475badb attempting to simplify render branch 2018-07-16 17:42:08 +10:00
Peter Williams
53209c7170 unpack ligatures 2018-07-16 08:09:23 +10:00
Peter Williams
6582182078 reduced differences with compositefont branch 2018-07-15 16:28:56 +10:00
Peter Williams
bc1e9ae7b5 Refactored font code to improve text extraction 2018-07-13 17:40:27 +10:00
Hiroshi Muramatsu
299f65df69 Remove unnecessary argument 2018-07-11 09:04:17 +10:00