1705 Commits

Author SHA1 Message Date
Peter Williams
e6be02163c Merge branch 'development' of https://github.com/unidoc/unipdf into columns 2020-06-15 10:42:21 +10:00
Peter Williams
975e03811f Removed funny character from text_test.go 2020-06-15 10:41:49 +10:00
Adrian-George Bostan
99ef1b861d
Combo field appearance (#370)
* Fix combo field appearances not being shown

* Fix V object type for choice and button fields

* Refactor form fill for combo and checkbox fields

* Add fill test case for text, combo and checkbox fields

* Prevent panic when flattening forms using a nil appearance generator
2020-06-10 16:58:00 +00:00
Adrian-George Bostan
6cb58f6327
Add configurable font fallback options for form fields (#368)
* Add configurable fallback font support for form fill/flatten

* Add appearance font to AcroForm DR

* Refactor DA process method

* Remove unnecessary font default size variable

* Minor refactor in the appearance generation functions

* Improve processDA appearance style method

* Use original font container if present in DR

* Maintain original appearance font autosizing behavior
2020-06-09 15:16:54 +00:00
Adrian-George Bostan
6b8d5c42f7
Fix outline null object check (#367) 2020-06-05 11:46:55 +00:00
Peter Williams
b4d90b6402 Absorb text to the left of paras into paras e.g. Footnote numbers 2020-06-05 21:43:09 +10:00
Peter Williams
30fc953954 Check for textParas that are on the same line when writing out extracted text. 2020-06-05 15:44:31 +10:00
Peter Williams
16b3c1c450 Removed commented code 2020-06-05 14:21:53 +10:00
Peter Williams
af9508cc5c Added tests for columns extraction. 2020-06-05 14:01:31 +10:00
Peter Williams
29f2d9b8cf Merge branch 'development' of https://github.com/unidoc/unipdf into columns 2020-06-05 11:43:04 +10:00
Peter Williams
5777ee1394
Handle multibyte entries in CMaps. (#353)
* Fixed filename:page in logging

* Got CMap working for multi-rune entries

* Treat CMap entries as strings instead of runes to handle multi-byte encodings.

* Added a test for multibyte encoding.

* Changed rune->CharCode maps to string->CharCode.

* Removed unintentional changes.

* Updated comments to match new function definitions.

* Changed some []rune APIs to string

* Fixes for reviewer comments.
2020-06-03 13:55:15 +00:00
Peter Williams
40806d7f96 Adding tables to text extractor. 2020-06-01 14:04:32 +10:00
Gunnsteinn Hall
4508e17036
Merge pull request #364 from adrg/flatten-text-field-rotation
Account for rotation when generating flattened text field appearances
2020-05-29 17:35:58 +00:00
Adrian-George Bostan
d6e1cb5761 Account for rotation when generating flattened text field appearances 2020-05-29 17:49:00 +03:00
Peter Williams
49bbef0442 More verbose logging 2020-05-29 08:58:23 +10:00
Peter Williams
a14d8e73d8 Fixed typo 2020-05-28 12:10:49 +10:00
Peter Williams
2260e245f7 Handle more cases of fonts not being set in text extraction code. 2020-05-28 12:08:15 +10:00
Peter Williams
418f859d44 Reinstated hyphen suppression 2020-05-27 21:11:47 +10:00
Peter Williams
d21e2f83c4 Got text_test.go passing. 2020-05-27 18:15:18 +10:00
Peter Williams
6b4314f97c Adjusted inter-word search distance to make paragraph division work for thanh.pdf 2020-05-26 18:53:23 +10:00
Peter Williams
fad1552009 Fixed text state save/restore. 2020-05-26 13:26:09 +10:00
Adrian-George Bostan
d078608da4
Account for parent CTM when calculating positions of extracted forms (#349)
* Take parent CTM into account for form field text

* Pass a modified graphics state instance to new text objects
2020-05-25 23:34:44 +00:00
Peter Williams
603b5ff4e7 Added function comments. 2020-05-25 14:00:00 +10:00
Peter Williams
c515472849 Abstracted textWord depth calculation. This required changing textMark to *textMark in a lot of code. 2020-05-25 09:39:30 +10:00
Peter Williams
83033182fa fixed typos 2020-05-24 21:23:33 +10:00
Peter Williams
a5c538f420 Added an explanation of the text columns code to README.md. 2020-05-24 21:16:48 +10:00
Peter Williams
6b13a99b82 First version of text extraction that recognizes columns 2020-05-24 21:00:37 +10:00
Peter Williams
e9c46fa3b9 Merge branch 'cmap' into columns 2020-05-24 20:45:31 +10:00
Adrian-George Bostan
5efaa02e23
Use page indirect object for internal outline destinations (#359)
* Use page indirect object for internal outlines

* Use page indirect object in creator outline destinations

* Adapt creator test case to test outline creation and retrieval
2020-05-22 16:19:43 +00:00
Adrian-George Bostan
033f410eac
Account for inverted annotation rects when calculating appearance bounds (#357) 2020-05-20 17:58:54 +00:00
Adrian-George Bostan
d2941b5477
Add reader method for checking if the AcroForm needs repair (#356)
* Add AcroFormNeedsRepair method

* Add AcroForm repair check test case
2020-05-20 16:04:02 +00:00
Peter Williams
6103fb8ea3 Merge branch 'development' of https://github.com/unidoc/unipdf into cmap 2020-05-20 19:40:30 +10:00
Peter Williams
0c54cec2c5 Added a test for multibyte encoding. 2020-05-20 19:07:22 +10:00
Peter Williams
a9910e7e06 Treat CMap entries as strings instead of runes to handle multi-byte encodings. 2020-05-20 18:43:09 +10:00
Adrian-George Bostan
80d51c5532
Add reader AcroForm repair functionality (#351)
* Add method for retrieving widget parent form field

* Add reader method for repairing AcroForm

* Add AcroForm repair test case

* Add AcroForm repair options

* RepairAcroForm documentation improvements
2020-05-19 12:42:07 +00:00
Peter Williams
22680be097 Got CMap working for multi-rune entries 2020-05-19 14:57:27 +10:00
Peter Williams
6fe0d20a86 Fixed filename:page in logging 2020-05-19 11:46:51 +10:00
Adrian-George Bostan
6246921ab3
Fix table right-aligned content (#348)
* Fix cell content width in right-aligned table cells

* Add table horizontal alignment test case

* Fix import style
2020-05-12 15:47:52 +00:00
Gunnsteinn Hall
ad2a1e9c9d
Subsetting fixes (#346)
* Update unitype lib which improves subsetting

* Add text extraction check to creator font subsetting example

Helps ensure ToUnicode map is set correctly.

* Clean up import

* Fix spelling
2020-05-12 07:15:09 +00:00
Adrian-George Bostan
aef6e5e976
Fix CMap generation and serialization for composite fonts (#344)
* Fix CMap charcode mapping serialization

* Improve CMap generation in the NewCompositePdfFontFromTTF function
2020-05-08 00:15:09 +00:00
Adrian-George Bostan
f60e313cdb
Fix incorrect render of invoice totals (#342)
* Fix rendering invoice totals area on page breaks

* Improve invoice simple test case
2020-05-06 20:12:44 +00:00
Gunnsteinn Hall
9ef2f27694
Support for subsetting fonts (#335)
* Subsetting of TrueType CID fonts using unitype

* Simplify call to SubsetRegistered so can be done right after loading font via creator finalizer

* Add an EnableFontSubsetting function on the creator to simplify font subsetting for creator users
2020-05-05 00:17:27 +00:00
Gunnsteinn Hall
f445a10391
Merge pull request #336 from unidoc/master
Master into development
2020-05-02 14:23:01 +00:00
Adrian-George Bostan
c17719d232
Invoice component improvements (#334)
* Add invoice address heading field

* Update invoice test cases

* Add default value to buyer address heading

* Add Street2 and State address fields

* Add configurable address separator field

* Improve invoice test cases
2020-05-01 12:56:43 +00:00
Adrian-George Bostan
30db3448f7
Add support for multi-block styled paragraphs (#331)
* Add support for multi-block styled paragraphs

* Fix context space when drawing division inside tables

* Update context height when drawing tables

* Update advanced invoice test case

* Add basic multi-block styled paragraph test case
2020-04-29 19:22:00 +00:00
Adrian-George Bostan
d84d0c4375
Form fill fixes (#328)
* Parse form fields with embedded widget annotations

* Try matching fields both by partial and full names on form fill

* Use default font if widget font is not found when generating appearance

* Add JSON extract and fill test case
2020-04-24 16:48:06 +00:00
Gunnsteinn Hall
0a9a9582e0
Merge pull request #327 from gunnsth/release/v3.6.2
Prepare unipdf release v3.6.2
v3.6.2
2020-04-23 11:24:54 +00:00
Gunnsteinn Hall
db242dbcff Update version.go for 3.6.2 2020-04-23 01:36:32 +00:00
Adrian-George Bostan
cb0166e96b
Add low level PageLabels support (#325)
* Add reader method for retrieving the PageLabels entry from the catalog
* Add writer method for setting the PageLabels entry in the catalog.
* Add creator method for adding page labels for the output file
* Add creator page labels test case
* Minor page labels test case correction
2020-04-22 21:17:33 +00:00
Alexey Pavlyukov
a69d788171
Add timestamp signature handler (#301)
* Add timestamp signature handler

* Add timestamp signature handler test

* fix PR issues

* fix PR issues

* fix PR issues

* Fix

Co-authored-by: Gunnsteinn Hall <gunnsteinn.hall@gmail.com>
2020-04-22 20:21:53 +00:00