Adrian-George Bostan
550d79472e
Add option to generate appearance dicts when filling form fields ( #128 )
2019-07-22 17:19:02 +00:00
Peter Williams
9ebcfcf168
Finding bounding boxes of substrings of extracted text. ( #109 )
...
* Added text bounding box extraction.
* Add `font` field to textMark struct.
* Create a new `TextComponents` method to retrieve all the text components of the text extracted from the page, with position and character information.
* Reorganizing extractor/text.go
* Added a text extraction position test.
* Added another text extraction location test.
* Text extraction location testing.
* Added tests for text extraction with location information.
* Cleaned up text extraction tests. No changes to functionality.
* Simplifying text extraction code.
* Simplified line construction in text.go
* Returning TextMarks in a TextMarkArray, which is based on PdfObjectArray but is read-only, so its elements are not pointers.
* Added text extraction to show PDFs marked up with the bounding boxes of substrings in the extracted text.
* Add comments explaining how to calculate text bounding boxes.
* Made text_test.go naming consistent with function comments in text.go
* Use tm, pt, tl for textMark/TextMark PageText and TextLine receivers and local variables.
* Uncommented the text stress test. Use `go test -short` to skip it.
* TextMark.Offset is now an index into the extracted text. Previously it was an index into []rune(text).
2019-07-18 06:41:47 +00:00
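The last bullet of the commit above changes what TextMark.Offset counts. The following is a minimal sketch of why the two kinds of index differ, assuming "index into the extracted text" means a byte offset into a Go string. The helper `runeIndexToByteOffset` is hypothetical and is not part of the library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeIndexToByteOffset is a hypothetical helper (not a unipdf function).
// It converts an index into []rune(text) to a byte offset into text.
func runeIndexToByteOffset(text string, runeIdx int) int {
	offset := 0
	for i := 0; i < runeIdx && offset < len(text); i++ {
		// Each rune may occupy 1 to 4 bytes in UTF-8.
		_, size := utf8.DecodeRuneInString(text[offset:])
		offset += size
	}
	return offset
}

func main() {
	text := "naïve text"
	// The word "text" starts at rune index 6 but at byte offset 7,
	// because "ï" is encoded as two bytes in UTF-8.
	byteOff := runeIndexToByteOffset(text, 6)
	fmt.Println(byteOff, text[byteOff:])
}
```

With a byte offset, a substring can be sliced directly as `text[offset:]` without first converting the whole string to a rune slice.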
Jacek Kucharczyk
4b1c345214
JBIG2 decoder benchmark patch
2019-07-16 15:40:22 +00:00
Jacek Kucharczyk
e85616cec2
JBIG2Decoder implementation ( #67 )
...
* Prepared skeleton and basic component implementations for the jbig2 encoding.
* Added Bitset. Implemented Bitmap.
* Decoder with old Arithmetic Decoder
* Partly working arithmetic
* Working arithmetic decoder.
* MMR patched.
* rebuild to apache.
* Working generic
* Decoded full document
* Decoded AnnexH document
* Minor issues fixed.
* Update README.md
* Fixed generic region errors. Added benchmark. Added bitmap unpadder. Added Bitmap toImage method.
* Fixed endofpage error
* Added integration test.
* Decoded all test files without errors. Implemented JBIG2Global.
* Merged with v3 version
* Fixed the EOF in the globals issue
* Fixed the JBIG2 ChocolateData Decode
* JBIG2 Added license information
* Minor fix in jbig2 encoding.
* Applied the logging convention
* Cleaned unnecessary imports
* Go modules clear unused imports
* checked out the README.md
* Moved trace to Debug. Fixed the build integrate tag in the document_decode_test.go
* Applied UniPDF Developer Guide. Fixed lint issues.
* Cleared documentation, fixed style issues.
* Added jbig2 doc.go files. Applied unipdf guide style.
* Minor code style changes.
* Minor naming and style issues fixes.
* Minor naming changes. Style issues fixed.
* Review r11 fixes.
* Integrate jbig2 tests with build system
* Added jbig2 integration test golden files.
* Minor jbig2 integration test fix
* Removed jbig2 integration image assertions
* Fixed jbig2 rowstride issue. Implemented jbig2 bit writer
* Changed golden files logic. Fixes r13 issues.
2019-07-14 21:18:40 +00:00
Gunnsteinn Hall
0460471691
Add UNIDOC_JBIG2_TESTDATA environment variable pointing to testdata directory ( #120 )
...
Needed to be able to run tests in PR #67
2019-07-14 20:18:57 +00:00
Gunnsteinn Hall
0668159af1
Optimize: Use original if smaller than "compressed" ( #118 )
...
* Optimize: Use smallest image. Addresses #51 .
2019-07-11 20:24:46 +00:00
Adrian-George Bostan
e68b7c664b
Avoid unnecessary allocations when converting gray scale image to RGB ( #117 )
2019-07-10 21:13:49 +00:00
Adrian-George Bostan
2f4fe8cfd2
Resolve references when adding page to writer from a lazy reader ( #115 )
2019-07-10 04:18:34 +00:00
Adrian-George Bostan
2689453ae1
Resolve page parents when adding page to writer ( #110 )
2019-07-05 16:02:25 +00:00
Adrian-George Bostan
2189002435
Allow adding an external outline tree to the creator ( #106 )
2019-07-04 18:57:34 +00:00
Peter Williams
3de3705eb2
Fixed CalRGB -> RGB image conversion. ( #103 )
2019-07-03 13:35:15 +00:00
Gunnsteinn Hall
9539b4e8f3
Merge pull request #104 from adrg/creator-page-resources
...
Add resources of blocks created from pages to the output page resources
2019-07-02 19:34:16 +00:00
Adrian-George Bostan
6b13044546
Add resources of blocks created from pages to the output page resources
2019-07-02 21:51:56 +03:00
Adrian-George Bostan
13e08e064c
Skip invalid outline nodes ( #101 )
...
* Skip invalid outline nodes when building outline tree
* Add methods for accessing and writing named destinations
2019-06-27 20:08:40 +00:00
Adrian-George Bostan
f5989ea574
Update go dep file ( #100 )
2019-06-25 20:03:42 +00:00
Adrian-George Bostan
d8dcc051b3
Fix annotation flatten when AcroForm does not exist ( #93 )
...
* Fix annotation flatten when AcroForm does not exist.
* Adapt test case file hashes to account for file flattening
2019-06-25 19:29:03 +00:00
Gunnsteinn Hall
7a9a8ff542
Add FDF merge test case for form filling and flattening with change detection ( #98 )
...
Manually verified that output PDFs look good and leave hash check to detect change. If there is a change in the future, the hash change will trigger a failure upon which the output PDFs need to be re-checked and hashes updated if appropriate.
2019-06-25 08:08:51 +00:00
Adrian-George Bostan
2227f4f372
Resolve page Resources references on writer page add, if page reader is lazy ( #97 )
2019-06-24 20:07:15 +00:00
Gunnsteinn Hall
2daa144856
Merge pull request #91 from adrg/xref-table-invalid-line
...
Attempt to parse invalid beginning lines of xref table subsections.
2019-06-14 06:33:52 +00:00
Gunnsteinn Hall
756f5e2886
Merge pull request #90 from adrg/form-field-metadata
...
Skip invalid Metadata stream in form field Kids array
2019-06-14 06:07:28 +00:00
Gunnsteinn Hall
779768607b
Merge pull request #89 from adrg/parser-unexpected-pattern
...
Handle improper usage of the array ending marker
2019-06-14 06:06:57 +00:00
Adrian-George Bostan
9dec1cdc9e
Attempt to parse invalid beginning lines of xref table subsections.
2019-06-13 21:50:21 +03:00
Adrian-George Bostan
05eee50b4c
Skip invalid Metadata stream in form field Kids array
2019-06-13 20:37:14 +03:00
Adrian-George Bostan
4d43867b1f
Handle improper usage of the array ending marker
2019-06-12 22:12:58 +03:00
Gunnsteinn Hall
a60d343750
Merge branch 'release/v3.0.2' into development
2019-06-12 06:20:38 +00:00
Gunnsteinn Hall
6186242ee6
Merge remote-tracking branch 'upstream/master' into release/v3.0.2
2019-06-12 06:02:52 +00:00
Gunnsteinn Hall
75df2dcce9
Update version.go for release v3.0.2
2019-06-11 22:08:32 +00:00
Adrian-George Bostan
4bfc25d0ff
Parse EOF markers missing the F character ( #86 )
2019-06-11 22:03:12 +00:00
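One way to tolerate a truncated end-of-file marker, as described in the commit above, is to make the final F optional when searching for it. This is a minimal sketch and not the parser's actual implementation. The pattern and the `lastEOFOffset` helper are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// reEOF matches both "%%EOF" and the truncated form "%%EO".
// The pattern is an assumption for this sketch, not taken from the library.
var reEOF = regexp.MustCompile(`%%EOF?`)

// lastEOFOffset is a hypothetical helper that returns the byte offset of the
// last EOF marker in data, or -1 if no marker is found.
func lastEOFOffset(data []byte) int {
	locs := reEOF.FindAllIndex(data, -1)
	if len(locs) == 0 {
		return -1
	}
	return locs[len(locs)-1][0]
}

func main() {
	wellFormed := []byte("startxref\n123\n%%EOF\n")
	truncated := []byte("startxref\n123\n%%EO\n")
	fmt.Println(lastEOFOffset(wellFormed), lastEOFOffset(truncated))
}
```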
Adrian-George Bostan
441e9a3177
Attempt to identify Pages nodes without Type ( #85 )
2019-06-11 22:02:36 +00:00
Adrian-George Bostan
4290e33c36
Skip loading outlines on invalid outline root node ( #84 )
2019-06-10 19:16:03 +00:00
Gunnsteinn Hall
c3a24925ff
License handling, expiry ( #82 )
2019-06-08 10:37:54 +00:00
Adrian-George Bostan
6e33703379
Take references with negative object numbers into account ( #77 )
2019-06-08 07:58:57 +00:00
Adrian-George Bostan
91d0d77b34
Consider files not encrypted when Encrypt object is null ( #80 )
2019-06-07 22:28:14 +00:00
Adrian-George Bostan
7e56b89e18
Fix page resources not being loaded from parent nodes ( #78 )
2019-06-07 21:59:14 +00:00
Adrian-George Bostan
6cd5c83f2b
Fix parsing names containing the # character ( #73 )
2019-06-05 23:28:06 +00:00
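In PDF name objects, `#` introduces a two-digit hexadecimal escape (for example, `#20` encodes a space). The sketch below shows one way to decode such names while keeping a `#` that is not followed by two valid hex digits as a literal character. The fallback behavior and the `decodeName` helper are assumptions for illustration. The commit above does not state how the parser handles this case.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// decodeName is a hypothetical helper (not a unipdf function). It decodes
// #xx escapes in a raw PDF name and keeps a '#' literally when it is not
// followed by two valid hexadecimal digits.
func decodeName(raw string) string {
	out := make([]byte, 0, len(raw))
	for i := 0; i < len(raw); i++ {
		if raw[i] == '#' && i+2 < len(raw) {
			if b, err := hex.DecodeString(raw[i+1 : i+3]); err == nil {
				out = append(out, b[0])
				i += 2 // skip the two hex digits just consumed
				continue
			}
		}
		out = append(out, raw[i])
	}
	return string(out)
}

func main() {
	fmt.Printf("%q\n", decodeName("Lime#20Green"))
	fmt.Printf("%q\n", decodeName("Odd#Name"))
}
```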
Gunnsteinn Hall
9bc50d0564
Merge pull request #72 from adrg/flate-encoder-empty-buffer
...
Check for empty encoded byte buffer on Flate decode
2019-06-03 19:07:15 +00:00
Adrian-George Bostan
bc5836551e
Check for empty encoded byte buffer on Flate decode
2019-06-03 18:18:45 +03:00
Gunnsteinn Hall
a9d8725810
Merge master into development ( #71 )
2019-06-02 16:21:07 +00:00
Gunnsteinn Hall
112c39a889
Merge pull request #70 from gunnsth/release-v3.0.1
...
Release v3.0.1
v3.0.1
2019-06-02 15:25:25 +00:00
Gunnsteinn Hall
45bef6de26
Update README.md
2019-06-02 15:06:28 +00:00
Gunnsteinn Hall
33945e7779
Prepare for release. Update version.go and README. Fix test case.
2019-06-02 14:59:15 +00:00
Adrian-George Bostan
1dcde7d76e
Add support for vertical alignment of styled paragraphs inside table cells ( #69 )
2019-06-02 14:57:41 +00:00
Adrian-George Bostan
7f4e3b8f13
Update page resources Font dictionary when applying license information ( #5 )
...
* Make PdfObjectDictionary Merge method chainable
* Update page resources Font dictionary when applying license information
* Add license font to the page resources only when it does not exist
* Update hash for split test after verification
2019-06-02 14:57:41 +00:00
Adrian-George Bostan
3bae2f6035
Add support for vertical alignment of styled paragraphs inside table cells ( #69 )
2019-05-30 22:01:56 +00:00
Adrian-George Bostan
8425bf7c8f
Update page resources Font dictionary when applying license information ( #5 )
...
* Make PdfObjectDictionary Merge method chainable
* Update page resources Font dictionary when applying license information
* Add license font to the page resources only when it does not exist
* Update hash for split test after verification
2019-05-30 10:52:05 +00:00
Gunnsteinn Hall
326742a81f
Update README.MD
2019-05-24 00:44:40 +00:00
Gunnsteinn Hall
bc0edcb8dd
Update issue templates
v3.0.0
2019-05-20 10:18:26 +00:00
Gunnsteinn Hall
37d822bead
unipdf move updates ( #6 )
...
* Update README and other documentation for unipdf
* Update Jenkinsfile
* Get test dependencies too (Jenkinsfile processing not fully module based yet)
* Update wercker file for modules
2019-05-19 12:41:13 +00:00
Adrian-George Bostan
c64812093d
Remove pdf folder and move packages up one level ( #2 )
2019-05-16 20:44:51 +00:00
Adrian-George Bostan
8acac88784
Update module version and import paths ( #1 )
...
* Update import path to use unipdf
* Update module name and version
2019-05-16 20:08:40 +00:00