unipdf

mirror of https://github.com/unidoc/unipdf.git synced 2025-05-01 22:17:29 +08:00

Author	SHA1	Message	Date
Samuel Stauffer	5f19bfa269	Address comments on PR	2020-01-06 11:13:16 -08:00
Samuel Stauffer	e85397b57a	Unify and optimize number parsing	2020-01-06 11:05:42 -08:00
Adrian-George Bostan	23aec77478	Add basic support for UTF-16 text encodings (#203) * Add UTF-16 text encoder	2019-11-28 00:47:00 +00:00
Adrian-George Bostan	56e81d3a1a	Take decode arrays into account when processing grayscale images (#159) * Take decode arrays into account when processing grayscale images * Adapt image extraction test case hashes * Minor refactoring in the ColorAt image method * Always return vanilla data from the jbig2 decoder	2019-08-30 19:16:23 +00:00
Jacek Kucharczyk	24648f4481	Issue #144 Fix - JBIG2 - Changed integer variables types (#148) * Fixing platform-independent integer size * Cleared test logs. * Cleared unnecessary int32 * Defined precise integer size for jbig2 segments.	2019-08-29 19:12:18 +00:00
Adrian-George Bostan	febf633172	Image memory optimizations (#149) * Add ColorAt method for images * Avoid resample on image to Go image conversion * Avoid resample when converting grayscale image to RGB * Preserve old behavior of image to Go image conversion * Add missing case in the ToGoImage method * Fix grayscale to RGB image conversion * Improve code documentation * Fix color extraction for CMYK and 4 bit RGB * Add test case for the ColorAt image method * Avoid resampling when converting CMYK image to RGB * Add notice comment for the GetSamples/SetSamples image methods	2019-08-22 20:15:16 +00:00
Adrian-George Bostan	cca04199e6	Add extract images test case, with memory profiling (#146) * Add extract images test case, with memory profiling * Use TotalAlloc instead of Alloc for memory profiling * Remove calls to debug.FreeOSMemory from test cases	2019-08-19 22:37:16 +00:00
Peter Williams	9ebcfcf168	Finding bounding boxes of substrings of extracted text. (#109) * Added text bounding box extraction. * Add `font` field to textMark struct; Create a new method `TextComponents` to retrieve all the text components of the extracted text in the page, with position and character information * Reorganizing extractor/text.go * Added a text extraction position test. * Added another text extraction location test. * Text extraction location testing. * Added tests for text extraction with location information. * Cleaned up text extraction tests. No changes to functionality. * Simplifying text extraction code. * Simplified line construction in text.go * Returning TextMarks in TextMarkArray which are based on PdfObjectArray but read-only, so not pointers. * Added text extraction to show PDFs marked-up with bounding boxes of substring in extracted text. * Add comments explaining how to calculate text bounding boxes. * Made text_test.go naming consistent with function comments in text.go * Use tm, pt, tl for textMark/TextMark PageText and TextLine receivers and local variables. * Uncommented text stress test. Use go test --short to skip * TextMark.Offset is now an index into the extracted text. It was an index into []rune(text)	2019-07-18 06:41:47 +00:00
Jacek Kucharczyk	4b1c345214	JBIG2 decoder benchmark patch	2019-07-16 15:40:22 +00:00
Jacek Kucharczyk	e85616cec2	JBIG2Decoder implementation (#67) * Prepared skeleton and basic component implementations for the jbig2 encoding. * Added Bitset. Implemented Bitmap. * Decoder with old Arithmetic Decoder * Partly working arithmetic * Working arithmetic decoder. * MMR patched. * rebuild to apache. * Working generic * Decoded full document * Decoded AnnexH document * Minor issues fixed. * Update README.md * Fixed generic region errors. Added benchmark. Added bitmap unpadder. Added Bitmap toImage method. * Fixed endofpage error * Added integration test. * Decoded all test files without errors. Implemented JBIG2Global. * Merged with v3 version * Fixed the EOF in the globals issue * Fixed the JBIG2 ChocolateData Decode * JBIG2 Added license information * Minor fix in jbig2 encoding. * Applied the logging convention * Cleaned unnecessary imports * Go modules clear unused imports * checked out the README.md * Moved trace to Debug. Fixed the build integrate tag in the document_decode_test.go * Applied UniPDF Developer Guide. Fixed lint issues. * Cleared documentation, fixed style issues. * Added jbig2 doc.go files. Applied unipdf guide style. * Minor code style changes. * Minor naming and style issues fixes. * Minor naming changes. Style issues fixed. * Review r11 fixes. * Integrate jbig2 tests with build system * Added jbig2 integration test golden files. * Minor jbig2 integration test fix * Removed jbig2 integration image assertions * Fixed jbig2 rowstride issue. Implemented jbig2 bit writer * Changed golden files logic. Fixes r13 issues.	2019-07-14 21:18:40 +00:00
Adrian-George Bostan	d8dcc051b3	Fix annotation flatten when AcroForm does not exist (#93) * Fix annotation flatten when AcroForm does not exist. * Adapt test case file hashes to account for file flattening	2019-06-25 19:29:03 +00:00
Gunnsteinn Hall	7a9a8ff542	Add FDF merge test case for form filling and flattening with change detection (#98) Manually verified that output PDFs look good and leave hash check to detect change. If there is a change in the future, the hash change will trigger a failure upon which the output PDFs need to be re-checked and hashes updated if appropriate.	2019-06-25 08:08:51 +00:00
Adrian-George Bostan	8425bf7c8f	Update page resources Font dictionary when applying license information (#5) * Make PdfObjectDictionary Merge method chainable * Update page resources Font dictionary when applying license information * Add license font to the page resources only when it does not exist * Update hash for split test after verification	2019-05-30 10:52:05 +00:00
Adrian-George Bostan	c64812093d	Remove pdf folder and move packages up one level (#2)	2019-05-16 20:44:51 +00:00

14 Commits