unipdf

mirror of https://github.com/unidoc/unipdf.git synced 2025-04-26 13:48:55 +08:00

Author	SHA1	Message	Date
Adrian-George Bostan	d961079c5d	Add basic image rendering support (#266) * Add render package * Add text state * Add more text operators * Remove unnecessary files * Add text font * Add custom text render method * Improve text rendering method * Rename text state methods * Refactor and document context interface * Refactor text begin/end operators * Fix graphics state transformations * Keep original font when doing font substitution * Take page cropbox into account * Revert to substitution font if original font measurement is 0 * Add font substitution package * Implement additional transform.Point methods * Use transform.Point in the image context package * Remove unneeded functionality from the render image package * Fix golint notices in the image rendering package * Fix go vet notices in the render package * Fix golint notices in the top-level render package * Improve render context package documentation * Document context text state struct. * Document context text font struct. * Minor logging improvements * Add license disclaimer to the render package files * Avoid using package aliases where possible * Change style of section comments * Adapt render package import style to follow the developer guide * Improve documentation for the internal matrix implementation * Update render package dependency versions * Apply crop box post render * Account for offset media boxes * Improve metrics of rendered characters * Fix text matrix translation * Change priority of fonts used for measuring rendered characters * Skip invalid m and l operators on image rendering * Small fix for v operator * Fix rendered characters spacing issues * Refactor naming of internal render packages	2020-03-02 21:22:54 +00:00
Peter Williams	e056c0e4d4	Fixed PdfColorspaceSpecialIndexed.ImageToRGB() (#259) * Fixed PdfColorspaceSpecialIndexed.ImageToRGB() Fixes https://github.com/unidoc/unipdf/issues/258 * Fixed indexed colorspace bounds checking. * Being super cautious to prevent a divide by zero error. I don't think the base cs can have <1 cpts. * Updated image hash in extract_images_test.go to match new indexed colorspace code. * add testfile from unipdf#258	2020-02-26 13:26:20 +00:00
Adrian-George Bostan	9de5fe644e	Add PdfFont text encoding methods (#257) * Add PdfFont method for encoding runes to charcode bytes * Add getter method for CMap nbits * Take CMap nbits into account when encoding text * Adapt font test cases to include text encoding testing	2020-02-17 22:54:20 +00:00
Adrian-George Bostan	e2b3c6e6ba	Add predefined CMaps for Type 0 composite fonts (#246) * Add packed predefined cmaps * Add cmap cid range parsing * Load base cmap for predefined cmaps * Refactor pdfFont to Unicode methods * Preserve CharcodeBytesToUnicode behavior * Add support for CID-keyed Type 0 fonts * Add method documentation for the cmap package * Refactor and document charcode to Unicode conversion code * Add more cmap parsing test cases * Add more method documentation in the cmap package. * Remove unused code from the bcmaps package * Improve cmap test case * Assume identity when encoder is missing on regenerating field appearance * Add missing encoder log message * Add inverse CMap mappings * Add CMap encoder * Address golint notices and small fix in the cmap package * Keep smaller charcodes when generating cmap inverse mappings * Update extractor test case * Keep latest supplement charcodes/CIDs when computing inverse mappings * Fix comment typo	2020-02-07 19:56:30 +00:00
Samuel Stauffer	5f19bfa269	Address comments on PR	2020-01-06 11:13:16 -08:00
Samuel Stauffer	e85397b57a	Unify and optimize number parsing	2020-01-06 11:05:42 -08:00
Adrian-George Bostan	23aec77478	Add basic support for UTF-16 text encodings (#203) * Add UTF-16 text encoder	2019-11-28 00:47:00 +00:00
Adrian-George Bostan	56e81d3a1a	Take decode arrays into account when processing grayscale images (#159) * Take decode arrays into account when processing grayscale images * Adapt image extraction test case hashes * Minor refactoring in the ColorAt image method * Always return vanilla data from the jbig2 decoder	2019-08-30 19:16:23 +00:00
Jacek Kucharczyk	24648f4481	Issue #144 Fix - JBIG2 - Changed integer variable types (#148) * Fixing platform-independent integer size * Cleared test logs. * Cleared unnecessary int32 * Defined precise integer size for jbig2 segments.	2019-08-29 19:12:18 +00:00
Adrian-George Bostan	febf633172	Image memory optimizations (#149) * Add ColorAt method for images * Avoid resample on image to Go image conversion * Avoid resample when converting grayscale image to RGB * Preserve old behavior of image to Go image conversion * Add missing case in the ToGoImage method * Fix grayscale to RGB image conversion * Improve code documentation * Fix color extraction for CMYK and 4 bit RGB * Add test case for the ColorAt image method * Avoid resampling when converting CMYK image to RGB * Add notice comment for the GetSamples/SetSamples image methods	2019-08-22 20:15:16 +00:00
Adrian-George Bostan	cca04199e6	Add extract images test case, with memory profiling (#146) * Add extract images test case, with memory profiling * Use TotalAlloc instead of Alloc for memory profiling * Remove calls to debug.FreeOSMemory from test cases	2019-08-19 22:37:16 +00:00
Peter Williams	9ebcfcf168	Finding bounding boxes of substrings of extracted text. (#109) * Added text bounding box extraction. * Add `font` field to textMark struct; Create a new method `TextComponents` to retrieve all the text components of the extracted text in the page, with position and character information * Reorganizing extractor/text.go * Added a text extraction position test. * Added another text extraction location test. * Text extraction location testing. * Added tests for text extraction with location information. * Cleaned up text extraction tests. No changes to functionality. * Simplifying text extraction code. * Simplified line construction in text.go * Returning TextMarks in TextMarkArray which are based on PdfObjectArray but read-only, so not pointers. * Added text extraction to show PDFs marked-up with bounding boxes of substring in extracted text. * Add comments explaining how to calculate text bounding boxes. * Made text_test.go naming consistent with function comments in text.go * Use tm, pt, tl for textMark/TextMark PageText and TextLine receivers and local variables. * Uncommented text stress test. Use go test --short to skip * TextMark.Offset is now an index into the extracted text. It was an index into []rune(text)	2019-07-18 06:41:47 +00:00
Jacek Kucharczyk	4b1c345214	JBIG2 decoder benchmark patch	2019-07-16 15:40:22 +00:00
Jacek Kucharczyk	e85616cec2	JBIG2Decoder implementation (#67) * Prepared skeleton and basic component implementations for the jbig2 encoding. * Added Bitset. Implemented Bitmap. * Decoder with old Arithmetic Decoder * Partly working arithmetic * Working arithmetic decoder. * MMR patched. * rebuild to apache. * Working generic * Decoded full document * Decoded AnnexH document * Minor issues fixed. * Update README.md * Fixed generic region errors. Added benchmark. Added bitmap unpadder. Added Bitmap toImage method. * Fixed endofpage error * Added integration test. * Decoded all test files without errors. Implemented JBIG2Global. * Merged with v3 version * Fixed the EOF in the globals issue * Fixed the JBIG2 ChocolateData Decode * JBIG2 Added license information * Minor fix in jbig2 encoding. * Applied the logging convention * Cleaned unnecessary imports * Go modules clear unused imports * checked out the README.md * Moved trace to Debug. Fixed the build integrate tag in the document_decode_test.go * Applied UniPDF Developer Guide. Fixed lint issues. * Cleared documentation, fixed style issues. * Added jbig2 doc.go files. Applied unipdf guide style. * Minor code style changes. * Minor naming and style issues fixes. * Minor naming changes. Style issues fixed. * Review r11 fixes. * Integrate jbig2 tests with build system * Added jbig2 integration test golden files. * Minor jbig2 integration test fix * Removed jbig2 integration image assertions * Fixed jbig2 rowstride issue. Implemented jbig2 bit writer * Changed golden files logic. Fixes r13 issues.	2019-07-14 21:18:40 +00:00
Adrian-George Bostan	d8dcc051b3	Fix annotation flatten when AcroForm does not exist (#93) * Fix annotation flatten when AcroForm does not exist. * Adapt test case file hashes to account for file flattening	2019-06-25 19:29:03 +00:00
Gunnsteinn Hall	7a9a8ff542	Add FDF merge test case for form filling and flattening with change detection (#98) Manually verified that output PDFs look good and leave hash check to detect change. If there is a change in the future, the hash change will trigger a failure upon which the output PDFs need to be re-checked and hashes updated if appropriate.	2019-06-25 08:08:51 +00:00
Adrian-George Bostan	8425bf7c8f	Update page resources Font dictionary when applying license information (#5) * Make PdfObjectDictionary Merge method chainable * Update page resources Font dictionary when applying license information * Add license font to the page resources only when it does not exist * Update hash for split test after verification	2019-05-30 10:52:05 +00:00
Adrian-George Bostan	c64812093d	Remove pdf folder and move packages up one level (#2)	2019-05-16 20:44:51 +00:00

18 Commits