unipdf

mirror of https://github.com/unidoc/unipdf.git synced 2025-04-24 13:48:49 +08:00

Author	SHA1	Message	Date
UniDoc Build	d287d85878	prepare release	2024-12-20 06:39:10 +00:00
UniDoc Build	ac4fe09ce0	prepare release	2024-06-27 16:15:49 +00:00
UniDoc Build	51251b1e5f	prepare release	2024-04-16 11:40:43 +00:00
UniDoc Build	006d8524e7	prepare release	2024-03-27 22:34:33 +00:00
UniDoc Build	4cb53dbb7b	prepare release	2024-02-11 21:29:32 +00:00
UniDoc Build	a8fa52b222	prepare release	2024-01-22 01:16:41 +00:00
UniDoc Build	22e9f4bade	prepare release	2023-12-17 13:54:01 +00:00
UniDoc Build	97e47ce77b	prepare release	2023-11-11 11:29:03 +00:00
UniDoc Build	89a1ba3e48	prepare release	2023-10-07 13:58:01 +00:00
UniDoc Build	87c1e49788	prepare release	2023-08-03 17:30:04 +00:00
UniDoc Build	854d57a737	prepare release	2023-04-06 19:57:40 +00:00
UniDoc Build	e274047f4f	prepare release	2022-12-15 21:59:56 +00:00
UniDoc Build	930693130b	prepare release	2022-09-10 15:35:04 +00:00
UniDoc Build	96640edbe3	prepare release	2022-07-13 21:28:43 +00:00
UniDoc Build	ad2a915d0a	prepare release	2022-06-27 19:58:38 +00:00
UniDoc Build	7101928e27	prepare release	2022-04-27 00:10:33 +00:00
UniDoc Build	dfadfc1b51	prepare release	2022-02-05 21:34:53 +00:00
UniDoc Build	100631484f	prepare release	2021-12-14 01:08:28 +00:00
UniDoc Build	804e0287b4	prepare release	2021-10-22 10:53:20 +00:00
UniDoc Build	ec7f5e55c3	prepare release	2021-02-22 02:29:48 +00:00
UniDoc Build	6ec1f6abf1	prepare release	2021-01-07 14:20:10 +00:00
UniDoc Build	79e32364de	prepare release	2020-11-11 18:48:37 +00:00
UniDoc Build	22540b937c	prepare release	2020-10-19 10:58:10 +00:00
UniDoc Build	1501d07a74	prepare release	2020-08-27 21:45:09 +00:00
Peter Williams	88fda44e0a	Text extraction code for columns. (#366) * Fixed filename:page in logging * Got CMap working for multi-rune entries * Treat CMap entries as strings instead of runes to handle multi-byte encodings. * Added a test for multibyte encoding. * First version of text extraction that recognizes columns * Added an explanation of the text columns code to README.md. * fixed typos * Abstracted textWord depth calculation. This required change textMark to textMark in a lot of code. Added function comments. * Fixed text state save/restore. * Adjusted inter-word search distance to make paragraph division work for thanh.pdf * Got text_test.go passing. * Reinstated hyphen suppression * Handle more cases of fonts not being set in text extraction code. * Fixed typo * More verbose logging * Adding tables to text extractor. * Added tests for columns extraction. * Removed commented code * Check for textParas that are on the same line when writing out extracted text. * Absorb text to the left of paras into paras e.g. Footnote numbers * Removed funny character from text_test.go * Commented out a creator_test.go test that was broken by my text extraction changes. * Big changes to columns text extraction code for PR. Performance improvements in several places. Commented code. * Updated extractor/README * Cleaned up some comments and removed a panic * Increased threshold for truncating extracted text when there is no license 100 -> 102. This is a workaround to let a test in creator_test.go pass. With the old text extraction code the following extracted text was 100 chars. With the new code it is 102 chars which looks correct. "你好\n你好你好你好你好\n河上白云\n\nUnlicensed UniDoc - Get a license on https://unidoc.io\n\n" * Improved an error message. * Removed irrelevant spaces * Commented code and removed unused functions. * Reverted PdfRectangle changes * Added duplicate text detection. * Combine diacritic textMarks in text extraction * Reinstated a diacritic recombination test. * Small code reorganisation * Reinstated handling of rotated text * Addressed issues in PR review * Added color fields to TextMark * Updated README * Reinstated the disabled tests I missed before. * Tightened definition for tables to prevent detection of tables where there weren't any. * Compute line splitting search range based on fontsize of first word in word bag. * Use errors.Is(err, core.ErrNotSupported) to distinguish unsupported font errors. See https://blog.golang.org/go1.13-errors * Fixed some naming and added some comments. * errors.Is -> xerrors.Is and %w -> %v for go 1.12 compatibility * Removed code that doesn't ever get called. * Removed unused test	2020-06-30 19:33:10 +00:00
Gunnsteinn Hall	1b1158ed94	Merge remote-tracking branch 'upstream/master' into dev-merge-master	2020-06-16 21:45:48 +00:00
Gunnsteinn Hall	11f692bc3a	Font subsetting and font optimization improvements (#362) * Track runes in IdentityEncoder (for subsetting), track decoded runes * Working with the identity encoder in font_composite.go * Add GetFilterArray to multi encoder. Add comments. * Add NewFromContents constructor to extractor only requiring contents and resources * golint fixes * Optimizer compress streams - improved detection of raw streams * Optimize - CleanContentStream optimizer that removes redundant operands * WIP Optimize - clean fonts Will support both font file reduction and subsetting. (WIP) * Optimize - image processing - try combined DCT and Flate * Update options.go * Update optimizer.go * Create utils.go for optimize with common methods needed for optimization * Optimizer - add font subsetting method Covers XObject Forms, annotations etc. Uses extractor package to extract text marks covering what fonts and glyphs are used. Package truetype used for subsetting. * Add some comments * Fix cmap parsing rune conversion * Error checking for extractor. Add some comments. * Update Jenkinsfile * Update modules	2020-06-16 21:19:10 +00:00
Gunnsteinn Hall	e8d29245a2	Prepare release v3.7.1	2020-05-25 23:07:17 +00:00
Gunnsteinn Hall	ad2a1e9c9d	Subsetting fixes (#346) * Update unitype lib which improves subsetting * Add text extraction check to creator font subsetting example Helps ensure ToUnicode map is set correctly. * Clean up import * Fix spelling	2020-05-12 07:15:09 +00:00
Gunnsteinn Hall	9ef2f27694	Support for subsetting fonts (#335) * Subsetting of TrueType CID fonts using unitype * Simplify call to SubsetRegistered so can be done right after loading font via creator finalizer * Add an EnableFontSubsetting function on the creator to simplify font subsetting for creator users	2020-05-05 00:17:27 +00:00
Alexey Pavlyukov	a69d788171	Add timestamp signature handler (#301) * Add timestamp signature handler * Add timestamp signature handler test * fix PR issues * fix PR issues * fix PR issues * Fix Co-authored-by: Gunnsteinn Hall <gunnsteinn.hall@gmail.com>	2020-04-22 20:21:53 +00:00
Jacek Kucharczyk	29efa30439	JBIG2 Encoder support for inserting binary images into PDF (#288) * Added JBIG2 PDF support * Added JBIG2 Encoder binary image requirements * PR #288 revision r1 fixes * PR #288 revision r2 fixes	2020-04-03 20:54:59 +00:00
Adrian-George Bostan	d961079c5d	Add basic image rendering support (#266) * Add render package * Add text state * Add more text operators * Remove unnecessary files * Add text font * Add custom text render method * Improve text rendering method * Rename text state methods * Refactor and document context interface * Refactor text begin/end operators * Fix graphics state transformations * Keep original font when doing font substitution * Take page cropbox into account * Revert to substitution font if original font measurement is 0 * Add font substitution package * Implement additional transform.Point methods * Use transform.Point in the image context package * Remove unneeded functionality from the render image package * Fix golint notices in the image rendering package * Fix go vet notices in the render package * Fix golint notices in the top-level render package * Improve render context package documentation * Document context text state struct. * Document context text font struct. * Minor logging improvements * Add license disclaimer to the render package files * Avoid using package aliases where possible * Change style of section comments * Adapt render package import style to follow the developer guide * Improve documentation for the internal matrix implementation * Update render package dependency versions * Apply crop box post render * Account for offset media boxes * Improve metrics of rendered characters * Fix text matrix translation * Change priority of fonts used for measuring rendered characters * Skip invalid m and l operators on image rendering * Small fix for v operator * Fix rendered characters spacing issues * Refactor naming of internal render packages	2020-03-02 21:22:54 +00:00
Jacek Kucharczyk	e85616cec2	JBIG2Decoder implementation (#67) * Prepared skeleton and basic component implementations for the jbig2 encoding. * Added Bitset. Implemented Bitmap. * Decoder with old Arithmetic Decoder * Partly working arithmetic * Working arithmetic decoder. * MMR patched. * rebuild to apache. * Working generic * Decoded full document * Decoded AnnexH document * Minor issues fixed. * Update README.md * Fixed generic region errors. Added benchmark. Added bitmap unpadder. Added Bitmap toImage method. * Fixed endofpage error * Added integration test. * Decoded all test files without errors. Implemented JBIG2Global. * Merged with v3 version * Fixed the EOF in the globals issue * Fixed the JBIG2 ChocolateData Decode * JBIG2 Added license information * Minor fix in jbig2 encoding. * Applied the logging convention * Cleaned unnecessary imports * Go modules clear unused imports * checked out the README.md * Moved trace to Debug. Fixed the build integrate tag in the document_decode_test.go * Applied UniPDF Developer Guide. Fixed lint issues. * Cleared documentation, fixed style issues. * Added jbig2 doc.go files. Applied unipdf guide style. * Minor code style changes. * Minor naming and style issues fixes. * Minor naming changes. Style issues fixed. * Review r11 fixes. * Integrate jbig2 tests with build system * Added jbig2 integration test golden files. * Minor jbig2 integration test fix * Removed jbig2 integration image assertions * Fixed jbig2 rowstride issue. Implemented jbig2 bit writer * Changed golden files logic. Fixes r13 issues.	2019-07-14 21:18:40 +00:00
Gunnsteinn Hall	3d22e17a91	Prepare release of v3.0.0-alpha.3	2019-03-28 17:19:39 +00:00
Gunnsteinn Hall	323dc5394f	release v3.0.0-alpha.2	2019-02-07 12:17:54 +00:00
Denys Smirnov	622ae5668d	textencoding: generate table for WinAnsi encoding from CP1252	2019-01-01 17:20:01 +02:00
Denys Smirnov	41af4a14eb	list dependencies for dep and go modules	2018-11-29 01:15:19 +02:00

38 Commits