JBIG2Decoder implementation (#67) * Prepared skeleton and basic component implementations for the jbig2 encoding. * Added Bitset. Implemented Bitmap. * Decoder with old Arithmetic Decoder * Partly working arithmetic * Working arithmetic decoder. * MMR patched. * rebuild to apache. * Working generic * Decoded full document * Decoded AnnexH document * Minor issues fixed. * Update README.md * Fixed generic region errors. Added benchmark. Added bitmap unpadder. Added Bitmap toImage method. * Fixed endofpage error * Added integration test. * Decoded all test files without errors. Implemented JBIG2Global. * Merged with v3 version * Fixed the EOF in the globals issue * Fixed the JBIG2 ChocolateData Decode * JBIG2 Added license information * Minor fix in jbig2 encoding. * Applied the logging convention * Cleaned unnecessary imports * Go modules clear unused imports * checked out the README.md * Moved trace to Debug. Fixed the build integrate tag in the document_decode_test.go * Applied UniPDF Developer Guide. Fixed lint issues. * Cleared documentation, fixed style issues. * Added jbig2 doc.go files. Applied unipdf guide style. * Minor code style changes. * Minor naming and style issues fixes. * Minor naming changes. Style issues fixed. * Review r11 fixes. * Integrate jbig2 tests with build system * Added jbig2 integration test golden files. * Minor jbig2 integration test fix * Removed jbig2 integration image assertions * Fixed jbig2 rowstride issue. Implemented jbig2 bit writer * Changed golden files logic. Fixes r13 issues.
2019-07-14 23:18:40 +02:00
/*
* This file is subject to the terms and conditions defined in
* file 'LICENSE.md', which is part of this source code package.
*/

package reader

import (
	"encoding/binary"
	"errors"
	"io"

	"github.com/unidoc/unipdf/v3/common"
)

// Reader is a bit reader over a byte slice.
// It implements the io.Reader, io.ByteReader and io.Seeker interfaces.
type Reader struct {
	in    []byte
	cache byte // unread bits are stored here
	bits  byte // number of unread bits in cache

	r            int64 // read position in the input buffer
	lastByte     int
	lastRuneSize int

	mark     int64
	markBits byte
}

// Compile-time checks that Reader implements the expected interfaces.
var (
	_ io.Reader     = &Reader{}
	_ io.ByteReader = &Reader{}
	_ io.Seeker     = &Reader{}
	_ StreamReader  = &Reader{}
)

// New creates a new reader.Reader using the byte slice data as input.
func New(data []byte) *Reader {
	return &Reader{in: data}
}

// Align implements StreamReader interface.
// It discards the unread bits in the cache and returns how many were skipped.
func (r *Reader) Align() (skipped byte) {
	skipped = r.bits
	r.bits = 0 // no need to clear the cache, it is overwritten on the next read.
	return skipped
}

// ConsumeRemainingBits reads and discards the bits left in the cache.
func (r *Reader) ConsumeRemainingBits() {
	if r.bits != 0 {
		_, err := r.ReadBits(r.bits)
		if err != nil {
			common.Log.Debug("ConsumeRemainingBits failed: %v", err)
		}
	}
}

// BitPosition implements StreamReader interface.
// It returns the number of unread bits left in the cache.
func (r *Reader) BitPosition() int {
	return int(r.bits)
}

// Length implements StreamReader interface.
func (r *Reader) Length() uint64 {
	return uint64(len(r.in))
}

// Mark implements StreamReader interface.
// It saves the current byte position and the number of unread bits.
func (r *Reader) Mark() {
	r.mark = r.r
	r.markBits = r.bits
}

// Read implements io.Reader interface.
func (r *Reader) Read(p []byte) (n int, err error) {
	if r.bits == 0 {
		return r.read(p)
	}

	for ; n < len(p); n++ {
		if p[n], err = r.readUnalignedByte(); err != nil {
			// Report the bytes already stored in p, as io.Reader requires.
			return n, err
		}
	}
	return n, nil
}

// ReadBit implements StreamReader interface.
func (r *Reader) ReadBit() (bit int, err error) {
	boolean, err := r.readBool()
	if err != nil {
		return 0, err
	}

	if boolean {
		bit = 1
	}
	return bit, nil
}

// ReadBits implements StreamReader interface.
// It reads the next n bits, most significant bit first.
func (r *Reader) ReadBits(n byte) (u uint64, err error) {
	// Fast path: the cache holds more bits than requested.
	if n < r.bits {
		// The cache has all the needed bits; the extra ones are left in the cache.
		shift := r.bits - n
		u = uint64(r.cache >> shift)
		r.cache &= 1<<shift - 1
		r.bits = shift
		return u, nil
	}

	if n > r.bits {
		// Start with whatever is left in the cache.
		if r.bits > 0 {
			u = uint64(r.cache)
			n -= r.bits
		}

		// Read whole bytes.
		for n >= 8 {
			b, err := r.readBufferByte()
			if err != nil {
				return 0, err
			}
			u = u<<8 + uint64(b)
			n -= 8
		}

		// Read the trailing fraction of a byte, if any.
		if n > 0 {
			if r.cache, err = r.readBufferByte(); err != nil {
				return 0, err
			}
			shift := 8 - n
			u = u<<n + uint64(r.cache>>shift)
			r.cache &= 1<<shift - 1
			r.bits = shift
		} else {
			r.bits = 0
		}
		return u, nil
	}

	// n == r.bits: the cache holds exactly the requested bits.
	r.bits = 0 // no need to clear the cache, it is overwritten on the next read.
	return uint64(r.cache), nil
}
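
// Illustration only, not part of this package: a minimal standalone sketch of
// the MSB-first bit order that ReadBits follows when a value spans byte
// boundaries. The bitReader type and its readBits method are hypothetical; for
// brevity the sketch reads one bit at a time and keeps a single bit offset
// instead of the byte cache used above.

```go
package main

import (
	"fmt"
	"io"
)

// bitReader is a hypothetical, simplified reader used only for illustration.
type bitReader struct {
	data []byte
	pos  int // absolute position, counted in bits
}

// readBits returns the next n bits (n <= 64), most significant bit first.
func (b *bitReader) readBits(n int) (uint64, error) {
	var u uint64
	for i := 0; i < n; i++ {
		idx := b.pos / 8
		if idx >= len(b.data) {
			return 0, io.EOF
		}
		// Bit 7 of a byte is read first, bit 0 last.
		shift := uint(7 - b.pos%8)
		u = u<<1 | uint64(b.data[idx]>>shift&1)
		b.pos++
	}
	return u, nil
}

func main() {
	r := &bitReader{data: []byte{0xA5, 0xF0}} // 1010 0101 1111 0000
	for _, n := range []int{3, 7, 6} {
		v, err := r.readBits(n)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%d bits -> %d\n", n, v) // 5, then 23, then 48
	}
}
```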

// ReadBool implements StreamReader interface.
func (r *Reader) ReadBool() (bool, error) {
	return r.readBool()
}

// ReadByte implements io.ByteReader.
func (r *Reader) ReadByte() (byte, error) {
	// r.bits will be the same after reading 8 bits, so we don't need to update that.
	if r.bits == 0 {
		return r.readBufferByte()
	}
	return r.readUnalignedByte()
}

// ReadUint32 implements StreamReader interface.
// It reads the next four bytes as a big-endian unsigned integer.
func (r *Reader) ReadUint32() (uint32, error) {
	ub := make([]byte, 4)
	// io.ReadFull reports an error on a short read, so a truncated input
	// is not decoded from a partially filled buffer.
	if _, err := io.ReadFull(r, ub); err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint32(ub), nil
}
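
// Illustration only, not part of this package: a minimal sketch of decoding a
// big-endian uint32 from a byte stream. The decodeUint32 helper is
// hypothetical; it uses io.ReadFull so that fewer than four available bytes
// surface as an error rather than as a zero-padded value.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// decodeUint32 is a hypothetical helper that reads exactly four bytes from
// data and interprets them as a big-endian unsigned integer.
func decodeUint32(data []byte) (uint32, error) {
	buf := make([]byte, 4)
	if _, err := io.ReadFull(bytes.NewReader(data), buf); err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint32(buf), nil
}

func main() {
	fmt.Println(decodeUint32([]byte{0x00, 0x00, 0x01, 0x02})) // 258 <nil>

	// Only two bytes are available, so an error is returned.
	_, err := decodeUint32([]byte{0x01, 0x02})
	fmt.Println(err != nil) // true
}
```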
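
// Reset implements StreamReader interface.
// It restores the position saved by the last call to Mark.
func (r *Reader) Reset() {
	r.r = r.mark
	r.bits = r.markBits
	// Reads made after Mark may have overwritten the cache, so rebuild it
	// from the byte that preceded the marked position.
	if r.bits > 0 && r.r > 0 && r.r <= int64(len(r.in)) {
		r.cache = r.in[r.r-1] & (1<<r.bits - 1)
	}
}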
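
// Illustration only, not part of this package: a minimal sketch of the
// mark/reset pattern for a bit-level reader. The markReader type is
// hypothetical and tracks one absolute bit offset, so saving and restoring a
// position is a single integer copy and no cached bits can go stale.

```go
package main

import "fmt"

// markReader is a hypothetical reader used only for illustration.
type markReader struct {
	data []byte
	pos  int // current position, counted in bits
	mark int // position saved by Mark
}

// Mark remembers the current bit position.
func (m *markReader) Mark() { m.mark = m.pos }

// Reset rewinds to the position saved by Mark.
func (m *markReader) Reset() { m.pos = m.mark }

// readBit returns the next bit, or -1 when the input is exhausted.
func (m *markReader) readBit() int {
	if m.pos >= len(m.data)*8 {
		return -1
	}
	bit := int(m.data[m.pos/8] >> uint(7-m.pos%8) & 1)
	m.pos++
	return bit
}

func main() {
	r := &markReader{data: []byte{0xB0}} // 1011 0000
	r.readBit()                          // consume the leading 1
	r.Mark()                             // remember bit offset 1
	first := []int{r.readBit(), r.readBit(), r.readBit()}
	r.Reset() // rewind to bit offset 1
	second := []int{r.readBit(), r.readBit(), r.readBit()}
	fmt.Println(first, second) // [0 1 1] [0 1 1]
}
```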

// Seek implements the io.Seeker interface.
func (r *Reader) Seek(offset int64, whence int) (int64, error) {
	r.lastRuneSize = -1

	var abs int64
	switch whence {
	case io.SeekStart:
		abs = offset
	case io.SeekCurrent:
		abs = r.r + offset
	case io.SeekEnd:
		abs = int64(len(r.in)) + offset
	default:
		return 0, errors.New("reader.Reader.Seek: invalid whence")
	}

	if abs < 0 {
		return 0, errors.New("reader.Reader.Seek: negative position")
	}

	r.r = abs
	r.bits = 0
	return abs, nil
}
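
// Illustration only, not part of this package: a minimal sketch of how the
// three io.Seeker whence values map an offset to an absolute byte position.
// The resolveOffset helper and its parameter names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// resolveOffset is a hypothetical helper that turns (offset, whence) into an
// absolute position, given the current position cur and the input size.
func resolveOffset(cur, size, offset int64, whence int) (int64, error) {
	var abs int64
	switch whence {
	case io.SeekStart:
		abs = offset
	case io.SeekCurrent:
		abs = cur + offset
	case io.SeekEnd:
		abs = size + offset
	default:
		return 0, errors.New("invalid whence")
	}
	if abs < 0 {
		return 0, errors.New("negative position")
	}
	return abs, nil
}

func main() {
	// A 10-byte input with the cursor at byte 4.
	fmt.Println(resolveOffset(4, 10, 2, io.SeekStart))   // 2 <nil>
	fmt.Println(resolveOffset(4, 10, 2, io.SeekCurrent)) // 6 <nil>
	fmt.Println(resolveOffset(4, 10, -3, io.SeekEnd))    // 7 <nil>
}
```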

// StreamPosition implements StreamReader interface.
func (r *Reader) StreamPosition() int64 {
	return r.r
}

// read copies byte-aligned data from the input into p.
func (r *Reader) read(p []byte) (int, error) {
	if r.r >= int64(len(r.in)) {
		return 0, io.EOF
	}

	r.lastRuneSize = -1
	n := copy(p, r.in[r.r:])
	r.r += int64(n)
	return n, nil
}

// readBufferByte returns the next whole byte of the input, ignoring the bit cache.
func (r *Reader) readBufferByte() (byte, error) {
	if r.r >= int64(len(r.in)) {
		return 0, io.EOF
	}

	r.lastRuneSize = -1
	c := r.in[r.r]
	r.r++
	r.lastByte = int(c)
	return c, nil
}

// readUnalignedByte reads the next 8 bits, which may straddle a byte boundary,
// and returns them as a byte.
func (r *Reader) readUnalignedByte() (b byte, err error) {
	// r.bits will be the same after reading 8 bits, so we don't need to update that.
	bits := r.bits
	b = r.cache << (8 - bits)

	r.cache, err = r.readBufferByte()
	if err != nil {
		return 0, err
	}

	b |= r.cache >> bits
	r.cache &= 1<<bits - 1
	return b, nil
}
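
// Illustration only, not part of this package: a minimal sketch of the
// splicing step used for an unaligned byte read. The leftover low bits become
// the high bits of the result, and the top of the next byte fills the rest.
// The spliceByte helper is hypothetical and assumes 1 <= bits <= 7.

```go
package main

import "fmt"

// spliceByte is a hypothetical helper. cache holds `bits` unread bits in its
// low positions and next is the following input byte. It returns the 8-bit
// result and the low `bits` bits of next that remain unread.
func spliceByte(cache byte, bits uint, next byte) (out, rest byte) {
	out = cache<<(8-bits) | next>>bits
	rest = next & (1<<bits - 1)
	return out, rest
}

func main() {
	// Three leftover bits 101, followed by the byte 1111 0101.
	out, rest := spliceByte(0x05, 3, 0xF5)
	fmt.Printf("%08b %03b\n", out, rest) // 10111110 101
}
```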

// readBool reads the next bit and reports whether it is set.
func (r *Reader) readBool() (bit bool, err error) {
	if r.bits == 0 {
		// The cache is empty: load the next byte and take its top bit.
		r.cache, err = r.readBufferByte()
		if err != nil {
			return false, err
		}

		bit = (r.cache & 0x80) != 0
		r.cache, r.bits = r.cache&0x7f, 7
		return bit, nil
	}

	r.bits--
	bit = (r.cache & (1 << r.bits)) != 0
	r.cache &= 1<<r.bits - 1
	return bit, nil
}