Adrian-George Bostan | 9b6cd8b88a | Start at the top of the page when table block is created on new page | 2019-01-07 19:22:08 +02:00
Gunnsteinn Hall | 1085bf1a2e | Merge branch 'v3' into patch-1 | 2019-01-07 16:43:50 +00:00
Denys Smirnov | bf2afd3409 | textencoding: merge differences if applied twice | 2019-01-06 19:20:35 +02:00
Emir Ribić | fb50cd2c0d | Update invoice_test.go | 2019-01-06 16:50:22 +01:00
Denys Smirnov | e740aba6c5 | textencoding: fix a PDF output for simple encodings; fix #293 | 2019-01-06 13:51:00 +02:00
Gunnsteinn Hall | bf47fc5b6e | Merge branch 'v3' into encodings | 2019-01-05 17:18:16 +00:00
Denys Smirnov | 4a376ec651 | textencoding: define WinAnsi directly instead of using CP1252 | 2019-01-05 18:32:53 +02:00
Peter Williams | 72c7fd37d0 | (*pageText). -> pageText. | 2019-01-05 14:10:54 +11:00
Peter Williams | 6b1764c118 | (*pt). -> pt. | 2019-01-05 09:14:10 +11:00
Peter Williams | 4aa7e5051e | Changes missed in previous commit. | 2019-01-04 16:07:03 +11:00
Peter Williams | e251b6b2f2 | Made TextList an opaque struct and renamed it to PageText to reflect its purpose rather than its current implementation. | 2019-01-04 16:02:22 +11:00
Peter Williams | 4cb130c31f | Fixed some typos. | 2019-01-03 15:41:36 +11:00
Peter Williams | a493fce496 | Merge branch 'v3' of https://github.com/unidoc/unidoc into text.fixes | 2019-01-03 15:16:38 +11:00
Denys Smirnov | aeea76f4dd | fonts: read ttf font data once | 2019-01-02 17:18:43 +02:00
Denys Smirnov | 0fe2f0a27a | textencoding: alias x/text/transform import to avoid confusion | 2019-01-02 17:03:03 +02:00
Denys Smirnov | 203b620067 | textencoding: init other encodings once and reformat tables | 2019-01-02 16:54:37 +02:00
Peter Williams | 2f2b5c6ec1 | Made many fields in text.go private. | 2019-01-02 10:39:30 +11:00
Denys Smirnov | 0327d18eb6 | textencoding: remove all unrelated methods from the interface | 2019-01-01 23:24:11 +02:00
Denys Smirnov | 7a2cd35f48 | fonts: rebuild font metrics tables based on runes for standard fonts | 2019-01-01 22:40:11 +02:00
Denys Smirnov | 2e820f3ac5 | textencoding: remove unused rune <-> glyph methods from the interface | 2019-01-01 22:15:22 +02:00
Denys Smirnov | 1742cb9c89 | textencoding: drop old simpleEncoder, use the new implementation | 2019-01-01 21:17:57 +02:00
Denys Smirnov | 3c5fc18b01 | textencoding: refactor encodings; better handling for differences | 2019-01-01 17:20:01 +02:00
Denys Smirnov | 622ae5668d | textencoding: generate table for WinAnsi encoding from CP1252 | 2019-01-01 17:20:01 +02:00
Denys Smirnov | ac7696693b | fonts: describe a few issues with the code; remove unused cmap type | 2019-01-01 17:19:58 +02:00
Peter Williams | 57e6b41ef1 | Merge branch 'v3' of https://github.com/unidoc/unidoc into text.fixes | 2019-01-01 17:34:04 +11:00
Peter Williams | aaf47e1479 | Font reading code returns partial font info for unsupported fonts. This allows calling code to check font types, which is useful for giving information about PDF files. | 2019-01-01 17:29:49 +11:00
Peter Williams | ca2b73bd7a | Removed combineDiacritics from text extraction because it was causing ' and " to be combined with the letters preceding them. Need to fix this and reinstate combineDiacritics. | 2019-01-01 12:22:39 +11:00
Denys Smirnov | 83d8086657 | model: reformat TODOs | 2018-12-28 16:48:38 +02:00
Denys Smirnov | f6506204d7 | fonts: simplify code by getting width of runes in font instead of glyphs | 2018-12-28 01:38:48 +02:00
Denys Smirnov | 107718c711 | fonts: comment about Wy font metric | 2018-12-28 01:08:50 +02:00
Denys Smirnov | eb04b2d594 | fonts: remove unused name field in char metrics | 2018-12-28 01:08:47 +02:00
Denys Smirnov | 87ebf6af8f | creator: don't use fmt if not needed | 2018-12-28 01:03:15 +02:00
Gunnsteinn Hall | 99a19b0b8d | remove duplicate log | 2018-12-27 17:42:12 +00:00
Gunnsteinn Hall | 8f031e7bdb | remove panic in extractor | 2018-12-27 17:18:52 +00:00
Denys Smirnov | dbbef4fd05 | Merge remote-tracking branch 'peterwilliams97/extract.text' into extract.text. Conflicts: pdf/extractor/text.go | 2018-12-27 12:40:55 +02:00
Denys Smirnov | 8835230856 | model: fix tests after the merge | 2018-12-27 12:37:32 +02:00
Peter Williams | c70b66a00d | Fixed incorrectly named variable. | 2018-12-27 21:33:31 +11:00
Denys Smirnov | 53687f854e | Merge remote-tracking branch 'origin/v3' into extract.text. Conflicts: pdf/contentstream/processor.go, pdf/extractor/text.go, pdf/extractor/utils.go, pdf/internal/textencoding/winansi.go, pdf/model/font.go, pdf/model/font_composite.go, pdf/model/font_simple.go, pdf/model/font_test.go, pdf/model/fontfile.go, pdf/model/fonts/ttfparser.go, pdf/model/structures.go | 2018-12-27 12:17:28 +02:00
Peter Williams | 2fe54a4269 | Merge branch 'extract.text' of https://github.com/peterwilliams97/unidoc into extract.text | 2018-12-27 20:53:59 +11:00
Peter Williams | 28957d37b8 | fixed comment | 2018-12-27 20:53:37 +11:00
Peter Williams | af99ee41db | Recurse through form XObjects for text extractions. | 2018-12-27 20:51:34 +11:00
Denys Smirnov | e729fa618d | model: refactor CharcodesToUnicode to return string and remove TODO | 2018-12-26 17:11:41 +02:00
Denys Smirnov | db8e50e457 | model: fix wording in the comments | 2018-12-19 16:59:13 +05:00
Denys Smirnov | 217f984033 | fonts: make standard font names type-safe | 2018-12-19 16:55:27 +05:00
Denys Smirnov | 85e1a02ac8 | model: define an unexported pdfFont interface and remove error cases | 2018-12-19 13:54:45 +05:00
Denys Smirnov | 7f667d8fbb | model: remove Standard14Font in favor of fonts.StdFont; resolves #269 | 2018-12-19 13:43:09 +05:00
Denys Smirnov | 5bf2527b57 | creator: clarify use of the default encoding and a way to override it | 2018-12-15 19:39:59 +05:00
Denys Smirnov | e3704defc7 | rename Typ1 font to StdFont | 2018-12-15 19:39:55 +05:00
Denys Smirnov | 19f95527b8 | creator: remove SetEncoder from top | 2018-12-15 18:49:15 +05:00
Denys Smirnov | 62420700db | fix case typos in errors | 2018-12-15 18:49:15 +05:00