
UniPDF - PDF for Go

UniDoc's UniPDF (formerly unidoc) is a PDF library for Go (golang) with capabilities for creating, reading, and processing PDF files. The library is written and supported by FoxyUtils.com, where it is used to power many of the site's services.


Features

Multiple examples are provided in our example repository, https://github.com/unidoc/unidoc-examples, and documented examples are available on our website.

Contact us if you need any specific examples.

News

  • unidoc has been renamed to unipdf and is maintained under https://github.com/unidoc/unipdf
  • The old repository remains under https://github.com/unidoc/unidoc for backwards compatibility and will be read-only. All development is under the unipdf repository.
  • The initial release of unipdf v3.0.0 is compatible with Go modules from the start.

Installation

With modules:

go get github.com/unidoc/unipdf/v3

With GOPATH:

go get github.com/unidoc/unipdf/...

How can I convince myself and my boss to buy unipdf rather than using a free alternative?

The choice is yours. There are multiple respectable efforts out there that can do many good things.

At UniDoc, we work hard to provide production-quality builds, taking every detail into consideration and providing excellent support to our customers. See our testimonials for examples.

Security. We take security very seriously. Access to the github.com/unidoc/unipdf repository is restricted with protected branches, only the founders have access, and every commit is reviewed before being accepted.

The profits are invested back into making unipdf better. We want to make the best possible product, and to do that we need the best people to contribute. A large fraction of the profits goes back into developing unipdf. That way we have been able to get many excellent people to work on and contribute to unipdf who would not be able to contribute their work for free.

Contributing


All contributors must sign a contributor license agreement before their code will be reviewed and merged.

Support and consulting

Please email us at support@unidoc.io for any queries.

If you have any specific tasks that need to be done, we offer consulting in certain cases. Please contact us with a brief summary of what you need and we will get back to you with a quote, if appropriate.

Licensing Information

This library (unipdf) has a dual license, a commercial one suitable for closed source projects and an AGPL license that can be used in open source software.

Depending on your needs, you must choose one of them and follow its policies. Details of the policies and agreements for each license type are available in the LICENSE.COMMERCIAL and LICENSE.AGPL files.

In brief, purchasing a license is mandatory as soon as you distribute the unipdf software inside your product or deploy it on a network without disclosing the source code of your own applications under the AGPL license. These activities include:

  • offering services as an application service provider or over-network application programming interface (API)
  • creating/manipulating documents for users in a web/server/cloud application
  • shipping unipdf with a closed source product

Please see pricing to purchase a commercial license or contact sales at sales@unidoc.io for more info.

Getting Rid of the Watermark - Get a License

Out of the box, unipdf is unlicensed and outputs a watermark on all pages, which is perfect for prototyping. To use unipdf in your projects, you need to get a license.

Get your license on https://unidoc.io.

To load your license, simply do:

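// Assumes "fmt", "os", and the unipdf license package
// ("github.com/unidoc/unipdf/v3/common/license") are imported.
unidocLicenseKey := "... your license here ..."
err := license.SetLicenseKey(unidocLicenseKey)
if err != nil {
    fmt.Printf("Error loading license: %v\n", err)
    os.Exit(1)
}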
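
In practice you may prefer not to hard-code the key in source. The following is a minimal sketch of one way to do that, not part of unipdf itself: it reads the key from an environment variable and passes it to a setter function. The variable name UNIDOC_LICENSE_KEY and the helper applyLicense are hypothetical names chosen for illustration, and the setter is a stand-in so the sketch runs with only the standard library. In a real program the setter would wrap license.SetLicenseKey as shown above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// licenseEnvVar is a hypothetical variable name chosen for this sketch.
const licenseEnvVar = "UNIDOC_LICENSE_KEY"

// applyLicense looks up the key with lookup, trims surrounding whitespace,
// and hands the result to set. It returns an error if the key is missing,
// empty, or rejected by set.
func applyLicense(lookup func(string) (string, bool), set func(string) error) error {
	raw, ok := lookup(licenseEnvVar)
	key := strings.TrimSpace(raw)
	if !ok || key == "" {
		return errors.New(licenseEnvVar + " is not set")
	}
	if err := set(key); err != nil {
		return fmt.Errorf("loading license: %w", err)
	}
	return nil
}

func main() {
	// Stand-in setter (an assumption of this sketch): replace its body with
	// a call to license.SetLicenseKey in a real program.
	set := func(key string) error {
		fmt.Printf("would load a license key of %d bytes\n", len(key))
		return nil
	}
	if err := applyLicense(os.LookupEnv, set); err != nil {
		// Without a license the library still works but watermarks its output.
		fmt.Printf("Continuing unlicensed: %v\n", err)
	}
}
```

Passing the lookup and setter as functions keeps the helper easy to test and lets the caller decide whether a missing key is fatal or simply means watermarked output.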