this post was submitted on 25 Jun 2023
30 points (100.0% liked)

Asklemmy


Title. I got hold of a couple of books (one is text, the other is images) that I would like to make available to others, and they don't exist in digital format. Photos to PDF? Or something that converts images to text?

top 21 comments
[–] [email protected] 9 points 1 year ago

First you need to get images of them. Better-quality images make the OCR step more accurate. Then convert the images to text.

Make a frame to hold the camera at the right height above the page. Good lighting.

It's not quick unless you have the hardware.

Another option is to send the book to a place that will scan it for you. Google for options.

[–] [email protected] 6 points 1 year ago* (last edited 1 year ago)

I don't know how you're going to get a hold of the text from the images. But I do know that if you're trying to create a book file, PDFs are not the answer. EPUBs are far better, and an open standard. I recommend creating them using the Calibre EPUB editor.

The reason EPUBs are better is because they were designed specifically for books. They're reflowable (meaning the pages aren't fixed-size, and therefore can be read on devices of all sizes), whereas PDFs have fixed content, and are very difficult to read on small things like phones and e-readers, requiring zooming just to see the text. Also, EPUBs aren't very difficult to create. You just have to know how XML works. It's basically just a zipped directory containing markup files.
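To make the "zipped directory containing markup files" point concrete, here is a minimal sketch that builds a one-chapter EPUB 3 file using only the Python standard library. The file names (`OEBPS/content.opf`, `nav.xhtml`, `chapter1.xhtml`), the placeholder identifier, and the `build_epub` helper are assumptions made for illustration, not something from this thread; for a real book, an editor such as Calibre handles these details for you.

```python
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# The identifier and modification date below are placeholders.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2023-06-25T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>Contents</title></head>
<body>
  <nav epub:type="toc">
    <ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol>
  </nav>
</body>
</html>
"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body>
<h1>{title}</h1>
{body}
</body>
</html>
"""


def build_epub(target, title, text):
    """Write a one-chapter EPUB to `target` (a path or a binary file object)."""
    # Each blank-line-separated block of OCR text becomes one paragraph.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    safe_title = escape(title)
    deflated = zipfile.ZIP_DEFLATED
    with zipfile.ZipFile(target, "w") as zf:
        # The mimetype entry must come first and must be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML, compress_type=deflated)
        zf.writestr("OEBPS/content.opf", CONTENT_OPF.format(title=safe_title),
                    compress_type=deflated)
        zf.writestr("OEBPS/nav.xhtml", NAV_XHTML, compress_type=deflated)
        zf.writestr("OEBPS/chapter1.xhtml",
                    CHAPTER_XHTML.format(title=safe_title, body=body),
                    compress_type=deflated)
```

Calling `build_epub("book.epub", "My Scanned Book", ocr_text)` would write a file that you can open in an e-reader app or unzip to inspect the markup inside.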

[–] [email protected] 6 points 1 year ago* (last edited 1 year ago) (1 children)

Our university library has a high-performance book scanner. Maybe yours has one too. It costs like 1 cent per 10 pages, but it's so fast that it might be worth checking. You can then extract the text from the images with Tesseract. There are some ready-made tools that will do this for you and make the PDF searchable.
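As a rough sketch of the Tesseract step, the snippet below builds and runs the `tesseract` command line for every scanned page in a folder. It assumes the `tesseract` CLI is installed and on your PATH and that the pages are PNG files; the folder layout and the helper names `tesseract_command` and `ocr_pages` are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image, out_base, lang="eng", searchable_pdf=True):
    """Build the argument list: tesseract IMAGE OUTPUTBASE -l LANG [pdf]."""
    cmd = ["tesseract", str(image), str(out_base), "-l", lang]
    if searchable_pdf:
        # The "pdf" config makes Tesseract write OUTPUTBASE.pdf with a text layer.
        cmd.append("pdf")
    # Without a config, Tesseract writes plain text to OUTPUTBASE.txt.
    return cmd


def ocr_pages(image_dir, out_dir, pattern="*.png", lang="eng"):
    """Run Tesseract on every matching image and return the expected PDF paths."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract is not on PATH")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for image in sorted(Path(image_dir).glob(pattern)):
        out_base = out_dir / image.stem
        subprocess.run(tesseract_command(image, out_base, lang), check=True)
        outputs.append(out_dir / (image.stem + ".pdf"))
    return outputs
```

Something like `ocr_pages("scans", "ocr_out")` would leave one searchable PDF per page image; merging those per-page PDFs into one file is a separate step not shown here.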

[–] [email protected] 2 points 1 year ago (1 children)

How does it flip between pages?

[–] [email protected] 4 points 1 year ago (1 children)

You do it by hand, but it goes very fast. It's like a table: you lay the book open (text side up), it scans, you flip, it scans. You don't have to open or close anything. Just flip about once every second. So for a book with 120 pages (60 double-page spreads) you would need about 60 seconds.
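The arithmetic behind that estimate, as a small sketch. It assumes two pages are captured per flip and one flip per second, and it reuses the "1 cent per 10 pages" figure from earlier in the thread; how a given library actually bills (per page or per scan) is not stated here, so treat the cost as a guess. The `estimate_scan` helper is hypothetical.

```python
import math


def estimate_scan(pages, pages_per_flip=2, seconds_per_flip=1.0,
                  cents_per_10_pages=1.0):
    """Return (flips, total_seconds, total_cents) for scanning a book."""
    # Each flip exposes one double-page spread.
    flips = math.ceil(pages / pages_per_flip)
    seconds = flips * seconds_per_flip
    # Assumption: the price is charged per book page, not per scanned image.
    cents = pages / 10 * cents_per_10_pages
    return flips, seconds, cents
```

With these assumptions, `estimate_scan(120)` gives 60 flips, 60 seconds and 12 cents.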

[–] [email protected] 3 points 1 year ago (1 children)

Usually I think of scanner like a photocopier, this sounds more like it takes a picture?

[–] [email protected] 5 points 1 year ago (1 children)

Well a photocopier takes a picture too 😉

[–] [email protected] 1 points 1 year ago (2 children)

You know what I mean. So it's more of a camera than a scanner?

[–] [email protected] 2 points 1 year ago

Well, it still has that bright light advancing across the page. But the sensor is like 1 m above the table. Honestly, I don't know if this makes it more of a camera or a scanner.

[–] [email protected] 1 points 1 year ago

My old uni library has 2 of those too. It's basically a fixed camera, but the table you set the book on has a sort of negative nook that lets you level out the book. So no matter which page you are on, the pages of the book will be at the same level.

[–] [email protected] 5 points 1 year ago

When I wanted to quickly scan two books to my Kindle, I used vFlat. It wanted money, but somehow I was able to scan both books without paying anything. I positioned my phone so that it could see the double page from above and then set the app to take a picture every X seconds (about 5, I think). Then I just flipped a page and it took a picture of the double page, created two PDF pages from it, fixed alignment and perspective, removed fingers from the corners and so on. Pretty good experience. I tried several FOSS alternatives before, but sadly none was as good as this.

[–] [email protected] 4 points 1 year ago (1 children)

there are book scanning services out there there are book scanning services out there

[–] [email protected] 9 points 1 year ago (1 children)

There are also book scanning services out there

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (1 children)

There are also book scanning services out there

wtf, it was only one when i clicked send lol

[–] [email protected] 5 points 1 year ago (1 children)

I heard there are book scanning services out there

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago)

Funnily enough I have heard there are book scanning services out there

[–] [email protected] 3 points 1 year ago (1 children)

If the books aren't too obscure, you might just be able to find an EPUB of them online. It's sort of a moral grey-area, but considering you already own the books I assume, you can very likely find them here.

[–] [email protected] 1 points 1 year ago

I don't own the books; they are from the library, which does not have digital versions available. I would like them to be more portable for me, and to share them with others who can't access or afford them. I get a lot from zlibrary, so I'd like to contribute when I can too.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

I converted several books to PDF using 1dollarscan. I think OCR was an option, but I just split the PDF and used Linux tools to OCR the resulting files that I wanted OCRed.

Edited to add:

> I don't own the books,

Oh. The scanning service above is destructive.
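The comment does not say which Linux tools were used to split and OCR the PDF. One possible pipeline, sketched here purely as an assumption, renders a page range to PNG with Poppler's `pdftoppm` and then runs `tesseract` on each image. Both tools must be installed, and the helper names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def pdftoppm_command(pdf, prefix, first_page, last_page, dpi=300):
    """Render pages first_page..last_page of a PDF to PNG files named PREFIX-N.png."""
    return ["pdftoppm", "-r", str(dpi),
            "-f", str(first_page), "-l", str(last_page),
            "-png", str(pdf), str(prefix)]


def tesseract_pdf_command(image, out_base, lang="eng"):
    """OCR one image into a searchable PDF named OUTPUTBASE.pdf."""
    return ["tesseract", str(image), str(out_base), "-l", lang, "pdf"]


def ocr_page_range(pdf, work_dir, first_page, last_page, lang="eng"):
    """OCR only the wanted pages; returns the per-page searchable PDFs."""
    for tool in ("pdftoppm", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not on PATH")
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        pdftoppm_command(pdf, work_dir / "page", first_page, last_page),
        check=True,
    )
    results = []
    # Glob the rendered images instead of guessing pdftoppm's zero padding.
    for image in sorted(work_dir.glob("page-*.png")):
        out_base = image.with_suffix("")
        subprocess.run(tesseract_pdf_command(image, out_base, lang), check=True)
        results.append(Path(str(out_base) + ".pdf"))
    return results
```

For example, `ocr_page_range("book.pdf", "work", 10, 12)` would OCR just pages 10 to 12 and leave the rest of the scan untouched.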

[–] [email protected] 1 points 1 year ago

I use Microsoft Office Lens for this kind of stuff. You take photos and they get aligned. The result can be saved as PDF and loads of other formats.

[–] [email protected] 1 points 1 year ago

There are some great guides at https://www.diybookscanner.org/ that you can check out. Some of them are a bit outdated, so I would recommend software like Tesseract; phone cameras should be good enough for general use.
