[-] edie@lemmy.encryptionin.space 7 points 2 hours ago* (last edited 2 hours ago)

It's def. not an OCR error. We'd see other classic OCR errors if it were: 1, I and l are regularly mistaken for each other. I've seen "Vol. I" become "Vol. 1" a fair number of times.

Also if it was an OCR error, then... would it be a 2? So like she was 29 yo?


This user is suspected of being a cat. Please report any suspicious behavior.

[-] edie@lemmy.encryptionin.space 6 points 14 hours ago* (last edited 14 hours ago)

Lmao. From what book?



[-] edie@lemmy.encryptionin.space 2 points 18 hours ago

Yeah. As the comment I linked to said, you can also use Guix on top of Debian or Arch or something, depending on what you need.



[-] edie@lemmy.encryptionin.space 2 points 18 hours ago

Also see this comment https://hexbear.net/post/7209723/6810611



[-] edie@lemmy.encryptionin.space 2 points 18 hours ago

Two most important posts, by hello_hello:
https://hexbear.net/post/6283243
https://hexbear.net/post/6447348

And a little bit more info:
https://hexbear.net/post/6633884



[-] edie@lemmy.encryptionin.space 2 points 18 hours ago* (last edited 18 hours ago)

I've run Arch before, and I've come to the conclusion that I actually want something more "handholdy." I don't want to have to be the one to figure out what changes I need to make whenever I update. I quite like NixOS, but... [gestures at the last few months of posts about it], so maybe you want to check out Guix instead of getting into Nix right now.



Listen to АТАС -> The song ends -> Click on the YT icon to get to the frontpage -> It recommends АТАС -> Click on АТАС -> Repeat.



rage-cry nooooo you have to use arch, you have to install gentoo, you have to do LFS. You have to use the terminal for everything.

floppy-owl Mint is working just fine for me.



https://www.ynetnews.com/business/article/skeqnl08wg

Second paragraph, just below where the image cuts off:

In a response letter attached to the lawsuit, the Chinese fund said that since the outbreak of the war in Israel, Beijing has classified Israel as a “high-risk area” and imposed a ban on any new Chinese investments in the country, making it impossible to carry out the option.



17

The admins are arguing that because February is a shorter month, I get less cable than usual. What the heck! The agreement was I could eat 0.5 meter of cable per month!



[-] edie@lemmy.encryptionin.space 7 points 2 days ago

Wasn't me either. I'm saving my monthly allowance of cable for later.



[-] edie@lemmy.encryptionin.space 9 points 2 days ago

Support Taiwan's return to the UN.

Taiwan was never in the UN. The Republic of China, representing China, was. And was replaced by the PRC representing China.



[-] edie@lemmy.encryptionin.space 15 points 3 days ago* (last edited 3 days ago)

@Magic8Ball@hexbear.net

Edit: shit i think the bot isn't working



29
submitted 4 days ago* (last edited 3 days ago) by edie@lemmy.encryptionin.space to c/book_requests@hexbear.net

For me, the most time-consuming part of making an ebook has been transcribing the text. OCR certainly helps immensely, but there are still errors and problems. If you have some time and want to help, you can transcribe PDFs into HTML.

Step 1: Find a PDF

The first step is to look for the best possible PDF.
A "true text" PDF (as opposed to an image with OCR-text PDF) is the best case scenario, as it can be transformed into an EPUB using Calibre.^[I will enable the "Do not split on page breaks" and set "Split files larger than" to 0/Disable in EPUB Output, as otherwise it can split in weird places] You will likely still need to find a scan of the book to compare to.
If you can’t find a "true text" PDF, look for the highest quality scan. If the PDF doesn’t contain OCR, or it’s of bad quality, you’ll have to do OCR yourself. I usually use ocrmypdf to do so, although I haven’t gotten Calibre to create an EPUB from the outputted PDF with OCR, and have instead used ocrmypdf for its --sidecar option, which gives me plain text.^[In this case ocrmypdf is essentially just a front-end for tesseract (the actual OCR program) because tesseract can’t take PDFs as input] You can send me a message with a link to the PDF asking me to do the OCR or EPUB-ification and I’ll send back the result. If you don’t need my help, do still send me a message saying you intend to transcribe the book, and do include links.
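For anyone who would rather script the OCR step, here is a minimal Python sketch of calling ocrmypdf with its --sidecar option. The function names and the output file naming (book_ocr.pdf, book.txt) are my own assumptions for illustration, and it assumes ocrmypdf is installed and on your PATH.

```python
import subprocess
from pathlib import Path


def build_ocr_command(scan: Path, out_pdf: Path, sidecar: Path) -> list[str]:
    """Build the ocrmypdf command line.

    --sidecar tells ocrmypdf to also write the recognized text
    to a plain-text file next to the OCR'd PDF.
    """
    return ["ocrmypdf", "--sidecar", str(sidecar), str(scan), str(out_pdf)]


def run_ocr(scan: Path) -> Path:
    """Run OCR on `scan` and return the path of the plain-text sidecar.

    The output names are a hypothetical convention:
    book.pdf -> book_ocr.pdf and book.txt
    """
    out_pdf = scan.with_name(scan.stem + "_ocr.pdf")
    sidecar = scan.with_suffix(".txt")
    # check=True raises CalledProcessError if ocrmypdf exits with an error.
    subprocess.run(build_ocr_command(scan, out_pdf, sidecar), check=True)
    return sidecar
```

Calling run_ocr(Path("book.pdf")) should leave you with book.txt to transcribe from. The OCR'd PDF is a by-product here, since only the plain text is needed.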

Step 2: Transcribe

Once you have your EPUB or plain text, you’ll start on the actual "transcription."
The EPUB can be opened in Calibre’s ebook editor (simply click "Edit book" in the UI), or opened with a ZIP manager so the HTML file can be extracted and edited with your favourite editor. The plain text can be opened in any kind of editor^[Maybe also something like MS Word or LO Writer, although I don’t know if it can export as HTML or how well that will work, so I’d rather you use something else], even notepad, but you may want to use a more capable editor like [EDIT: notepad++ and VSCode are not good. TODO: suggest something else].

Then your job is to:

  • Proofread. OCR is good, but not perfect. The true-text PDFs are usually great, but may still have minor errors, so you should proofread no matter what source you have.
  • Ensure headers/titles are inside heading elements, and that they are at the correct level. Do not use a heading for its font size; use it for its semantic meaning! The book’s title is h1, so the chapter titles are likely to be h2, unless they are inside a part, in which case the part title is h2 and the chapter titles are h3. Also see the Standard Ebooks’ SEMOS for more information.
  • Ensure paragraphs are inside <p> elements. For EPUBs that means ensuring there aren’t any incorrect breaks in the <p>’s; this especially happens on page breaks. For plain text it means adding <p> and </p> around each paragraph.
  • Ensure italicized text is inside <i>^[You may see on this MDN link that <i> is not always the correct element, and that there are other "more appropriate" elements for some things, and this is true. It would be nice if you did use the appropriate element, otherwise I will have to change them, but you don’t have to. See the Standard Ebooks’ SEMOS for when an element is appropriate] elements.
  • Re-create tables with <table>.
  • Re-create lists with <ul>/<ol> and <li> elements.
  • Ensure footnote/endnote numbers are linked with <a>; the actual footnote should be moved to the end of the chapter or section, or to the end of the book. The paragraph element of the footnote should be given an id so that the <a> can reference it with href=#insert_id_here (the hashtag is necessary; do not replace it with the id).

(TODO: Anything else?)
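If you are starting from plain text, wrapping the paragraphs by hand gets tedious. Below is a rough Python sketch of that one step. It assumes paragraphs are separated by blank lines, and the function name is just something I made up for illustration. It does not try to rejoin paragraphs that were split across a page break or fix hyphenated line endings, so you still need to proofread the result.

```python
import html
import re


def paragraphs_to_html(text: str) -> str:
    """Wrap each blank-line-separated paragraph in <p>...</p>."""
    # Split on one or more blank lines (lines that are empty or whitespace-only).
    blocks = re.split(r"\n\s*\n", text.strip())
    out = []
    for block in blocks:
        # Join hard-wrapped lines into a single line.
        joined = " ".join(line.strip() for line in block.splitlines() if line.strip())
        if joined:
            # Escape &, < and > so the text is valid inside an HTML element.
            out.append(f"<p>{html.escape(joined, quote=False)}</p>")
    return "\n".join(out)
```

Headings, italics, lists, tables and footnotes still have to be marked up by hand afterwards; this only handles the plain paragraphs.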

For EPUBs, Calibre likes to add a bunch of unnecessary stuff, like <span>s and class="calibre1". So before you start working on it, you may want to remove them.
If you are using Calibre’s ebook editor, or some other editor with regex capabilities, you can remove them by opening up the find and replace, selecting "Regex" as the mode (or however that works in your editor), inputting class=".*?" in Find and nothing in Replace, and clicking Replace all. Then do the same with <\/span> in Find (and still nothing in Replace).
Also, if you are using Calibre’s ebook editor, it has a preview on the right, and it may not look good. But don’t worry: your job is to ensure the text and HTML are correct, and I’ll make the ebook and ensure it looks good.
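The same cleanup can be sketched in Python if you are working on an extracted HTML file outside Calibre. Note that the patterns here are my own variation and not exactly the ones given above: this version also removes the space in front of the class attribute, and strips both opening and closing span tags in one pass. The function name is hypothetical. Regex is a blunt tool for HTML, so any span carrying something you want to keep (an id, for example) would lose it.

```python
import re


def strip_calibre_markup(markup: str) -> str:
    """Remove class attributes and span tags, keeping the text inside them."""
    # Remove class="..." attributes together with the whitespace before them.
    markup = re.sub(r'\s+class=".*?"', "", markup)
    # Remove opening and closing span tags; their contents are left in place.
    markup = re.sub(r"</?span\b[^>]*>", "", markup)
    return markup
```

For example, `<p class="calibre1"><span class="calibre2">Hello</span> world</p>` comes out as `<p>Hello world</p>`.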

Step 3: ???

Once done, send the finished HTML (or EPUB if you’ve been working inside one) over to me. Otherwise, if you are making an EPUB for yourself and not just transcribing for ComLib, I suggest you look at Standard Ebooks’ Step by Step guide.

Step 4: Profit!

Your EPUB is now ready

43


31
submitted 1 week ago* (last edited 1 week ago) by edie@lemmy.encryptionin.space to c/slop@hexbear.net

He also talked about femboys. Putting this in slop cuz I really don't know where to put it and it's Gunther

Live from Gunther's POV: https://www.youtube.com/live/7s34sgu7R7o



14
Neither left nor right. Auth-trans! (lemmy.encryptionin.space)
43
submitted 2 weeks ago* (last edited 2 weeks ago) by edie@lemmy.encryptionin.space to c/games@hexbear.net

Frostpunk is such a liberal game. Not my freedom being taken away by the big bad totalitarianist.

If people didn't become so fucking hopeless EVERY TIME a storm is approaching, I wouldn't care about signing this law. Our city has 15 days' worth of food rations stockpiled and 13 days of coal, for a storm that lasts 2 days! All but the guard towers and the infirmaries are run by automatons. I have 13 automatons just... standing, doing nothing. Everything researched and fully upgraded. What's their problem?? I cannot make it any better than it is!



73

This is a picture of a 2008 Ford Interceptor from Cumming, Georgia.

Photo found on https://policecararchives.org/georgia/forsyth.html



5
Revisionist! (tankie.tube)

cross-posted from: https://lemmy.encryptionin.space/post/15009

Completely original by me.



28
submitted 3 weeks ago* (last edited 3 weeks ago) by edie@lemmy.encryptionin.space to c/main@hexbear.net

Completely original by me.



33
PSA: Lemmy searches in alt text (lemmy.encryptionin.space)

I am sure many of us know the importance of adding alt text to help the visually disabled. However, like many other things intended for the disabled that end up also being useful for the abled, alt text can also be useful when trying to find something in an old post, as lemmy's search (as bad as it is) also searches in posts' alt text.

So get to transcribing text!



32

https://commons.wikimedia.org/wiki/File:Chimmie_-_Xenia_the_Linux_Vixen_-_gangsta.svg



67
I remade the Apology form as an SVG (lemmy.encryptionin.space)
submitted 3 weeks ago* (last edited 1 week ago) by edie@lemmy.encryptionin.space to c/chapotraphouse@hexbear.net



edie

0 post score
0 comment score
joined 3 weeks ago