HACKER Q&A
📣 bckr

How to transcribe 1000s of handwritten notes


I have 10 years’ worth of journals.

My handwriting is not great!

None of the off the shelf solutions come even close to recognizing my handwriting.

Can you think of anything better than just opening every single file and manually transcribing it?

I have been thinking about training a model to first divide the images into lines of text. Then, it will be easier to transcribe, and automatically those transcriptions will be associated with areas of the image, in case I figure out a good handwriting model.


  👤 throwaway211 Accepted Answer ✓
Can you read them? Speech to text perhaps. That can also be done locally.

If a note's a minute, 1000 notes are around 16 hours of reading. Scale time needed depending on if it takes less or more than a minute to read. Add a note reference to the start of each recording, like a zettelkasten, so the scanned file, recording and text cross-reference.

If assessing other solutions, that's at least an upper bound on the cost of any other solution.


👤 GianFabien
I have about 5000 pages of research notes. I have found that the quality and usefulness of the material varies greatly. Much of the older material is of little relevance with the passing of time. As futile it may seem, I'm finding that re-reading and summarizing rather than straight transcribing is effective. I'm refreshing my memory of what I did discover and only typing up what is relevant now. Fortunately I'm a fast touch typist, so I can stare at the handwritten page and type; only glancing at the screen after a paragraph or two. Two things I find useful to retain are the dates of the original materials and bibliographic references.

👤 simonw
I'll throw in another vote for AWS Textract, I've had great results for it against 19th century handwriting: https://simonwillison.net/2022/Aug/25/sfms-archive/

👤 TheMiddleMan
I've found decent success with Googles Cloud Vision API for transcribing cursive writing on the backs of 1000s of family photos.

https://cloud.google.com/vision/docs/handwriting

I threw together a basic UI with the transcribed text in an editable area next to the image where I would edit any adjustments as it wasn't 100% perfect.


👤 scovetta
Take photos of them, or cut the binding and scan them all, and then feed the work out to mechanical turk?

👤 pcherna
I have a hundred or so pages of handwritten letters in Hungarian, but got useless results from AWS Textract and from transkribus. However, I also have about the same number of pages (written by the same person) that I have already gotten hand-transcribed into Hungarian. How might I approach using the already-transcribed stuff to train some kind of AI model or text-recognition model to work on the rest?

👤 user_agent
A hint that might help at least partially: novadays for managing digital and handwritten notes I juse Joplin, but before that I was an avid Evernote user. Having a paid plan active gives you access to Evernote's OCR function on their backend. I had a lot of handwritten notes uploaded as attachments to Evernote, and I remember that despite my handwritnig being awful their softwre was able to parse it and allow me to, among others, perform quite advanced searches on my handwritten notes. I'm not sure if there's a way to make Evernote's OCR backend work for you in scenarios more elastic that what it's been built for, but I wanted to menion that there's this unique OCR tech that I think does far better job that any standalone OCR software I tried (for my handwriting style which I consider awful). It might be worth researching further for you.

👤 wriggler
Have you tried https://www.handwritingOCR.com?

It is designed to do exactly what you are looking for, and has been used very successfully by many others for that same purpose (I’m the founder).

It is not as cheap per page as Google Document AI, for example, but it does tend to be much more accurate for handwriting, so usually ends up cheaper when editing time is factored in.

If you find it does work well with your handwriting, please get in touch and I can try to fit the pricing to your use case.


👤 dougdimmadome
I was in a similar situation last month. Not quite 1000s of pages but close to 100. Just enough to make typing them out seem like too much work.

I found an app online (I wont even name it) which promised incredibly accurate handwriting transcription. Signed up and found it was true, but they were just sending images directly to chatGPT and returning the result and then charging a fee on top.

I started working on an open source version. It took me only a few hours and I'm sure anyone else could pull it together. used chatGPT example code to connect to API and send an image with a prompt along the lines of "please transcribe the text in this image and return only that, nothing else". even with that instruction it still sometimes prefaces with "sure! I can do that.", which I think is the AI equivalent of Homer Simpson writing "ok" in the "please leave this section blank" part of the form. Anyhoo, I had a basic job queue written, pull in images in order of file creation date and fire them off, append the text to a text file after. There was some cleanup of the file required (weird line breaks) but it saved me days of typing.

You still need a chatGPT API key for it but it does take a good bit of the work out.

At the moment I'm investigating using a free local model. LLava is just as accurate but takes longer than sending it to ChatGPT. but if you were worried about burning credits it would be the way to go.


👤 tmaly
I record myself reading my hand written notes, then I just upload the mp3 of the recording to MS 365 to transcribe.

I put special stop words like highlight/return so then I can post process and ensure the markdown formatting looks good.


👤 imvetri
I have my 3 years of paper, I wanted to use it to experiment building a black mass program. A blackmass program is a concept which will yield to a black mass in the computer, capable of building conceptual cool tech like automating your daily work, self experimentation, self learning etc.

My notes will have instructions to reach the black mass state, a computer image scanner will try to learn my handwritings, take them as instructions, connect dots etc.

The design of this system is cryptic and challenging. because, side effect to create a computational program will result in a circling thoughts for me. And its hard for me to convert it into an action.

Taking that as an inspiration, this program is a circling program, which means, it will constantly spiral upwards in a value that is definitive to its actions in the past.

All my notes has information or points or ideas about this fictional concept. I burned the notes which were repetitive, kept the rest.

When I did that, It created more head space for me. The headspace, helped to solve problems and have more space for more learnings.


👤 wilabroard
For anyone whose handwritten notes have equations or pictures, Mathpix is stellar. Their APIs can take PDFs as input and return markdown with latex and embedded images. The handwriting recognition is pretty good on my cursive -- good enough anyway that a plain old LLM like Llama 3 can fix the typos.

(Likely under the hood Mathpix has done exactly what you're proposing, with image segmentation, text/image/math classification, then transcription.)

I've been using an Apple Shortcuts automation that turns my handwritten PDFs into notes in Obsidian, with the transcription up top and the PDF embedded below. Could pretty easily be adapted to turn a library of PDFs into a folder of Obsidian markdown notes. Here's a writeup: https://riddle.press/a-marriage-between-handwritten-notes-an...


👤 praving5
If those notes are really worthy and meaningful to you, then hire someone to type them out for you. If there is something that money can buy, then save your time!

👤 freddealmeida
I built this firm a decade ago. https://www.cogent.co.jp/en/

Works with English and Japanese. Sadly I'm no longer with the team there but the work is solid. Try it out.


👤 mariocesar
It seems like using speech-to-text is a faster alternative. You can also consider outsourcing the work. I know abbyy.com offers a service for this. Even though you may not be their target market, they have services for implementing hybrid machine learning and data entry solutions.

If you're into dreaming up cool solutions, you could try using smart pens or tablets to write stuff and then teach a model to recognize your handwriting. But for now, it's just a dream.


👤 ant6n
Scan into pdf and organize them, keep as PDF.

You have to think about what your goal is. Handwritten notes can be perfectly digitized into handwritten notes. What do you need the ocr for? Publishing? … transcribe what you need, or better, rewrite.

Searching? As you scan, make a basic index so that you can refer to the notes. Organize the folders properly with your notes, use a useful naming scheme.


👤 constantinum
I'm unsure how recognisable your handwriting is, but the following tech understood mine.

Try LLMwhisperer[1] pdf extraction API. You are only one "curl" command away from extracting your handwritten text.

The best thing is it preserves the layout of your notes, which means it can keep tables as tables and lists as lists.

Check this screen grab for extracting handwritten notes > https://imgur.com/fXk0tcR

[1]: https://llmwhisperer.unstract.com/ [2]: Try it with your document here > https://pg.llmwhisperer.unstract.com/

[edited] added links


👤 sjhaba
Have you tried chatgpt? 10k image requests should be pretty cheap

👤 tcsenpai
Theoretical solution: train a model on your handwriting. There should be plenty of easy (relatively) to use apps and frameworks for that.

It will take time but you will have a pretty tailored solution.

Also of course: first of all try to process the images so that they only are white and black (not greyscale, actual B/W pictures)


👤 canucker2016
How about creating a crowdsourced captcha service?

Take scans of your journal pages, split the jpegs/pics into word fragments, display a couple of fragments to captcha clients, generate completed journal entries when the consensus gets reasonably high for each word fragment.

Not sure how captcha services start from scratch - probably ask around/check with google search.

Privacy goes out the door, but you should be able to show disjointed word fragments so no one could reconstruct enough of a single journal entry to expose your more personal info unless they were very determined. Or maybe split the scans into individual letter fragments instead?

Then monetize this for other people in the same situation...


👤 hm-nah
I’ve had good success with:

1. Scan them (or take photos of each page) 2. put them files in a directory 3. Make a Python script that sends them to OpenAI GOT-4o 4. Store the text as a new file in the directory.


👤 ZunarJ5
Archaeologist tool, you'll want to fine tune it for yourself.

https://readcoop.eu/transkribus/


👤 f_k
Shameless plug: https://getsearchablepdf.com

There's a free trial so you can check if it works for your handwriting.


👤 workergnome
I know you've said you've looked at off-the shelf tools, but in that did you consider https://www.transkribus.org/? It's a tool designed for reading historical, hand-written documentation—gets used a lot in archives and historical studies. Might be worth an evaluation to see if your handwriting is not great in similar ways to Dutch bankers from the 18th century.

👤 kjfarm
I've been playing with https://huggingface.co/microsoft/trocr-base-handwritten which has been pretty good so far. I want to take it and fine-tune on my own handwriting. For equations, I either use mathpix or just type them manually.

👤 piloto_ciego
I just tried ChatGPT on my handwritten notes, OCR can very seldom recognize my handwriting and it nailed it. It’s cheap, you should give that a shot.

👤 radiantspace
If you use Telegram, you can just voice message those to https://t.me/gienjibot, it uses OpenAI's Whisper under the hood, so recognition is superb and also you can immediately fix grammar with it. And yes, i'm both the creator of the tool and the happy user.

👤 Sandr44
I have scanning my handwritten notes also on my todo-list, some of them are even taken digitally. I have noticed that the offline ocr on Samsung (or maybe on Android devices generally) is pretty good, even with characters that don’t exist in English. Unfortunately there don’t seem to be implementations for batch scanning with Android handwriting ml kit or Samsung vision ocr

👤 heavyset_go
You might be able to manually transcribe some of the notes and then fine tune an existing handwriting recognition model using them.

👤 bcrl
Amazon's Textract seems to do a decent job on my horrific scribbles, and is far better than any of the open source OCR tools I tried. To get started quickly, try using Textractor: https://github.com/Artikash/Textractor

👤 angra_mainyu
Just scan them over the course of a few months, spending a couple of hours a day.

If your notes are anything like mine there might be arrows, drawings or text effects like underlines and circling of words that you'd want to conserve.

You can later ocr the whole thing.


👤 t312227
hello,

imho. (!)

* if you have a lot of "uniform" pages - read something like A4 -, get yourself a scanner with an automatic sheet-feeder

or throw some rainy-weekend afternoons on it & scan your notes with some decent SOHO scanner

* don't get too excessive with resolution, 400+ pixels/inch are enough for OCR ...

i always scan with 1200 and reduce the images to 600 px via simple batch-processing / for example imagemagick "convert".

* get yourself a decent OCR software, which is able to read your notes ...

i'm a big fan of abbyys "finereader", but sadly its prohibitively expensive ... ;)

idk how well FOSS OCR software a la tesseract works for hand-written notes.

* create pdfs with automatically detected text in the background for search and the scanned image of the notes.

it additionally generates XML-metadata & from there: whatever you want (web frontend ... :)

just my 0.02€


👤 jyriand
Have you tried Appke notes. It does a pretty decent job. Here is a example https://youtu.be/eoIIUpdhKZs?si=PXWdhTt0DmFjbrLs

👤 Parrot9141
I have a similar situation (especially when it comes to the handwriting) and am now trying to train my own tesseract model, which with around 12 pages of manually transcribed content starts to work.

👤 IanCal
A really important question here is why?

That will guide what's important. Inaccuracies aren't as much of a problem if you're using it as a search index where you'll return the image of your writing.


👤 SrFil
Which off the shelf solutions have you tried? https://www.transkribus.org/ is generally pretty good with hard to read texts.

👤 1970-01-01
I would approach it like this: how long does it take to type out one journal? How long does it take to research, trial, config, use, retweak, retry, and finally confirm one of a half dozen OCR solutions? Will your chosen solution(s) be available for another 10 years?

I'm sorry but the type it out solution seems the the best choice. You will probably remember something interesting by doing it that way.


👤 fred_is_fred
Would this work with Mechanical Turk? I wonder how much it would be.

👤 999900000999
Take a picture, upload it to chat GPT and see what happens.

If it works then scan all the pages and run though it with a script.

Shouldn't take you more than about an hour to code ( with Chat GPT!) in Python.


👤 pininja
Which off the shelf solutions have you tried?

👤 llama_person
If you have access to a GPU, try a vision-language model using Ollama, and feed it your notes. Might work out!

👤 65
AWS Textract has worked better for me than the other cloud OCR solutions.

👤 spaceship__sun
Heard of GCP Document AI? And if you're rich, use gpt4o.

👤 yosito
Pay someone on Upwork or Fiver to transcribe them for you.

👤 meristohm
Why do you want to transcribe the journals?

👤 brudgers
Can you think of anything better than just opening every single file and manually transcribing it?

No, because the work of manual transcription is a way of telling if transcribing them is worth doing. Or maybe pay someone to transcribe it. Spending money is also a good way to tell if something matters (assuming you have sufficient money).

Orthogonally, maybe building a system is what you really want to do (for many people that would be more enjoyable than revisiting old journal content).

Finally, starting from hand transcription is an entry point into rewriting what you wrote. Rewriting is writing and if there's publication on your roadmap, you will be rewriting anyway.

There's no easy way to write well. Good luck.


👤 hamishleahy
Try using the RATH analyser from github

👤 chaos_emergent
have you tried using gpt-4o? It's pretty incredible at recognizing handwriting.

👤 sid-
live text is an iOS feature you could experiment with

👤 dankwizard
in the time its taken you to look into this and procrastinate, you could have done it by hand

greener pastures