Hands-on digitizing texts with machine learning and AI

Are you interested in extracting text from scanned images, even poor-quality ones, and learning more about new advances in optical character recognition (OCR)? Join us for a 3-hour workshop on using machine learning and large language models to programmatically OCR images of text. The workshop will take participants through running Python code in collaborative notebooks to access a variety of OCR tools, including ones suited to texts that are poorly scanned or otherwise difficult to read.

This is a participatory workshop: you will have the opportunity to practice along with the instructors and to apply the skills in exercises on your own. Our goal is for you to walk away with the confidence and skills to use the software and address challenges as they arise.

The workshop is open to all VT community members. Some experience with Python is recommended, and you will need access to a Windows, Mac, or Linux computer. Instructions for setting up accounts with Kaggle, Hugging Face, and Llama will be provided before the workshop.

If you are an individual with a disability and desire an accommodation, welcome! Please email library-event-accessibility@groups.office365.vt.edu at least 10 days prior to the event. 

Date: Wednesday, May 21, 2025
Time: 9:00am - 12:00pm
Location: Newman 207A
Campus: Blacksburg Campus
Audience: Alumni, Faculty/Staff, Graduate Students, Postdoc, Public, Researchers, Undergraduates
Categories: Workshop, Workshop > Data Science
Registration has closed.

Presenter

Bipasha Banerjee, Chreston Miller & Jesse Sadler

Event Contact

Bipasha Banerjee

AI Research Scientist

Chreston Miller

Data & Informatics Consultant

Jesse Sadler

Digital Humanities Trainer and Project Consultant