Sorcero Open-Sourcing Ingestum™ Framework For Unstructured-Data Text Extraction

WASHINGTON, March 22, 2021 /PRNewswire-PRWeb/ — Sorcero Inc., a rapidly growing Washington, DC enterprise AI software startup, announces the release of Ingestum™ (“ingest’em”), a free and open source software (FOSS) unified content ingestion framework that supports sourcing and transformation of a wide variety of data and document types into a uniform document format.

Ingestion of arbitrary and unstructured content formats—PDF files, Microsoft Office® documents, email threads, and so on—presents a challenge in the AI industry. The ingestion market is extremely fragmented. There are many niche players, and most AI firms handle the ingestion of unstructured text in-house. This is the challenge the Ingestum framework meets head-on: it is a methodical, reusable, extensible, and scalable framework for ingesting content, free and open to all.

Written in Python and built around reusable, programmable pipelines, Ingestum (from the Latin for “to ingest or toss in”) is largely agnostic of both source and output formats. It is designed to be extended through plugins, and it can be deployed as a command-line tool or as a web service. Ingestum integrates existing FOSS projects such as PDFMiner, Google’s Tesseract OCR engine, and Mozilla’s DeepSpeech speech-to-text engine.
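To make the idea of a reusable, programmable pipeline more concrete, here is a minimal Python sketch of the general pattern: a source is converted into a uniform document, which then passes through an ordered list of transformation steps. Every name in it (`Document`, `Pipeline`, `from_plain_text`, `collapse_whitespace`) is a hypothetical stand-in chosen for illustration. It does not reproduce Ingestum's actual API or document schema.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Document:
    """Uniform output format (hypothetical; not Ingestum's real schema)."""
    title: str
    content: str
    origin: str = ""


# A pipeline step is any function that maps one Document to another.
Transformer = Callable[[Document], Document]


def from_plain_text(text: str, origin: str) -> Document:
    """Hypothetical source adapter: first line is the title, the rest is content."""
    first_line, _, rest = text.partition("\n")
    return Document(title=first_line.strip(), content=rest, origin=origin)


def collapse_whitespace(doc: Document) -> Document:
    """Example step: collapse runs of whitespace into single spaces."""
    return Document(doc.title, " ".join(doc.content.split()), doc.origin)


@dataclass
class Pipeline:
    """Applies a named, ordered list of steps; reusable across sources."""
    name: str
    steps: List[Transformer] = field(default_factory=list)

    def run(self, doc: Document) -> Document:
        for step in self.steps:
            doc = step(doc)
        return doc


pipeline = Pipeline("text-cleanup", [collapse_whitespace])
result = pipeline.run(from_plain_text("Report\nline one\n\n  line   two", "memo.txt"))
```

In this sketch, supporting a new input format would mean writing another adapter like `from_plain_text`, and extending the processing would mean appending another step to `steps`. That is one plausible reading of "agnostic of source and output formats" and "extended through plugins", not a description of how Ingestum itself is built.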

“Data and Analytics executives tell us that unstructured documents are full of data they need but can’t access. We want organizations to benefit from AI and ingestion is a significant barrier. We think open-sourcing Ingestum will democratize ingestion,” said Dipanwita Das, Sorcero CEO & Co-founder.

“Ingestum leverages many existing open source projects, so no one has to reinvent the wheel; it can easily integrate existing workflows, or incorporate existing software as plugins,” said Walter Bender, CTO and Co-founder of Sorcero, who revealed Ingestum yesterday at the LibrePlanet FOSS conference.

Sorcero—recently featured at the LOINC and InsurTech NY conferences—invites IT directors, software engineers, and AI researchers to download and use Ingestum today (git clone https://gitlab.com/sorcero/community/ingestum.git).

About Sorcero:

Sorcero was founded in Washington, DC in 2018 by Dipanwita Das, Richard Graves, and Walter Bender. Sorcero’s Language Intelligence Platform uses domain understanding to power mission-critical decision-making across enterprises in life sciences and insurance. The company’s mission is to inform critical decisions to improve lives through access to and understanding of the world’s knowledge. To date, the company has raised $5.4 million in funding from Leawood Venture Capital, WorldQuant Ventures, Castor Ventures (the MIT Alumni fund), and H/L Ventures.

Media Contact

Claiborne Deming, Sorcero, (202) 750-4435, [email protected]

SOURCE Sorcero

