Extract Information from Documents/Data

Custom AI Solutions for Your Business Needs

INTRODUCING OCTOWIT AI

At Octowit AI, we understand the importance of extracting key information from various data sources and essential documents.
Precision in this step directly affects the accuracy of every subsequent task. That's why we draw on the latest AI and technology research to craft solutions tailored to your needs.

Solutions

Some of Our Work

We have tested our technology and workflows on the most common document types and raw data.

And it works.

Information Extraction from Images

Custom Shape Detection

Image Component Analysis

OCR + Text correction

{
  "id": 6987,
  "comic_texts": [
    {
      "id": 61713,
      "index": 0,
      "text_raw": "I THOuGHT WE WERE WINNING.",
      "text_cleaned": "I THOUGHT WE WERE WINNING.",
      "bounding_box": [
        187,
        56,
        282,
        106
      ]
    },
    {
      "id": 61714,
      "index": 1,
      "text_raw": "THIS IS NOT WHAT YOU EXPECTED Is It?",
      "text_cleaned": "THIS IS NOT WHAT YOU EXPECTED, IS IT?",
      "bounding_box": [
        433,
        346,
        525,
        430
      ]
    },
    {
      "id": 61715,
      "index": 2,
      "text_raw": "I HAVE WALKED AcrOss a ThOuSanD BattlEFIeLDs, young SHIKHANDI.NOT ONE'OF THOSE WAS A PLACE OF JOYOR GLORY.",
      "text_cleaned": "I HAVE WALKED ACROSS A THOUSAND BATTLEFIELDS, YOUNG SHIKHANDI. NOT ONE OF THOSE WAS A PLACE OF JOY OR GLORY.",
      "bounding_box": [
        663,
        475,
        861,
        587
      ]
    },
    {
      "id": 61716,
      "index": 3,
      "text_raw": "WAR IS THE DARK REFLECTION OF MAn's soul. it must be FACED,ACKNOWLEDGED BUT IT SHOULDNOT BE CELEBRATED,",
      "text_cleaned": "War is the dark reflection of man's soul. It must be faced, acknowledged, but it should not be celebrated.",
      "bounding_box": [
        643,
        629,
        849,
        723
      ]
    },
    {
      "id": 61717,
      "index": 4,
      "text_raw": "BuT WHAt OF ALL The stories? how The pOets sINg OF GLORIOUS DEEDS ANDGREAT TRIUMPHS",
      "text_cleaned": "But what of all the stories? How the poets sing of glorious deeds and great triumphs.",
      "bounding_box": [
        433,
        803,
        598,
        902
      ]
    }
  ],
  "datetime_modified": "2023-12-22T13:44:12.936237+05:30",
  "datetime_created": "2023-12-22T13:44:12.936255+05:30",
  "panel_num": 4,
  "panel_composite_image": "/121__18days%2325/composite-images/panel_004.jpg",
  "meta": {
    "panel_dims"
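
The step from `text_raw` to `text_cleaned` in the sample above can be approximated, in part, with a rule-based baseline. The sketch below is an illustration only and is not Octowit AI's method: the function name `baseline_clean` and its rules are assumptions. It unifies case (comic lettering is typically all caps) and repairs spacing after punctuation. It cannot split fused words such as `JOYOR` or insert missing commas, which is where a learned correction model would be needed.

```python
import re


def baseline_clean(text_raw: str) -> str:
    """Hypothetical rule-based cleanup for OCR output from comic lettering."""
    # Unify the mixed case that OCR produces on hand-lettered text.
    text = text_raw.upper()
    # Add the missing space when punctuation is directly followed by a letter.
    text = re.sub(r"([.,?!])(?=[A-Z])", r"\1 ", text)
    # Collapse repeated whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(baseline_clean("I THOuGHT WE WERE WINNING."))
print(baseline_clean("young SHIKHANDI.NOT ONE"))
```

On the first record of the sample, this baseline reproduces `text_cleaned` exactly; the later records show the kinds of errors it leaves behind.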

Information Extraction from Pdf

Document Structure Analysis

Component Grouping

OCR + Text correction

{
  "Date": "11/28/84",
  "PROJECT NAME": "GEMINI AWARENESS ATTITUDE & USAGE MONITOR - PORTLAN SUPPLIER: BURKE MARKETING RESEARCH NP-75",
  "Previous $ Commitments This Project": "$220,000",
  "Adjusted Total Cost of Project": "$221,000",
  "Field Start": "Aug. 1984",
  "Field Complete": "Dec. 1984",
  "Final Report Due": "Jan. 1985",
  "Total Area Budget": "$3,488,000.00",
  "Current Balance Available": "$17,195.66",
  "TEIS Change": "-$800.00",
  "New Balance": "$16,395.66",
  "Committed to Date (Current Year)": "$3,493,204.34",
  "Approved By": "TE Albert",
  "Project No": "1984-175NP",
  "Account Name": "New Products 681925147"
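
Extracted fields like the ones above can be sanity-checked before they reach downstream systems. The sketch below is a hedged example, not part of Octowit AI's pipeline: the helper names `parse_money` and `balance_is_consistent` are hypothetical, and the rule that "New Balance" equals "Current Balance Available" plus "TEIS Change" is an assumption inferred from the sample figures.

```python
from decimal import Decimal


def parse_money(value: str) -> Decimal:
    """Convert a string such as "-$800.00" or "$3,488,000.00" to a Decimal."""
    cleaned = value.strip().replace("$", "").replace(",", "")
    return Decimal(cleaned)


def balance_is_consistent(record: dict) -> bool:
    """Check the assumed rule: current balance + change == new balance."""
    current = parse_money(record["Current Balance Available"])
    change = parse_money(record["TEIS Change"])
    new_balance = parse_money(record["New Balance"])
    return current + change == new_balance


# Values copied from the extraction sample above.
record = {
    "Current Balance Available": "$17,195.66",
    "TEIS Change": "-$800.00",
    "New Balance": "$16,395.66",
}
print(balance_is_consistent(record))
```

Using `Decimal` instead of `float` keeps the comparison exact for currency amounts.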

Information Extraction from Data

Custom Key-Value extraction

Key normalization

Value normalization

{
  "Color": "Brown",
  "Projection": {
    "unit": "inches", 
    "value": "16.15"
  },
  "Diameter": {
    "unit": "cm", 
    "value": "0"
  },
  "finials": true,
  "length": {
    "type": "range",
    "min": {
      "unit": "cm",
      "value": "304.8"
    },
    "max": {
      "unit": "cm",
      "value": "431.8"
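
Key and value normalization of the kind shown above can be sketched as follows. This is a minimal illustration under stated assumptions, not Octowit AI's implementation: the alias table, the function names, and the raw input `"120-170 inches"` are all hypothetical. The input was chosen because 120 in and 170 in convert to the 304.8 cm and 431.8 cm seen in the sample.

```python
import re
from decimal import Decimal

CM_PER_INCH = Decimal("2.54")  # exact by definition

# Hypothetical alias table mapping raw attribute names to canonical keys.
KEY_ALIASES = {"colour": "Color", "color": "Color", "len": "length", "length": "length"}

# Matches ranges such as "120-170 inches" or "120 to 170 in".
RANGE_PATTERN = re.compile(
    r"\s*(\d+(?:\.\d+)?)\s*(?:-|to)\s*(\d+(?:\.\d+)?)\s*(?:inches|inch|in)\s*",
    re.IGNORECASE,
)


def normalize_key(raw_key: str) -> str:
    """Map a raw key to its canonical form, or return it stripped if unknown."""
    stripped = raw_key.strip()
    return KEY_ALIASES.get(stripped.lower(), stripped)


def inches_to_cm(value: str) -> dict:
    """Convert an inch measurement to a unit/value pair in centimetres."""
    cm = (Decimal(value) * CM_PER_INCH).normalize()
    return {"unit": "cm", "value": format(cm, "f")}


def normalize_length_range(raw_value: str) -> dict:
    """Parse an inch range and return it as a normalized range in cm."""
    match = RANGE_PATTERN.fullmatch(raw_value)
    if match is None:
        raise ValueError(f"unrecognized length range: {raw_value!r}")
    return {
        "type": "range",
        "min": inches_to_cm(match.group(1)),
        "max": inches_to_cm(match.group(2)),
    }


print({normalize_key(" LENGTH "): normalize_length_range("120-170 inches")})
```

Keeping the unit next to each value, as the sample does, lets consumers convert or compare measurements without re-parsing free text.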

Capabilities

Harness the Power of the Latest AI

Our expert team meticulously selects powerful advancements in AI, then experiments with them and tests them in production, to bring you the best building blocks.

Generative AI

Unlock the full potential of cutting-edge AI with our expertise in OpenAI's GPT, Google's Bard, and leading open-source models like Meta's LLaMA and Mistral.

Computer Vision

Transform visual data into insights with our AI-driven computer vision services. Custom solutions for real-time analysis and smarter business decisions.

Multi-lingual Support

  • Hindi
  • Kannada
  • Tamil
  • Punjabi
  • Gujarati
  • Assamese
  • Bengali
  • Bhojpuri
  • Malayalam
  • Marathi
  • Odia
  • Telugu
  • English

OCR

Our LLM-based OCR services offer highly accurate text recognition, transforming scanned documents into editable formats with ease.

Machine Learning

Harness deep learning and NLP through our machine learning services: advanced information extraction that converts complex data into strategic insights.

Our Technology Stack

Schedule a Strategy Call Today.

Get the best AI solutions.

Drop us a Message
