Web App 2024 ~ 2022

2024-12-03 JSON Processor :

  1. Select a folder first :
    . all JSON files inside the folder are collected automatically
  2. Toggle the [Write Next] button to process the next file automatically
  3. Processed data is written to a newly created or existing SQLite DB
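
  Sketch of the folder-to-SQLite flow above, in Python rather than the web app's own code. The table name `documents`, its columns, and the function name `process_folder` are assumptions made for illustration; the real schema is not described in these notes.

```python
import json
import sqlite3
from pathlib import Path


def process_folder(folder: str, db_path: str) -> int:
    """Write every JSON file in `folder` into a SQLite DB; return the file count."""
    # sqlite3.connect creates the DB file if it does not exist yet,
    # which covers both the "newly created" and "existing" cases.
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: one row per JSON file, payload stored as text.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS documents ("
            "filename TEXT PRIMARY KEY, payload TEXT NOT NULL)"
        )
        count = 0
        for path in sorted(Path(folder).glob("*.json")):
            # Parsing first means a malformed file fails before anything is written.
            data = json.loads(path.read_text(encoding="utf-8"))
            conn.execute(
                "INSERT OR REPLACE INTO documents (filename, payload) VALUES (?, ?)",
                (path.name, json.dumps(data, ensure_ascii=False)),
            )
            count += 1
        conn.commit()
        return count
    finally:
        conn.close()
```

  Because the insert is `INSERT OR REPLACE` keyed on the file name, running it twice on the same folder updates rows instead of duplicating them.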

2024-11-29 Add Chapter Header & Title Extraction Feature :

  1. Process the selected PDF file with Mozilla PDF.js :
    . the PDF file is also opened in the PDF viewer, and pages flip synchronously
    . use a self-developed algorithm to detect the header and subtitle of each chapter
    . use Compromise-NLP to detect and extract keywords, terms and numbers
    . (optionally format the text into paragraphs for easier review)
  2. The processed paragraphs are saved as JSON
  3. Use another web app <pdf processor> to write the data into a SQLite DB
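
  The detection algorithm itself is not described in these notes. The Python sketch below is only one plausible heuristic, not the app's actual method: it treats short lines starting with "Chapter N" or a single number as headers, and lines starting with dotted numbering (1.1, 2.3.4) as subtitles. All names, the regular expressions, and the 80-character limit are assumptions.

```python
import re

# Assumed patterns: "Chapter 3 ..." or "3 Title" is a header, "3.1 Title" a subtitle.
HEADER_RE = re.compile(r"^(?:[Cc]hapter\s+\d+\b|\d+\.?\s+[A-Z])")
SUBTITLE_RE = re.compile(r"^\d+(?:\.\d+)+\.?\s+\S")
MAX_HEADING_LEN = 80  # assumed cutoff: longer lines are treated as body text


def classify_line(line: str) -> str:
    """Return "header", "subtitle", or "body" for one line of extracted text."""
    text = line.strip()
    if not text or len(text) > MAX_HEADING_LEN:
        return "body"
    # Check the dotted form first so "1.1 Title" is not mistaken for a header.
    if SUBTITLE_RE.match(text):
        return "subtitle"
    if HEADER_RE.match(text):
        return "header"
    return "body"


def to_paragraphs(lines):
    """Tag each body line with the most recent header and subtitle."""
    header = subtitle = None
    paragraphs = []
    for line in lines:
        text = line.strip()
        kind = classify_line(line)
        if kind == "header":
            header, subtitle = text, None  # a new chapter resets the subtitle
        elif kind == "subtitle":
            subtitle = text
        elif text:
            paragraphs.append({"header": header, "subtitle": subtitle, "text": text})
    return paragraphs
```

  The returned list contains only plain dicts and strings, so it can be passed straight to `json.dumps` to produce the JSON file mentioned in step 2.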

2024-11-16 Add Table & Image Extraction Feature :

  1. Extract tables with pdf-table-extractor
  2. Extract images with PyMuPDF, or export them manually using Acrobat
  3. Each extracted item is inserted into the corresponding paragraph
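
  How an item is matched to its paragraph is not specified here. The Python sketch below assumes that every paragraph and every extracted item carries a page number and a vertical position, and attaches each item to the nearest paragraph on the same page. The field names `page`, `y`, `ref` and `attachments` are hypothetical.

```python
def attach_items(paragraphs, items):
    """Attach each extracted table or image to the nearest paragraph on its page."""
    for item in items:
        same_page = [p for p in paragraphs if p["page"] == item["page"]]
        if not same_page:
            continue  # no paragraph on that page: leave the item unattached
        # Nearest by vertical distance; ties go to the earlier paragraph.
        nearest = min(same_page, key=lambda p: abs(p["y"] - item["y"]))
        nearest.setdefault("attachments", []).append(item["ref"])
    return paragraphs
```

  Items are appended in input order, so sorting `items` by page and position beforehand keeps the attachments in reading order.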

2024-09-22 WPF PDF Extractor Test :

  1. Click the Open button or drag a PDF file onto the app
  2. Process the PDF by pages and paragraphs :
    . use a self-developed algorithm to detect and format text by keywords, terms and numbers
    . the PDF file is also opened in the PDF viewer, and pages flip synchronously
  3. Click a paragraph to generate a new chapter title