Building a high-performance XML file parser service

Special wafer map configuration files need to be processed before any wafer test data can be visualized in the front-end application. The purpose of the service was to search a defined location for these XML configuration files and parse them every hour. Each file goes through a number of validations, is processed accordingly, and is stored in an Oracle database. I needed to deploy a stable service that could process large numbers of files quickly. After reading a lot of blogs and recommendations from the Python community, I decided to go with this setup.
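To make the search-and-validate flow concrete, here is a minimal sketch of one pass of the hourly job. It is not the production code: the `WaferMap` root element, the `id` attribute, and the function names are all hypothetical, since the real schema and validation rules are not covered here.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def find_config_files(root_dir):
    """Return every .xml file under root_dir, sorted for a stable processing order."""
    return sorted(Path(root_dir).rglob("*.xml"))


def validate_config(path):
    """Return a list of validation errors; an empty list means the file passed."""
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    errors = []
    # Hypothetical rules: the real schema is not described in this post.
    if root.tag != "WaferMap":
        errors.append(f"unexpected root element <{root.tag}>")
    if not root.get("id"):
        errors.append("missing 'id' attribute on root element")
    return errors


def run_once(root_dir):
    """One pass of the hourly job: split files into valid and rejected."""
    valid, rejected = [], {}
    for path in find_config_files(root_dir):
        errors = validate_config(path)
        if errors:
            rejected[path.name] = errors
        else:
            valid.append(path.name)
    return valid, rejected
```

Returning the errors instead of raising on the first one keeps a single bad file from stopping the whole batch, and leaves a record of why each file was rejected.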

Major features:

  • Refactored code to allow switching out any part of the tech stack if necessary
  • Set up Flower, the Celery monitoring tool, to track worker uptime and jobs processed
  • Used the ElementTree library to parse the XML
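One way to get the "swappable parts" property together with ElementTree parsing is to keep the parser and the storage backend behind small interfaces. The sketch below is only an illustration of that idea: `WaferMapConfig`, `ConfigStore`, `InMemoryStore`, and the `<Grid rows cols>` element are hypothetical names, and the in-memory store stands in for the Oracle-backed one.

```python
from dataclasses import dataclass
from typing import Protocol
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class WaferMapConfig:
    # Hypothetical fields; the real configuration schema is not shown in the post.
    map_id: str
    rows: int
    cols: int


class ConfigStore(Protocol):
    """Anything with a save() method can act as the storage backend."""

    def save(self, config: WaferMapConfig) -> None: ...


class InMemoryStore:
    """Stand-in for the Oracle-backed store, handy for tests."""

    def __init__(self) -> None:
        self.saved: list[WaferMapConfig] = []

    def save(self, config: WaferMapConfig) -> None:
        self.saved.append(config)


def parse_config(xml_text: str) -> WaferMapConfig:
    """Parse one configuration document with ElementTree."""
    root = ET.fromstring(xml_text)
    grid = root.find("Grid")
    if grid is None:
        raise ValueError("missing <Grid> element")
    return WaferMapConfig(
        map_id=root.get("id", ""),
        rows=int(grid.get("rows", "0")),
        cols=int(grid.get("cols", "0")),
    )


def process(xml_text: str, store: ConfigStore) -> WaferMapConfig:
    """Parse a document and hand the result to whichever store was injected."""
    config = parse_config(xml_text)
    store.save(config)
    return config
```

Because `process` only depends on the `save` method, swapping the database, or swapping ElementTree for another parser inside `parse_config`, does not touch the rest of the pipeline.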
 