Session: Tricks to catch mistakes in code and text

Zabbix employees (hundreds of people from all over the world) generate a lot of public information: open-source code, documentation, website data, etc.

By using simple, free, open-source Linux tools like ‘grep’ and ‘codespell’, I can regularly scan this information and catch and fix many mistakes, such as typos and duplications.

Paid software products that check English text are not designed to meaningfully analyse other types of information like C code or YAML configuration files.

Human review also has its limitations: people are not good at analysing thousands of lines of text on a regular basis.

This is where simple checks using tools like ‘grep’ shine. They are very fast, reliable and efficient.

I would like to cover the following parts:

  • ‘codespell’ tool for checking typos (options to fine-tune the tool, writing your own dictionary with the typos you have detected)
  • ‘fuzzy matching’ technique for discovering new typos
  • grepping for duplicate words and lines
  • finding blocks of duplicate data (Simian tool, free only for non-commercial or evaluation purposes)
  • grepping for Russian (Cyrillic) characters next to English (Latin) characters
  • grepping for non-UTF-8 characters
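
To give a feel for how small these checks can be, here is a minimal Python sketch of four of the ideas listed above: duplicate words, Cyrillic letters next to Latin letters, bytes that are not valid UTF-8, and fuzzy matching against a list of known-good words. It is an illustration only, not the commands used in the session: the function names, the regular expressions and the 0.8 similarity cutoff are all assumptions chosen for this example.

```python
import difflib
import re

# Assumption: a "duplicate word" is the same alphabetic word twice in a row,
# separated only by whitespace (for example "the the"), ignoring case.
DUPLICATE_WORD = re.compile(r"\b([A-Za-z]+)\s+\1\b", re.IGNORECASE)

# Assumption: "nearby" means directly adjacent. U+0400..U+04FF is the basic
# Cyrillic block, so this catches a Cyrillic letter hidden inside a Latin word.
MIXED_SCRIPT = re.compile(r"[A-Za-z][\u0400-\u04FF]|[\u0400-\u04FF][A-Za-z]")


def find_duplicate_words(line):
    """Return the words that appear twice in a row on one line."""
    return [match.group(1) for match in DUPLICATE_WORD.finditer(line)]


def has_mixed_script(line):
    """Return True if a Cyrillic letter sits directly next to a Latin letter."""
    return MIXED_SCRIPT.search(line) is not None


def is_valid_utf8(raw):
    """Return True if the given bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def suggest_typos(words, known_words, cutoff=0.8):
    """Map each unknown word to its closest known word, if similar enough.

    The cutoff is a hypothetical tuning value: higher means fewer, safer hits.
    """
    known = set(known_words)
    suggestions = {}
    for word in words:
        if word in known:
            continue
        close = difflib.get_close_matches(word, known_words, n=1, cutoff=cutoff)
        if close:
            suggestions[word] = close[0]
    return suggestions


if __name__ == "__main__":
    print(find_duplicate_words("This is is a test of the the tool"))
    print(has_mixed_script("Z\u0430bbix"))  # second letter is Cyrillic
    print(is_valid_utf8(b"caf\xe9"))  # Latin-1 bytes, not UTF-8
    print(suggest_typos(["recieve", "zabbix"], ["receive", "server"]))
```

Checks like these produce false positives (a legitimate "that that", for instance), so in practice the output is a list of candidates for a human to review rather than a list of confirmed mistakes.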

All these methods regularly find mistakes in the data that Zabbix publishes and help us deliver a high-quality product.

This session will be recorded

Presenters: