Loading Django management commands from custom locations
2024-11-04
Django only allows loading management commands from Django apps - which can be a little annoying. This article describes how this can be circumvented to load management commands from custom locations.
Finding unused fixtures in your tests
2023-08-05
Do you have a lot of fixtures in your pytest tests? Are you sure all of them are used? Do you use dynamic fixtures? Learn how to detect unused fixtures with pytest-unused-fixtures.
Good practices for making reproducible open source code
UCL Photonics Society Transferable Skills Workshop Series, November 28, 2024
Learn how to make good quality reproducible code using the good practices and tools of open-source software development.
Aggregating data in Django using database views
EuroPython, July 10, 2024
Aggregating information is a common Django task, but using the aggregate method can be a bit cumbersome and in the case of large database tables, pretty slow as well. I will introduce the library django-pgviews-redux, which adds first-class support for database views (with Postgres), making that task much simpler.
With that library, database views are wrapped around models, meaning you get many of the features you rely on with models for free, like querysets and filtering on those, admin, and any other feature which works with models. Defining a view is almost as simple as defining a model, by specifying what fields there are for the model and defining the SQL.
This talk will walk through examples of aggregation in Django, and then show how those examples can be simplified using the library. Finally, we will get to materialized views, which store the result of the aggregation almost like a table in the database, providing big speed improvements when aggregating over large tables.
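The underlying idea of aggregating through a database view can be sketched without Django at all. The example below is only an illustration: it uses SQLite from the Python standard library rather than Postgres or django-pgviews-redux, and the `invoice` table, the `customer_totals` view and the helper function are hypothetical names, not taken from the talk.

```python
import sqlite3

# In-memory database with a hypothetical table and an aggregating view on top of it.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT, amount REAL);
    INSERT INTO invoice (customer, amount) VALUES
        ('acme', 100.0), ('acme', 50.0), ('globex', 25.0);

    -- The view holds the aggregation logic once, in SQL.
    CREATE VIEW customer_totals AS
        SELECT customer, COUNT(*) AS invoice_count, SUM(amount) AS total
        FROM invoice
        GROUP BY customer;
    """
)


def customer_totals(connection):
    """Read the view like an ordinary table, one row per customer."""
    cursor = connection.execute(
        "SELECT customer, invoice_count, total FROM customer_totals ORDER BY customer"
    )
    return cursor.fetchall()


print(customer_totals(conn))
```

Because the view is queried like a table, the aggregation does not have to be repeated at every call site, which is roughly the convenience that wrapping a view in a model is meant to give.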
Pytest: The Case For Using Classes
PyCon CZ, September 16, 2023
There are many reasons why people love pytest: the simple asserts, modular test setup, low barrier to entry, and the plugin ecosystem.
But at scale, using only function-based tests becomes hard to maintain and slow to run. At Xelix, our main Django monolith has about ten thousand tests and about four thousand fixtures, so speed and maintainability are of paramount importance.
In this talk I will present several reasons why we love to use classes to organise our tests and to share fixtures. Class-based tests let you be more explicit about which fixtures are available to a set of tests, and let you share fixtures between tests that are not in the same folder. Classes also make it possible to use fixture scope to greatly speed up your tests while keeping the fixture namespace clear. And finally, test classes allow you to do interesting things with inheritance and parametrisation, which would be a pain to do with functions only.
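A minimal sketch of the pattern, assuming pytest is installed. The class and fixture names are hypothetical and are not taken from the talk or from the Xelix codebase: a base class holds the shared tests, and each subclass supplies its own class-scoped `sequence` fixture.

```python
import pytest


class BaseSequenceTests:
    # The name does not start with "Test", so pytest's default collection
    # rules skip the base class itself; only the subclasses are collected.

    @pytest.fixture(scope="class")
    def sequence(self):
        raise NotImplementedError("subclasses provide the sequence fixture")

    def test_not_empty(self, sequence):
        assert len(sequence) > 0

    def test_sorted(self, sequence):
        assert list(sequence) == sorted(sequence)


class TestListSequence(BaseSequenceTests):
    # Class scope: the fixture is built once and shared by every test in this class.
    @pytest.fixture(scope="class")
    def sequence(self):
        return [1, 2, 3]


class TestTupleSequence(BaseSequenceTests):
    @pytest.fixture(scope="class")
    def sequence(self):
        return (1, 2, 3)
```

Each subclass inherits both tests, so running pytest on this file collects four tests, and the fixture available to each of them is visible in the class body rather than in a distant conftest.py.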
Xelix
Lead Data Engineer
October 2022 ‒ Ongoing
In addition to all the responsibilities of my previous role, I started leading a small team responsible for the pipeline of data coming into Xelix: importing data from customers' systems and running all our processing and analysis on it. This includes writing and orchestrating the pipeline, implementing algorithms and ML models, and performing the technical part of customer setup. Line managing two team members directly, and mentoring other team members. Communicating often with the product, customer success and implementation teams. As a senior member of the tech team I am also involved in hiring and strategic planning. Later in this role I started focusing heavily on developer experience, making improvements to the codebase, processes, infrastructure, and testing.
Software developer
October 2019 ‒ September 2022
Developing the backend of the Xelix platform using Django. Wrote new features and refactored or rewrote large parts of the codebase, scaling from a small number of customers to a much larger number of much larger customers. Introduced coding style checks and better dependency management early on. Focused heavily on testing: wrote custom tools and took ownership of keeping the tests running fast and reliably. Handled DevOps as well, using Terraform and AWS. Implemented the data science team's algorithms and ML models.
Eluvia (formerly SparkTECH)
Full-stack developer
May 2014 ‒ September 2019
Part-time. Developing commissioned and in-house systems in Django and AngularJS. Gained experience with building new systems and maintaining older ones, reviewing code and being the owner of projects. In later years involved with DevOps, pioneering CI, and handling automatic deployments and load balancing using GitLab CI, AWS and custom-made tools.
Red Hat Czech
Associate software engineer ‒ junior
October 2017 ‒ June 2018
Part-time. Supported by Red Hat to work on my bachelor's thesis to help the Czech Python community. The website naucse.python.cz, enhanced by my thesis, focuses on materials for teaching and learning Python.
Eleveo (formerly ZOOM International)
Testcase writer
April 2012 ‒ May 2013
Part-time. Writing manuals on how to test complicated call centre applications, and testing them before new releases.
University College London
2018 ‒ 2019, MSc Web Science and Big Data Analytics
A data science degree with a focus on the web and related concepts and technologies. Core modules focused on the structure of the web and retrieving information from it, elective modules focused on data analytics and machine learning, including neural networks. Graduated with distinction.
Fast calculation of the average
path length in large complex graphs
Master's thesis, available online
The thesis focused on speeding up the calculation of the exact average path length in large graphs, using pruning for purposes of calculating the pair-wise distances. In the second part that algorithm was used to further speed up an approximation using sampling. The algorithm was implemented in C++ with Python bindings and extensively tested.
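For context, the quantity in question can be computed naively with one breadth-first search per node. The sketch below is that plain baseline plus a simple source-sampling approximation, assuming an unweighted, undirected graph stored as an adjacency dict. It is not the pruning algorithm from the thesis, and all function names are hypothetical.

```python
import random
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def average_path_length(graph, sources=None):
    """Mean distance over ordered (source, target) pairs of reachable nodes.

    With sources=None every node is used as a source, giving the exact value.
    """
    sources = list(graph) if sources is None else list(sources)
    total = 0
    pairs = 0
    for source in sources:
        dist = bfs_distances(graph, source)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs if pairs else 0.0


def sampled_average_path_length(graph, sample_size, seed=0):
    """Approximation: run BFS only from a random sample of source nodes."""
    rng = random.Random(seed)
    nodes = list(graph)
    sources = rng.sample(nodes, min(sample_size, len(nodes)))
    return average_path_length(graph, sources)


# A path graph a - b - c - d as a small usage example.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(average_path_length(path_graph))
```

The exact version costs one full traversal per node, which is what makes it expensive on large graphs and why reducing the work per pair, or the number of sources, matters.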
Faculty of Information Technology,
Czech Technical University in Prague
2015 ‒ 2018, Bc Informatics
A general introduction to computer science, with a later specialisation in Web and Software Engineering, which included technologies used on the web, Big Data databases, searching in web and multimedia, information retrieval and other web techniques. Graduated with distinction and was awarded the Dean's Award for an excellent thesis.
Efficient and secure document rendering
from multiple similar untrusted sources
Bachelor's thesis, available online
A tool for running Python scripts from git repositories securely in separate environments with caching, OSS under MIT license. Primarily used at naucse.python.cz to allow rendering of courses from forked repositories. The thesis was defended with grade A and I was awarded the Dean's Award for an excellent thesis for it.
Archbishop Grammar School in Prague
2007 ‒ 2015
General education, but with a great computer science education as well, in elective subjects and after-school activities, mostly in Python: basics of graph and graphics algorithms, an introduction to logic programming, language design and interpreters, and C++.
pytest-unused-fixtures
Source
2023
Pytest plugin for identifying unused fixtures.
pytest-xdist-worker-stats
Source
2023
Pytest plugin for collecting statistics about xdist workers.
P4A Portal
Source
2022 ‒ 2023
Calculating statistics for the Project For Awesome charity livestream.
modbus2tcp-reader
Source
2018 ‒ 2021
A small project for reading and storing data from electricity meters. Originally for my father's school, ended up being deployed to analyse consumption of several buildings.
BetterSubBox
Source
2014 ‒ 2019
A discontinued tool for YouTube power users, extending YouTube subscriptions with extra functionality; originally my graduation project in high school. Written in Python and CoffeeScript, backend in Django, frontend in Angular. Involved work with YouTube's API, asynchronous actions with Celery, and optimising cache for performance.
Arca
Source
2018 ‒ 2019
Additional development of the tool Arca, which was the principal part of my bachelor's thesis.