UGE
/ L2 / Intro to relational databases / Python project prototype
Background and goals
The introductory course to relational databases ends with a
project illustrating the use of a relational database together with
a simple CRUD web application.
The Python language could be a better fit compared to PHP for
this kind of assignment as both students and the teaching staff are
already familiar with the former from preceding units.
To showcase Python as a platform suitable for this purpose, a
small demo project specified in ./demo-project-specs.md
will be implemented.
The present document aims to motivate the functional and
technical choices which have been made.
Interpretation of the
specifications
Undefined aspects of the specifications are clarified as follows,
with the intent of keeping the project as simple as possible:
- the platform only deals with one currency (simoleons,
arbitrarily),
- the platform only supports one language (English,
arbitrarily),
- each user only has one account/wallet/balance (all three terms
synonymous),
- there is no notion of financial order (transactions are
immediate),
- only the effective transactions are logged, the rejected ones
are not,
- fees are billed on top of internal transfers instead of taking a
cut.
Database model
Entity-relationship model
The database model is fully described as types, tables and
constraints in ./sql/tables.sql. The file is annotated
with comments and remarks explaining the design choices and
highlights some additional constraints which remain to be enforced
at upper levels.
An entity-relationship diagram of the model can be generated from
the SQL table descriptions using a tool like SQLFairy (note:
SQLFairy crashes
with recent PostgreSQL dialects, so it is not usable here).
Consistency
The consistency of the account balance and the financial
transaction log can be ensured through multiple means such as
executing modification and insertions in SQL transactions at the
application level, or using triggers within the database itself. The
former option is chosen over the latter to avoid implicit
actions.
Additional constraints to prevent alterations of the financial
transaction log such as deletions are omitted here as the web
application will act as the sole gateway to interact with the
database and will offer no way of doing this kind of operations. For
the same reason, the use of stored procedures was not retained.
Web application
Interface with the
database
Object-Relational Mappers (ORM)
such as SQLAlchemy are
sometimes used to partially or fully abstract the database and its
entities behind objects in the Object-Oriented Programming (OOP)
sense. With those tools, it is possible to generate tables, indexes,
and constraints from annotated classes. They also allow automatic
querying and implicit joins to perform information retrieval, and
inserting and updating records in the database directly by setting
fields on linked objects.
However, the object-relational
impedance mismatch problems often render the use of ORMs
impractical especially for complex queries, more often than not
resulting in the developers bypassing the ORM altogether. The tools
also tend to introduce performance issues due to the implicit
queries they generate in situations in which developers do not
always expect them to.
In the context of an database course aiming to teach SQL, such
level of abstraction is also a hindrance by itself. Therefore, the
use of an ORM has been excluded for this project.
More direct and less intricate forms of mappings, such as
transparent result tuples to Python immutable named tuples
conversion, are preferred instead. Features of this kind, which are
directly provided by database drivers like psycopg, are often sufficient by
themselves.
It remains desirable to segregate SQL queries into their own
modules in order to follow the Single-Responsibility Principle (SRP).
Fortunately, libraries such as embrace allow easily
deriving Data Access Objects (DAO)
from sets of annotated raw SQL queries, exposing them as callable
Python functions.
CGI scripts vs. Services
The most simple and primitive way of making Python programs
accessible from a web browser would be through the Common Gateway
Interface (CGI).
Through it, the user is able to execute the scripts by visiting a
matching URL, to then visualise its output directly in their
browser.
The other approach is to make the program a server application as
a persistent service process, in charge of accepting, handling and
replying to network requests by itself.
The former, identical to the way the oldest PHP scripts are run,
has become less popular in favour of the latter, mainly for
performance and maintainability concerns for real applications
outside of trivial, independent, and self-contained scripts. For
those reasons, this project will be implemented as a service
process.
Comprehensive
vs. Light-weight frameworks
While bare Python could be used to create such a web service, it
is desirable to work at a higher abstraction level to focus on the
application-specific features rather than the standard protocol
implementation details.
Python frameworks such as Django offer comprehensive
solutions for building web applications based on the
Model-View-Controller (MVC
or MTV/MVT) design pattern, ensuring separation
of concerns while abstracting lower level logic. This kind of
complete solutions provide routing logic to map requests to
user-defined handlers, and are integrated with a Object-Relational
Mapper as well as with a templating engine to generate HTML pages
dynamically.
Other more light-weight frameworks such as Flask or the more
recent FastAPI instead
focus on the first part, taking care of unpacking and routing
requests to the user-defined handlers, while leaving the rest to the
application developers. This approach does not fully impose an
entire environment, and allows better composability with libraries
which can be freely chosen.
Because the use of an ORM is not desirable in this project for
the reasons detailed in a previous section, the choices of
frameworks is limited to these light-weight frameworks. Here,
FastAPI is preferred over Flask due to its more modern architecture,
using parameters and dependency
injection over thread-local global variables. The chosen
templating engine is Jinja for its
simplicity.
Project structure overview
./sql/tables.sql: database initialisation statementsqueries.sql: annotated SQL queries (from which the
DAO is derived)
./app/app_database.py: database connection pool and
transaction helperapp_sessions.py: (signed cookies) session data
management helpersapp_templating.py: template rendering helperapp_{account,wallet}.py: page-specific request
handlersapp.py: FastAPI web application entry point
./templates/: Jinja HTML templates./static/: static web resources (stylesheets)./flake.nix: project runtime and development
environment description
Security considerations
SQL injections are prevented by using proper query parameters
substitution, automatically handled by the embrace and psycopg
libraries.
Injections in rendered views are prevented by the automatic
sanitisation of inserted variables by the Jinja templating
engine.
Passwords are salted and hashed when stored in the database
to ensure some minimal protection of the data at rest.
The hashing is handled by the passlib
library, which also covers algorithm migrations.
Cross-Site Request Forgery (CSRF)
attacks are mitigated through the conjunctive use of POST requests
for user actions and SameSite
restrictions for session cookies. (note: this will become a
sufficient protection only when support in browsers will become
ubiquitous).
In its current state, the application does not implement any
kind of rate limiting. Such restriction would be needed for real
world applications in order to mitigate account password brute-force
attacks, but also to prevent users from avoiding transfer fees by
sending many small unbilled transactions.
Going further
Form validation could be refactored to make constraints more
easily composable and to improve error reporting to users with a
per-field granularity.
Error handling could be improved by the use of exceptions to
be turned into user-friendly error messages in a generic way, either
through decorators or with a middleware.
Request handlers could be made fully asynchronous after
migrating to psycopg3 to improve performance under a large amount of
requests.
Database schema migration will need to be implemented to ease
the deployment of subsequent versions of the application.
Functional project
improvements
A minimum transaction amount needs to be introduced to
prevent users from dodging the transaction fees by sending small
unbillable amounts.
The financial transaction log on the wallet page gets longer
over time and needs to be properly paginated.
Some stronger form of authentication could be used, either by
delegating the authentication to some other provider, or by adding
multiple-factor authentication.
Users should be offered the possibility to modify their
account data, delete their account, and to download all the data
concerning them.
The creation of new user accounts could require some email
validation. Users could be notified via email of incoming money
transfers.
The user interface could be translated into multiple
languages with the help of an internationalisation/localisation
library such as Babel.
Development environment
The development and execution environment is fully described
using the Nix language
and package manager as a Nix Flake in
./flake.nix. This allows creating reproducible
environments containing all the software dependencies of the program
itself, as well as optional development tools.
(Note: this projects requires Nix version 2.4 or
higher).
This environment can be used on NixOS, MacOS, or any Linux system
having Nix installed. In principle, Nix should also be usable on
Windows through the Windows Subsystem for Linux (WSL)
compatibility layer. Full virtual machines and containers can also
be derived from the same description file.
All the commands in this section have to be run within the
provided development shell, which can be entered by running the
following command at the root directory of the project:
Local database
The Nix Flake development shell provides its own self-contained
PostgreSQL server, configured to operate independently of any other
instances running on the same system.
All data are written to the
./development_database/pgadata directory. The database
server can be initialised by running the following command:
initdb --no-locale --encoding UTF8 --auth-host reject --auth-local peer
The local development PostgreSQL server can then be started by
running the following command, with the $PGHOST
environment variable automatically set by the development shell:
postgres -h "" -k "$PGHOST" -d 2
This server listens to local requests through a UNIX domain
socket located at ./development_database/.s.PGSQL.5432,
to which programs run in the development shell will implicitly
automatically connect to.
The development shell ships with both the psql and
pgcli tools to
interact directly with the database. The latter provides additional
features such as syntax highlighting and better auto-completion.
A new local database for the application can be created and its
table can be initialised with:
createdb app
psql app < ./sql/tables.sql
Should the need arise, this database can be deleted with the
following command before being created and initialised again:
Local application server
The server application can be started in development mode
with:
uvicorn \
--reload-dir app \
--reload-dir templates \
--reload \
--app-dir app \
app:main
This server will listen to incoming requests to a locally bound
port. It will automatically reload itself when its files are edited,
and display logs about type checking and runtime errors.
Production deployment
While the deployment phase is way out of the scope of a database
introductory course, asking students to deploy their application on
the Internet could be a motivational factor. It would also
facilitate evaluating and grading projects, avoiding deployment and
testing hassles to the teaching staff.
Standard daemon
The Nix Flake provides a Nix package which can be used to run the
web application in a production context as a daemon,
managed by a standard init system.
Docker container
A Docker
container can also be derived from this package to be deployed on
popular cloud hosting services.
Example of container
deployment
An example of deployment procedure using the free tier provided
by the Heroku hosting service
is given below. This makes use of the skopeo tool to
upload the Docker container to the service.
# Log in to an Heroku account.
heroku login
# Create a new project with a random name, hosted somewhere in Europe.
heroku create --region eu
# Set a local environment variable with the assigned name of the project so
# that the next commands operate on it.
export HEROKU_APP='name of the created app'
# Set a randomly-generated signed cookie secret key for our application.
heroku config:set COOKIE_SECRET_KEY=$(pwgen --secure 128 1)
# Attach a PostgreSQL database to the newly created app.
# This sets a connection URL in "DATABASE_URL" in the server's environment,
# containing the confidential database username and password.
heroku addons:create heroku-postgresql:hobby-dev --version=13
# Create the tables in the database.
heroku psql < ./sql/tables.sql
# Prepare a Docker container.
# This creates a Docker archive streaming script as `./result`.
nix build .#docker
# Log in to the Heroku container registry to upload our container.
skopeo login --username _ --password $(heroku auth:token) registry.heroku.com
# Upload the Docker image to Heroku (uploading about ~200MB).
./result \
| gzip --fast \
| skopeo --insecure-policy copy \
docker-archive:/dev/stdin \
docker://registry.heroku.com/$HEROKU_APP/web
# Deploy and launch the uploaded container.
heroku container:release web
# If all went well, the app should now be deployed and accessible on
# https://$HEROKU_APP.herokuapp.com
heroku open
# If not, logs can be remotely inspected for debugging.
heroku logs --tail
Conclusion
It took approximatively three days of work to the author to
architecture and implement the demo project, with prior basic
knowledge of SQL and Python, but with no prior familiarity with the
used libraries (FastAPI, Jinja, psycopg, nor embrace). The extensive
documentation of those dependencies has provided examples to quickly
obtain a working application. The tooling and development
environment allowed a quick, friction-less setup phase.
The level of abstraction provided by the framework and libraries
allowed focusing on the database schema and queries aspects, without
needing to develop a deep understanding of the intricate protocols
and standards linked to web systems. Their APIs did not impose, but
did encourage the developer to follow some best-practices regarding
the structure (separation of concerns) and the security (proper
query and template parameterisation, session handling) aspects of
the application.
Copyright and licensing
Copyright (C) 2021 Pacien TRAN-GIRARD.
This project is distributed under the terms of European Union
Public Licence version 1.2, a copy of which is provided in
./licence.txt.