Creating a Python Package with Poetry for Beginners Part 2

Intro
In the previous blog post we covered creating our package with Poetry, managing our development environment and adding a function. In this post we’ll cover the next steps in package development: documentation, testing and publishing to PyPI.
Note: I am using my package as an example but not actually publishing it to PyPI.
Documentation
When developing a package, documentation is one of the most important steps. It’s easy to get carried away with the fun of writing packages and functions and forget to document them. There are many reasons to write documentation, including:
- Purpose: Explains what the code does and why. Thinking about this as a developer can often help with design.
- Usability: It helps users (and your future self) understand the code.
- Maintenance: It will make debugging and updates easier.
- Standards: All good packages have good documentation. It is one of the key metrics of Litmus, our package validation service.
What Documentation Do We Need?
README
A README file is a short, essential guide that explains your Python package at a glance. It typically includes:
- Project name and description: What the package does and why it’s useful.
- Installation instructions: How to install it (usually with pip).
- Usage examples: Simple code snippets showing how to get started.
- Features or documentation links: What’s included and where to learn more.
- License and contribution info: How others can use or contribute to the project.
In short, the README helps users understand, install, and use your package quickly.
Instead of writing a README for my package, I’m going to point to the pandas README as a good example.
Docstrings
Docstrings are short, embedded documentation inside your Python code that explain what functions, classes, or modules do. They typically include:
- Purpose: A brief description of what the function, class, or module does.
- Parameters: Names, types, and descriptions of inputs.
- Returns: The output type and what it is.
- Example usage (optional): A small code snippet showing how to use it.
In short, docstrings make your code understandable, help tools like help() or IDEs provide guidance, and serve as the basis for auto-generated API documentation.
For a docstring example I am going to use my function get_season_league.
Here, we are using the Sphinx markup language to document the different input parameters and their datatypes, and any returned values. See the Sphinx documentation for further information.
def get_season_league(league_id="485842"):
    """
    This function will take your league ID, map over all the members
    of your league then return a DF with a week on week league table.

    :type league_id: str
    :param league_id: ID of the league you are targeting
    :returns: Data-frame of the league's week on week standings
    """
    api_url = "https://fantasy.premierleague.com/api/"
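To see how tools pick up a docstring, here is a minimal sketch using a made-up helper. The function points_gap and its parameters are hypothetical and not part of my package; the point is only that the same Sphinx-style fields can be read back with the standard library.

```python
import inspect


def points_gap(leader_points, chaser_points):
    """
    Return how many points the chaser is behind the leader.

    :type leader_points: int
    :param leader_points: Total points of the manager in first place
    :type chaser_points: int
    :param chaser_points: Total points of the manager being compared
    :returns: Difference in points (negative if the chaser is ahead)
    """
    return leader_points - chaser_points


# inspect.getdoc returns the docstring with its indentation cleaned up;
# help(points_gap) displays the same text in an interactive session.
print(inspect.getdoc(points_gap))
```

IDEs and documentation generators read the same text, which is why it is worth keeping the parameter descriptions accurate.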
Testing
Testing is another very important part of package development and has many benefits. Tests can be integrated into CI pipelines, meaning you can run them every time you push changes to a remote git repository. Some of the benefits of testing are:
- Thinking about tests whilst writing functions aids development
- Well-written tests catch bugs early
- Tests help ensure consistency between releases
There are lots of resources out there on writing tests for Python packages. We have two previous blogs on pytest, an introductory blog and a more advanced one. There are many testing frameworks available for Python, like unittest, pytest, or doctest (which runs docstring-embedded examples as software tests). The type of testing you need will often determine the framework you use. The software literature makes distinctions between different types of tests: unit (which we will focus on), integration, end-to-end, and acceptance tests. The distinction is based on scope (how much of the software project is run or touched during the tests), isolation (whether the tests rely on external services) and viewpoint (whether the tests check features from a user’s perspective, or how the software works internally from a developer’s perspective).
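As a quick illustration of the doctest approach mentioned above, here is a minimal sketch. The function weekly_average is a hypothetical example and not part of my package; the example in its docstring doubles as a test.

```python
import doctest


def weekly_average(points):
    """
    Return the mean of a list of weekly points.

    :type points: list
    :param points: Points scored in each game-week
    :returns: The average as a float

    >>> weekly_average([60, 70, 80])
    70.0
    """
    return sum(points) / len(points)


# Run only this function's docstring examples.
# Nothing is printed when every example passes.
doctest.run_docstring_examples(weekly_average, {"weekly_average": weekly_average})
```

This keeps the usage example and the test in one place, which suits small, pure functions. Functions that depend on external services are usually better served by pytest.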
Testing My Package
Thankfully my package only has one function so it will be very easy to write a test.
To begin, I’ll create a test file, tests/test_get_league.py. This follows the convention of naming the test file test_module_name. You may also see test files named test_function_name; which one you choose will depend on how large your modules are. The goal is for the naming to be consistent, easy to understand and ideally split up based on size.
I have added some simple tests for the class of the output, the columns returned and the first event in my default data, as this will remain the same. I’m not going to go into detail on how the tests work, as we have already covered this in the blogs mentioned above, but this is my test:
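import pandas as pd

# get_season_league must also be imported from the package;
# the import path depends on your package name.


def test_get_season_league():
    output = get_season_league()
    # Test pandas DataFrame is produced
    assert isinstance(output, pd.DataFrame)
    # Test columns are correct
    assert list(output.columns) == ["name", "team_name", "event", "points"]
    # Test first event as it will remain the same as the data grows
    first_event = output.query("name == 'Osheen Macoscar' & event == 1")
    # Compare the single value, as asserting on a whole Series raises a ValueError
    assert first_event["points"].iloc[0] == 69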
I have written a very surface-level test here. My particular function is hard to test as I’m calling an external API, meaning the output will differ each game-week. The API may also go down, or its output may change, causing the test to fail when my function hasn’t changed. When touching an external resource, ideally I would set up a static response to test against (which I could do for certain endpoints), but I can’t with my function as the output is supposed to change throughout the season.
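For endpoints where a static response does make sense, the idea can be sketched as follows. Everything here is an assumption made for illustration: the helper league_points, the endpoint path and the shape of the JSON payload are invented, and this is not how my function is written. The network call is passed in as an argument so a test can swap it for a canned response.

```python
import json
from unittest.mock import Mock
from urllib.request import urlopen

API_URL = "https://fantasy.premierleague.com/api/"


def fetch_json(url):
    """Download and decode JSON (the only part that touches the network)."""
    with urlopen(url) as response:
        return json.load(response)


def league_points(league_id, fetch=fetch_json):
    """Hypothetical helper: map each manager's name to their points."""
    # The endpoint path and payload shape below are made up for this sketch
    data = fetch(f"{API_URL}hypothetical-endpoint/{league_id}/")
    return {row["name"]: row["points"] for row in data["results"]}


def test_league_points_with_static_response():
    # A Mock stands in for the real API and always returns the same payload
    fake_fetch = Mock(
        return_value={"results": [{"name": "Osheen Macoscar", "points": 69}]}
    )
    assert league_points("485842", fetch=fake_fetch) == {"Osheen Macoscar": 69}
    fake_fetch.assert_called_once()
```

Because the test never reaches the real API, it keeps passing when the service is down or the live data changes, and it only fails when the function's own logic changes.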
Once we have written our tests, we can run pytest from the top level of our package to run the test(s), and it will tell us if they have passed or failed.
Publishing to PyPI
As I mentioned at the start of the blog, I am not publishing this package to PyPI. However, I will show the helpful Poetry command that allows us to do it. Note there is also a TestPyPI that you can publish to first to ensure everything runs smoothly.
The main command for this is poetry publish, but there are a few steps we need to take first.
You need to authenticate before you can publish. This can be set up by adding your user-specific PyPI token to your config:
poetry config pypi-token.pypi <token>
After you have done this you are clear to publish and can do so with:
poetry publish --build
The --build flag at the end builds the package before publishing, creating the distributable files (a .tar.gz and a .whl) inside the dist/ directory. The package must be built before it can be published.
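If you want to check what the build step produced before uploading anything, a small sketch like the one below lists the files. The helper built_distributions is hypothetical and not part of Poetry; it only looks for the two file types described above.

```python
from pathlib import Path


def built_distributions(dist_dir="dist"):
    """Return the sorted names of the sdist and wheel files in dist_dir."""
    dist = Path(dist_dir)
    if not dist.is_dir():
        # Nothing has been built yet
        return []
    return sorted(
        path.name
        for path in dist.iterdir()
        if path.name.endswith((".tar.gz", ".whl"))
    )


# Run from the top level of the package after building
print(built_distributions())
```

An empty list means the package has not been built yet, or that you are running the script from a different directory.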
Next Up
This is where I am going to leave the series for now. We have looked at all the basics you need when developing a Python package, from writing and documenting functions all the way to testing and publishing the package. In the next iteration I may look at building out this package or parallelising the function I’ve written, but it is not scheduled to be written anytime soon.
