We all develop python code - small or big, whether it is related to an Application or a package, the way in which the Python is written ultimately plays a major role in deciding it’s usibilty. Thus assuming if I were to build a solution to a machine learning problem, however I cram my entire solution into one .py file. The good things are :
The bad things are :-
I hope the following points seem to give you a hint as to where I am going. Writing good code in python is a must, be it any domain you are working in, but writing code which takes time to edit, test and scale is a big NO, because then this product is restricted to the person who develops it. Simply put it’s like having a doctor write a prescription to you compared with that which is printed.
So with that in mind, let is understand how we can make an project we develop better simply by changing it’s package structure.
The overall structure of a Python package should look something like this
project --- src
--- tests
--- dependency-list
--- docs
--- run_main.py
--- run_tests.py
--- setup.py
How does this help you may ask, well this :-
The dependency list in python usually is named requirements.txt
. And contains all dependent modules which the project requires. Unlike other languages, Python dependencies are far simpler to use and understand.
Not so surprisingly there exists a python package pipreqs
which automatically creates a project requirement file
USER:~ username$ pip install pipreqs
USER:~ username$ pipreqs /path/to/project
Here is an example of one
numpy==1.13.1
matplotlib==2.0.2
seaborn==0.8.1
The setup.py
files helps install dependencies for the following project and if it is a package, helps install the package from pip
An example of a setup.py file is shown below, taken from flockos.
import sys
from setuptools import setup, find_packages
NAME = "flockos"
VERSION = "1.2.0"
# To install the library, run the following
#
# python setup.py install
#
# prerequisite: setuptools
# http://pypi.python.org/pypi/setuptools
REQUIRES = ["requests", "six >= 1.10", "PyJWT"]
setup(
name=NAME,
version=VERSION,
description="Flock API",
author_email="",
url="",
keywords=["Flock API"],
install_requires=REQUIRES,
packages=find_packages(),
include_package_data=True,
long_description="""\
Integrate your apps with flock using this API
"""
)
Keeping a docs folder in your project allowws github to generate a documentation website for the following project. These websites are generally static, although some people do still develop them the conventional way
This include the main part of the program. This can be split into various sub-directories based pn functionality. Once again the idea is to separate code into various bins so that it is easier to understand and scale, so you could expect an src package of the following form
src --- utils
--- core-functions
--- error-handling
--- (other-modules)
--- __init__.py
the __init__.py
file here, helps define functions and objects which are to be exported. A project without this file would be able to export all of it’s objects and functions, however it can allow all functions and objects to be accessed. Thus it becomes important to define what all modules are to be exported.
And thus in summary we have learnt what makes projects robust and how simply designing the way files appear and are written in a project makes a lot of difference. Package building architecture can be extended to other languages too such as Go, JavaScript where each file may have different names, however the basic outline remains the same. Hope this article helps you in building better scalable projects/packaged which can be debugged quickly and tested efficiently
Cheers!
- Jayakrishna Sahit