Your first contribution to PyMC#

This tutorial will guide you through submitting your first PR to the pymc repository. We have tried to cover every step and to be clear about the expected result at each one.

You will start by cloning the pymc repository and installing the requirements you’ll need to contribute, then make some changes to a docstring of your choice and submit a pull request.

Disclaimer

This is a tutorial as defined in diataxis. Here are some things to take into account:

  • Do I need to follow each step exactly as explained? No, but we strongly recommend you follow them. Once you have submitted your first PR and feel comfortable with the process then you can start experimenting and making it your own.

  • Do I need to submit a docstring edit as my first PR? No, but again, we strongly recommend you do. This will allow you to separate the contribution workflow from the contribution content. We will be able to help you out along the process much better if you follow these steps, as we’ll have a clear idea of where you stand at all times. We believe this will help you get comfortable with the tooling and infrastructure (git, virtual environments, GitHub PRs and CI) as quickly as possible so you can then focus on the content of your next PRs.

  • Will I understand the reasons for each step? No, the goal of this tutorial is to teach you by doing.

If you prefer video to written content, you can watch a recording of Reshama going over this guide and submitting a PR to PyMC.

Prepare your environment#

If you don’t have your system configured yet, you can follow the instructions in the Local environment setup tutorial.

Choose a docstring#

Go to the PyMC API documentation, click on the module (and submodule if needed) that interests you the most and choose a docstring to work on.

Note

This tutorial follows the process of updating pymc.Uniform and pairs general comments about updating docstrings with specific comments about applying those changes to pymc.Uniform.

The docstring is available on the Sample docstring page. I updated the docstring in PyMC while writing this guide.

Once you have chosen, go to our issue tracker, check that nobody is already working on it and comment that you are going to update it.

Important

Remember which function or class you chose and keep its tab open (or save the link for later).

Open the file with your text editor#

Open the file containing the docstring you chose to edit with your text editor. The file will be inside the pymc folder, but it probably won’t be straightforward to guess just from the name of the function or class.

Go back to the API page and click on the “[source]” button at the right of the call signature.

Now take a look at the URL. Here is what it shows for pymc.Uniform:

https://www.pymc.io/projects/docs/en/latest/_modules/pymc/distributions/continuous.html#Uniform

The file with the definition of the Uniform class and its docstring is pymc/distributions/continuous.py: take the part of the URL that comes after _modules, drop the #Uniform anchor and replace the .html extension with .py.
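The same mapping can be written out in a few lines of Python. This is only an illustrative sketch: source_path_from_url is a hypothetical helper name, not part of PyMC, and it assumes the URL has the _modules/<path>.html#<name> shape shown above.

```python
from urllib.parse import urlparse


def source_path_from_url(url: str) -> str:
    """Map a "[source]" page URL to a file path inside the repository.

    Hypothetical helper for illustration only; it is not part of PyMC.
    """
    # urlparse keeps the "#Uniform" anchor out of .path
    path = urlparse(url).path
    # Keep only what follows "/_modules/"
    _, _, module_path = path.partition("/_modules/")
    if not module_path.endswith(".html"):
        raise ValueError(f"unexpected source URL: {url}")
    # Swap the .html extension for .py
    return module_path[: -len(".html")] + ".py"


url = (
    "https://www.pymc.io/projects/docs/en/latest/"
    "_modules/pymc/distributions/continuous.html#Uniform"
)
print(source_path_from_url(url))
```

In practice you will do this by eye; the sketch only spells out which parts of the URL matter.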

Edit the docstring#

Your task is to make sure that the docstring follows the numpydoc convention. We have some extra conventions on top of that, which I have explained here, but they are only relevant for some sections; most of the time you’ll follow numpydoc directly.

Open the numpydoc style guide side by side or in a different window. I am updating the docstring of pymc.Uniform as an example.

Review the docstring section by section to make sure everything is well documented. If you have chosen a class that is not a distribution (unlike this example, where we are working on the pymc.Uniform distribution), you should also review the docstrings of all its methods (only if they already exist, though; there is no need to write missing docstrings). Otherwise, work only on the function or class docstring itself. In this example we will therefore ignore the docstrings of the logcdf, get_moment and dist methods.

Section independent comments#

  • Only the short summary section is required. The rest should be used when relevant. As a rule, if a section is missing, ignore it for now. If you think it should be added, take a note and let us know when you open the PR.

  • If you find instances of the plot directive .. plot::, make sure they are in either the extended summary or the examples section and that they use the close-figs context. It should look like:

    .. plot::
        :context: close-figs
    
        python code starts here
    

Short summary#

  • General comments: Make sure there is a (preferably single line) short description of the object. In most cases you’ll need to ignore the “not use variable names or the function name” rule.

  • pymc.Uniform class docstring:

Deprecation warning#

  • General comments: There should be no deprecation warnings in docstrings, because we use a decorator for that. If you find a docstring with one, take a note, do not modify it and let us know when you open the PR.

  • pymc.Uniform class docstring:

Extended summary#

  • General comments: This section is quite free and will probably need no modifications other than maybe directive updates or moving some code to the notes section.

  • pymc.Uniform class docstring: The plot directive is missing the close-figs context.
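As a rough sketch of how the corrected directive sits inside a docstring, consider the stand-in class below. UniformPlotSketch and the plotting code are made up for illustration and are not the actual pymc.Uniform docstring; only the directive and its :context: close-figs option come from the convention described above.

```python
class UniformPlotSketch:
    """Continuous uniform distribution (illustrative stand-in, not PyMC code).

    .. plot::
        :context: close-figs

        import matplotlib.pyplot as plt
        import numpy as np

        x = np.linspace(-1, 2, 200)
        pdf = ((x >= 0) & (x <= 1)).astype(float)
        plt.plot(x, pdf)
        plt.xlabel("x")
        plt.ylabel("f(x)")
    """
```

Note the blank line between the option and the code, and that the code is indented to the same level as the option.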

Parameters#

  • General comments: This is the section that will most likely need the most work. Points to add or emphasize in addition to the advice in numpydoc:

    • The colon between an argument name and its type must be both preceded and followed by a space.

    • Type hints should go in the call signature, not in the docstring. Optional[Union[str, int]] is not adequate for a docstring; it should be str or int, optional. Type hints target machines, docstrings target humans.

    • Optional parameters must be indicated with , optional or , default <value>. If the default value is of the documented type and used directly, using default instead of optional is preferred. However, if the default value depends on other parameters or is a placeholder (e.g. it is very common to use None for kwarg type arguments), then optional should be used, explaining the default in the description.

    • In type descriptions, we have several aliases available to keep raw docstrings short and clear while still generating a nice html page with all the correct links:

      • tensor_like: One of the most common (if not the most common) aliases. It should be used in all parameters that take an aesara tensor or any object that can be converted to it. In general, you’ll have to change tensor, aesara tensor (including combinations with different capitalization, dot or hyphen in between) to tensor_like.

      • TensorVariable: Similar to tensor_like but should only be used for parameters that won’t be converted internally and need to be Aesara tensors before being passed as arguments. Use TensorVariable, without extra quotes or backticks. Ask if you are unsure about using tensor_like or TensorVariable.

      • RandomVariable: Change var, random var, aesara var and similar concepts to RandomVariable.

      • array_like: Change array like or array-like to array_like with an underscore. If you encounter this in a returned parameter, note it in the PR description.

      • ndarray: Change np.ndarray or numpy.ndarray to ndarray. However, if you encounter this in an input argument, note it in the PR description.

      • Covariance and Mean: within the gp module only, change covariance, covariance objects, Covariance instances and the like to Covariance. Do the same for Mean.

      • InferenceData: change things like arviz.InferenceData or inference data to this.

      • MultiTrace and BaseTrace: change any type that contains one of these names to the bare alias. The most likely thing to find is pymc.backends.base.MultiTrace.

      • Point: change pymc.Point, point and similar to this.

      • SMC_kernel: within the smc module change references to kernel, smc kernel and the like to this. Note the underscore and capitalization!

      • Aesara_Op: change Aesara Op, Op and variations to Aesara_Op. Note the underscore and capitalization!

      • We might also realize we are missing an important alias thanks to your contributions. Check the conf.py file from time to time to see if there are new aliases not explained here. Aliases are defined in the numpydoc_xref_aliases dict.

  • pymc.Uniform class docstring:

    • There is no space between the argument name and the colon.

    • Both arguments are actually optional. In Distributions, this can’t be seen in the class itself but in the dist method. In the Uniform case, it is dist(cls, lower=0, upper=1, **kwargs). Therefore they are both optional, with defaults of 0 and 1 respectively.

    • Input parameters to distributions can be Aesara tensors, scalars or NumPy arrays. Therefore, tensor_like should be used. In this case, the parameters of the distribution are real numbers (as opposed to discrete for example) so we will use tensor_like of float. For discrete parameters we would use tensor_like of int.
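Putting these points together, a Parameters section that follows the conventions above could look roughly like the sketch below. UniformSketch and label_axis are hypothetical names made up for illustration, and the descriptions are placeholders rather than the real pymc.Uniform text; only the dist(cls, lower=0, upper=1, **kwargs) signature is taken from the discussion above.

```python
from typing import Optional, Union


class UniformSketch:
    """Continuous uniform distribution (illustrative stand-in, not PyMC code).

    Parameters
    ----------
    lower : tensor_like of float, default 0
        Lower limit.
    upper : tensor_like of float, default 1
        Upper limit.
    """

    @classmethod
    def dist(cls, lower=0, upper=1, **kwargs):
        # The defaults live here, which is why both parameters are optional.
        return {"lower": lower, "upper": upper, **kwargs}


def label_axis(label: Optional[Union[str, int]] = None) -> str:
    """Return an axis label as text.

    Parameters
    ----------
    label : str or int, optional
        Label to show. If not given, the placeholder ``"unnamed"`` is used.
    """
    # The type hint stays in the signature; the docstring uses plain words.
    return "unnamed" if label is None else str(label)
```

The first docstring uses default because 0 and 1 are used directly, while the second uses optional because None is only a placeholder whose meaning is explained in the description.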

Returns and yields#

  • General comments: Nothing to add to numpydoc. They follow the style of the parameters section but with the argument name being optional for single outputs. You should look for the same things detailed in the parameters section and also make sure that the type (plus name if any) and the description are on different lines.

  • pymc.Uniform class docstring:
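As a generic illustration of the layout described in the general comments (this is not the pymc.Uniform docstring), a Returns section for a single unnamed output might look like the sketch below. interval_width is a hypothetical function made up for this example.

```python
def interval_width(lower=0.0, upper=1.0):
    """Compute the width of an interval.

    Parameters
    ----------
    lower : float, default 0.0
        Lower limit.
    upper : float, default 1.0
        Upper limit.

    Returns
    -------
    float
        Width of the interval, ``upper - lower``.
    """
    return upper - lower
```

The type sits alone on its own line and the description follows on the next line, indented.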

Commit the changes to git and get your PR ready#

Great! You are ready to open your PR now.

You can follow the PR Tutorial, which explains how to open a PR to PyMC.

Your PR will be reviewed and hopefully merged by the PyMC team. After that, you can properly celebrate your first contribution to PyMC! Thanks for contributing!