Project Icon

pymc

Python贝叶斯统计建模与概率编程框架

PyMC是一个Python贝叶斯统计建模框架,专注于高级马尔可夫链蒙特卡洛和变分推断算法。它提供直观的模型语法、强大的采样算法和推断功能,可处理复杂模型。PyMC利用PyTensor优化计算,支持缺失值处理,并提供丰富的示例资源。作为一个灵活的概率编程工具,PyMC适用于广泛的统计建模任务。

.. image:: https://cdn.rawgit.com/pymc-devs/pymc/main/docs/logos/svg/PyMC_banner.svg :height: 100px :alt: PyMC logo :align: center

|Build Status| |Coverage| |NumFOCUS_badge| |Binder| |Dockerhub| |DOIzenodo| |Conda Downloads|

PyMC (formerly PyMC3) is a Python package for Bayesian statistical modeling focusing on advanced Markov chain Monte Carlo (MCMC) and variational inference (VI) algorithms. Its flexibility and extensibility make it applicable to a large suite of problems.

Check out the PyMC overview <https://docs.pymc.io/en/latest/learn/core_notebooks/pymc_overview.html>, or one of the many examples <https://www.pymc.io/projects/examples/en/latest/gallery.html>! For questions on PyMC, head on over to our PyMC Discourse <https://discourse.pymc.io/>__ forum.

Features

  • Intuitive model specification syntax, for example, x ~ N(0,1) translates to x = Normal('x',0,1)
  • Powerful sampling algorithms, such as the No U-Turn Sampler <http://www.jmlr.org/papers/v15/hoffman14a.html>__, allow complex models with thousands of parameters with little specialized knowledge of fitting algorithms.
  • Variational inference: ADVI <http://www.jmlr.org/papers/v18/16-107.html>__ for fast approximate posterior estimation as well as mini-batch ADVI for large data sets.
  • Relies on PyTensor <https://pytensor.readthedocs.io/en/latest/>__ which provides:
    • Computation optimization and dynamic C or JAX compilation
    • NumPy broadcasting and advanced indexing
    • Linear algebra operators
    • Simple extensibility
  • Transparent support for missing value imputation

Linear Regression Example

Plant growth can be influenced by multiple factors, and understanding these relationships is crucial for optimizing agricultural practices.

Imagine we conduct an experiment to predict the growth of a plant based on different environmental variables.

.. code-block:: python

import pymc as pm

Taking draws from a normal distribution

seed = 42 x_dist = pm.Normal.dist(shape=(100, 3)) x_data = pm.draw(x_dist, random_seed=seed)

Independent Variables:

Sunlight Hours: Number of hours the plant is exposed to sunlight daily.

Water Amount: Daily water amount given to the plant (in milliliters).

Soil Nitrogen Content: Percentage of nitrogen content in the soil.

Dependent Variable:

Plant Growth (y): Measured as the increase in plant height (in centimeters) over a certain period.

Define coordinate values for all dimensions of the data

coords={ "trial": range(100), "features": ["sunlight hours", "water amount", "soil nitrogen"], }

Define generative model

with pm.Model(coords=coords) as generative_model: x = pm.Data("x", x_data, dims=["trial", "features"])

  # Model parameters
  betas = pm.Normal("betas", dims="features")
  sigma = pm.HalfNormal("sigma")

  # Linear model
  mu = x @ betas

  # Likelihood
  # Assuming we measure deviation of each plant from baseline
  plant_growth = pm.Normal("plant growth", mu, sigma, dims="trial")

Generating data from model by fixing parameters

fixed_parameters = { "betas": [5, 20, 2], "sigma": 0.5, } with pm.do(generative_model, fixed_parameters) as synthetic_model: idata = pm.sample_prior_predictive(random_seed=seed) # Sample from prior predictive distribution. synthetic_y = idata.prior["plant growth"].sel(draw=0, chain=0)

Infer parameters conditioned on observed data

with pm.observe(generative_model, {"plant growth": synthetic_y}) as inference_model: idata = pm.sample(random_seed=seed)

  summary = pm.stats.summary(idata, var_names=["betas", "sigma"])
  print(summary)

From the summary, we can see that the mean of the inferred parameters are very close to the fixed parameters

===================== ====== ===== ======== ========= =========== ========= ========== ========== ======= Params mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat ===================== ====== ===== ======== ========= =========== ========= ========== ========== ======= betas[sunlight hours] 4.972 0.054 4.866 5.066 0.001 0.001 3003 1257 1 betas[water amount] 19.963 0.051 19.872 20.062 0.001 0.001 3112 1658 1 betas[soil nitrogen] 1.994 0.055 1.899 2.107 0.001 0.001 3221 1559 1 sigma 0.511 0.037 0.438 0.575 0.001 0 2945 1522 1 ===================== ====== ===== ======== ========= =========== ========= ========== ========== =======

.. code-block:: python

Simulate new data conditioned on inferred parameters

new_x_data = pm.draw( pm.Normal.dist(shape=(3, 3)), random_seed=seed, ) new_coords = coords | {"trial": [0, 1, 2]}

with inference_model: pm.set_data({"x": new_x_data}, coords=new_coords) pm.sample_posterior_predictive( idata, predictions=True, extend_inferencedata=True, random_seed=seed, )

pm.stats.summary(idata.predictions, kind="stats")

The new data conditioned on inferred parameters would look like:

================ ======== ======= ======== ========= Output mean sd hdi_3% hdi_97% ================ ======== ======= ======== ========= plant growth[0] 14.229 0.515 13.325 15.272 plant growth[1] 24.418 0.511 23.428 25.326 plant growth[2] -6.747 0.511 -7.740 -5.797 ================ ======== ======= ======== =========

.. code-block:: python

Simulate new data, under a scenario where the first beta is zero

with pm.do( inference_model, {inference_model["betas"]: inference_model["betas"] * [0, 1, 1]}, ) as plant_growth_model: new_predictions = pm.sample_posterior_predictive( idata, predictions=True, random_seed=seed, )

pm.stats.summary(new_predictions, kind="stats")

The new data, under the above scenario would look like:

================ ======== ======= ======== ========= Output mean sd hdi_3% hdi_97% ================ ======== ======= ======== ========= plant growth[0] 12.149 0.515 11.193 13.135 plant growth[1] 29.809 0.508 28.832 30.717 plant growth[2] -0.131 0.507 -1.121 0.791 ================ ======== ======= ======== =========

Getting started

If you already know about Bayesian statistics:

  • API quickstart guide <https://www.pymc.io/projects/examples/en/latest/introductory/api_quickstart.html>__
  • The PyMC tutorial <https://docs.pymc.io/en/latest/learn/core_notebooks/pymc_overview.html>__
  • PyMC examples <https://www.pymc.io/projects/examples/en/latest/gallery.html>__ and the API reference <https://docs.pymc.io/en/stable/api.html>__

Learn Bayesian statistics with a book together with PyMC

  • Bayesian Analysis with Python <http://bap.com.ar/>__ (third edition) by Osvaldo Martin: Great introductory book.
  • Probabilistic Programming and Bayesian Methods for Hackers <https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers>__: Fantastic book with many applied code examples.
  • PyMC port of the book "Doing Bayesian Data Analysis" by John Kruschke <https://github.com/cluhmann/DBDA-python>__ as well as the first edition <https://github.com/aloctavodia/Doing_bayesian_data_analysis>__.
  • PyMC port of the book "Statistical Rethinking A Bayesian Course with Examples in R and Stan" by Richard McElreath <https://github.com/pymc-devs/resources/tree/master/Rethinking>__
  • PyMC port of the book "Bayesian Cognitive Modeling" by Michael Lee and EJ Wagenmakers <https://github.com/pymc-devs/resources/tree/master/BCM>__: Focused on using Bayesian statistics in cognitive modeling.

Audio & Video

  • Here is a YouTube playlist <https://www.youtube.com/playlist?list=PL1Ma_1DBbE82OVW8Fz_6Ts1oOeyOAiovy>__ gathering several talks on PyMC.
  • You can also find all the talks given at PyMCon 2020 here <https://discourse.pymc.io/c/pymcon/2020talks/15>__.
  • The "Learning Bayesian Statistics" podcast <https://www.learnbayesstats.com/>__ helps you discover and stay up-to-date with the vast Bayesian community. Bonus: it's hosted by Alex Andorra, one of the PyMC core devs!

Installation

To install PyMC on your system, follow the instructions on the installation guide <https://www.pymc.io/projects/docs/en/latest/installation.html>__.

Citing PyMC

Please choose from the following:

  • |DOIpaper| PyMC: A Modern and Comprehensive Probabilistic Programming Framework in Python, Abril-Pla O, Andreani V, Carroll C, Dong L, Fonnesbeck CJ, Kochurov M, Kumar R, Lao J, Luhmann CC, Martin OA, Osthege M, Vieira R, Wiecki T, Zinkov R. (2023)
  • |DOIzenodo| A DOI for all versions.
  • DOIs for specific versions are shown on Zenodo and under Releases <https://github.com/pymc-devs/pymc/releases>_

.. |DOIpaper| image:: https://img.shields.io/badge/DOI-10.7717%2Fpeerj--cs.1516-blue.svg :target: https://doi.org/10.7717/peerj-cs.1516 .. |DOIzenodo| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.4603970.svg :target: https://doi.org/10.5281/zenodo.4603970

Contact

We are using discourse.pymc.io <https://discourse.pymc.io/>__ as our main communication channel.

To ask a question regarding modeling or usage of PyMC we encourage posting to our Discourse forum under the “Questions” Category <https://discourse.pymc.io/c/questions>. You can also suggest feature in the “Development” Category <https://discourse.pymc.io/c/development>.

You can also follow us on these social media platforms for updates and other announcements:

  • LinkedIn @pymc <https://www.linkedin.com/company/pymc/>__
  • YouTube @PyMCDevelopers <https://www.youtube.com/c/PyMCDevelopers>__
  • X @pymc_devs <https://x.com/pymc_devs>__
  • Mastodon @pymc@bayes.club <https://bayes.club/@pymc>__

To report an issue with PyMC please use the issue tracker <https://github.com/pymc-devs/pymc/issues>__.

Finally, if you need to get in touch for non-technical information about the project, send us an e-mail <info@pymc-devs.org>__.

License

Apache License, Version 2.0 <https://github.com/pymc-devs/pymc/blob/main/LICENSE>__

Software using PyMC

General purpose

  • Bambi <https://github.com/bambinos/bambi>__: BAyesian Model-Building Interface (BAMBI) in Python.
  • calibr8 <https://calibr8.readthedocs.io>__: A toolbox for constructing detailed observation models to be used as likelihoods in PyMC.
  • gumbi <https://github.com/JohnGoertz/Gumbi>__: A high-level interface for building GP models.
  • SunODE <https://github.com/aseyboldt/sunode>__: Fast ODE solver, much faster than the one that comes with PyMC.
  • pymc-learn <https://github.com/pymc-learn/pymc-learn>__: Custom PyMC models built on top of pymc3_models/scikit-learn API

Domain specific

  • Exoplanet <https://github.com/dfm/exoplanet>__: a toolkit for modeling of transit and/or radial velocity observations of exoplanets and other astronomical time series.
  • beat <https://github.com/hvasbath/beat>__: Bayesian Earthquake Analysis Tool.
  • CausalPy <https://github.com/pymc-labs/CausalPy>__: A package focussing on causal inference in quasi-experimental settings.

Please contact us if your software is not listed here.

Papers citing PyMC

See Google Scholar here <https://scholar.google.com/scholar?cites=6357998555684300962>__ and here <https://scholar.google.com/scholar?cites=6936955228135731011>__ for a continuously updated list.

Contributors

See the GitHub contributor page <https://github.com/pymc-devs/pymc/graphs/contributors>. Also read our Code of Conduct <https://github.com/pymc-devs/pymc/blob/main/CODE_OF_CONDUCT.md> guidelines for a better contributing experience.

Support

PyMC is a non-profit project under NumFOCUS umbrella. If you want to support PyMC financially, you can donate here <https://numfocus.salsalabs.org/donate-to-pymc3/index.html>__.

Professional Consulting Support

You can get professional consulting support from PyMC Labs <https://www.pymc-labs.io>__.

Sponsors

|NumFOCUS|

|PyMCLabs|

|Mistplay|

|ODSC|

Thanks to our contributors

|contributors|

.. |Binder| image:: https://mybinder.org/badge_logo.svg :target: https://mybinder.org/v2/gh/pymc-devs/pymc/main?filepath=%2Fdocs%2Fsource%2Fnotebooks .. |Build Status| image:: https://github.com/pymc-devs/pymc/workflows/pytest/badge.svg :target: https://github.com/pymc-devs/pymc/actions .. |Coverage| image:: https://codecov.io/gh/pymc-devs/pymc/branch/main/graph/badge.svg :target: https://codecov.io/gh/pymc-devs/pymc .. |Dockerhub| image:: https://img.shields.io/docker/automated/pymc/pymc.svg :target: https://hub.docker.com/r/pymc/pymc .. |NumFOCUS_badge| image:: https://img.shields.io/badge/powered%20by-NumFOCUS-orange.svg?style=flat&colorA=E1523D&colorB=007D8A :target: http://www.numfocus.org/ .. |NumFOCUS| image:: https://github.com/pymc-devs/brand/blob/main/sponsors/sponsor_logos/sponsor_numfocus.png?raw=true :target: http://www.numfocus.org/ .. |PyMCLabs| image:: https://github.com/pymc-devs/brand/blob/main/sponsors/sponsor_logos/sponsor_pymc_labs.png?raw=true :target: https://pymc-labs.io .. |Mistplay| image:: https://github.com/pymc-devs/brand/blob/main/sponsors/sponsor_logos/sponsor_mistplay.png?raw=true :target: https://www.mistplay.com/ .. |ODSC| image:: https://github.com/pymc-devs/brand/blob/main/sponsors/sponsor_logos/odsc/sponsor_odsc.png?raw=true :target: https://odsc.com/california/?utm_source=pymc&utm_medium=referral .. |contributors| image:: https://contrib.rocks/image?repo=pymc-devs/pymc :target: https://github.com/pymc-devs/pymc/graphs/contributors .. |Conda Downloads| image:: https://anaconda.org/conda-forge/pymc/badges/downloads.svg :target:

项目侧边栏1项目侧边栏2
推荐项目
Project Cover

豆包MarsCode

豆包 MarsCode 是一款革命性的编程助手,通过AI技术提供代码补全、单测生成、代码解释和智能问答等功能,支持100+编程语言,与主流编辑器无缝集成,显著提升开发效率和代码质量。

Project Cover

问小白

问小白是一个基于 DeepSeek R1 模型的智能对话平台,专为用户提供高效、贴心的对话体验。实时在线,支持深度思考和联网搜索。免费不限次数,帮用户写作、创作、分析和规划,各种任务随时完成!

Project Cover

白日梦AI

白日梦AI提供专注于AI视频生成的多样化功能,包括文生视频、动态画面和形象生成等,帮助用户快速上手,创造专业级内容。

Project Cover

有言AI

有言平台提供一站式AIGC视频创作解决方案,通过智能技术简化视频制作流程。无论是企业宣传还是个人分享,有言都能帮助用户快速、轻松地制作出专业级别的视频内容。

Project Cover

讯飞绘镜

讯飞绘镜是一个支持从创意到完整视频创作的智能平台,用户可以快速生成视频素材并创作独特的音乐视频和故事。平台提供多样化的主题和精选作品,帮助用户探索创意灵感。

Project Cover

讯飞文书

讯飞文书依托讯飞星火大模型,为文书写作者提供从素材筹备到稿件撰写及审稿的全程支持。通过录音智记和以稿写稿等功能,满足事务性工作的高频需求,帮助撰稿人节省精力,提高效率,优化工作与生活。

Project Cover

阿里绘蛙

绘蛙是阿里巴巴集团推出的革命性AI电商营销平台。利用尖端人工智能技术,为商家提供一键生成商品图和营销文案的服务,显著提升内容创作效率和营销效果。适用于淘宝、天猫等电商平台,让商品第一时间被种草。

Project Cover

Trae

Trae是一种自适应的集成开发环境(IDE),通过自动化和多元协作改变开发流程。利用Trae,团队能够更快速、精确地编写和部署代码,从而提高编程效率和项目交付速度。Trae具备上下文感知和代码自动完成功能,是提升开发效率的理想工具。

Project Cover

AIWritePaper论文写作

AIWritePaper论文写作是一站式AI论文写作辅助工具,简化了选题、文献检索至论文撰写的整个过程。通过简单设定,平台可快速生成高质量论文大纲和全文,配合图表、参考文献等一应俱全,同时提供开题报告和答辩PPT等增值服务,保障数据安全,有效提升写作效率和论文质量。

投诉举报邮箱: service@vectorlightyear.com
@2024 懂AI·鲁ICP备2024100362号-6·鲁公网安备37021002001498号