
numalogic

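An open-source framework for time-series analysis and anomaly detection

numalogic is an open-source machine learning framework focused on operational data analytics and AIOps. It bundles a range of ML models and algorithms and provides predictive data analytics, model selection, data processing, and feature extraction. Typical uses include deployment failure detection, system failure identification, and fraud detection. It supports real-time training, updating models automatically as new data arrives, which makes it suitable for building always-on ML platforms. numalogic is designed to be simple, easy to use, and easy to extend, offering a flexible approach to data analytics.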


Background

Numalogic is a collection of ML models and algorithms for operational data analytics and AIOps. At Intuit, we use Numalogic at scale for continuous real-time data enrichment, including anomaly scoring. We assign an anomaly score (ML inference) to every time-series datum, event, or message we receive on our streaming platform (say, Kafka). 95% of our data sets are time series, and we have a complex flowchart for executing ML inference on our high-throughput sources.

We run multiple models on the same datum: for example, one model that is sensitive to positive sentiments, another tuned towards negative sentiments, and another optimized for neutral sentiments. We also keep a couple of ML models trained for the same data source in our model store, so that scores can be more accurate for the data density at hand. An ensemble of models is required because some composite keys in the data tend to be less dense than others; for example, a forgot-password interaction is less frequent than a status-check interaction. At runtime, for each datum that arrives, models are picked by a conditional forwarding filter set on the data density.

ML engineers only need to worry about their inference container; they do not have to worry about data movement or quality assurance.
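The density-based model selection described above can be illustrated with a minimal sketch. This is not numalogic's API or Intuit's implementation: the `Datum` type, the two scorer functions, and the `DENSITY_THRESHOLD` cutoff are all hypothetical names and values chosen for illustration, and the "models" are simple stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Everything here is hypothetical and for illustration only;
# none of these names come from the numalogic package.


@dataclass
class Datum:
    composite_key: str  # e.g. "status-check" or "forgot-password"
    value: float


Scorer = Callable[[float], float]


def dense_scorer(value: float) -> float:
    """Stand-in for a model trained on a high-volume composite key."""
    return abs(value - 100.0) / 100.0


def sparse_scorer(value: float) -> float:
    """Stand-in for a model trained on a low-volume composite key."""
    return abs(value - 5.0) / 5.0


# Assumed cutoff: number of observations seen for a key.
DENSITY_THRESHOLD = 1000


def select_scorers(key: str, density: Dict[str, int]) -> List[Scorer]:
    """Conditional forwarding filter: pick models by the key's data density."""
    count = density.get(key, 0)
    if count >= DENSITY_THRESHOLD:
        return [dense_scorer]
    return [sparse_scorer]


def score(datum: Datum, density: Dict[str, int]) -> float:
    """Run every selected model on the datum and keep the highest score."""
    scorers = select_scorers(datum.composite_key, density)
    return max(scorer(datum.value) for scorer in scorers)


if __name__ == "__main__":
    density = {"status-check": 50_000, "forgot-password": 120}
    print(score(Datum("status-check", 150.0), density))
    print(score(Datum("forgot-password", 10.0), density))
```

Taking the maximum over the selected scorers is only one possible way to combine an ensemble; averaging or voting would fit the same structure.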

Numalogic realtime training

For an always-on ML platform, the key requirement is the ability to train or retrain models automatically based on incoming messages. A composite key is built at runtime for each message and used to look up a matching model; if that model is stale or missing, retraining is triggered automatically. The platform's conditional forwarding feature improves development velocity for ML developers, who can decide whether to forward a result further or drop it after a trigger request.
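The lookup-and-retrigger flow can be sketched roughly as follows. This is an assumption-laden illustration rather than the platform's actual code: `ModelStore`, `handle_message`, the message fields (`namespace`, `metric`, `value`), and the age-based staleness rule are all hypothetical. The sketch also assumes that a stale model still produces a forwarded score while retraining is pending, and that a message with no model at all is dropped.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical names throughout; this is not the numalogic registry API.

Key = Tuple[str, str]
Model = Callable[[float], float]


@dataclass
class ModelEntry:
    model: Model
    trained_at: float  # epoch seconds


class ModelStore:
    """In-memory stand-in for a model registry with an age-based staleness rule."""

    def __init__(self, max_age_seconds: float) -> None:
        self.max_age_seconds = max_age_seconds
        self._entries: Dict[Key, ModelEntry] = {}

    def put(self, key: Key, model: Model, trained_at: Optional[float] = None) -> None:
        stamp = time.time() if trained_at is None else trained_at
        self._entries[key] = ModelEntry(model, stamp)

    def lookup(self, key: Key, now: Optional[float] = None) -> Tuple[Optional[Model], bool]:
        """Return (model, needs_training); the model is None when it is missing."""
        current = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None, True
        stale = current - entry.trained_at > self.max_age_seconds
        return entry.model, stale


def handle_message(
    message: dict,
    store: ModelStore,
    retrain_queue: List[Key],
    now: Optional[float] = None,
) -> Optional[dict]:
    """Score one message, requesting retraining when the model is stale or missing."""
    key = (message["namespace"], message["metric"])  # composite key built per message
    model, needs_training = store.lookup(key, now)
    if needs_training:
        retrain_queue.append(key)  # trigger request
    if model is None:
        return None  # nothing to score with: drop instead of forwarding
    return {"key": key, "score": model(message["value"])}
```

Passing `now` explicitly keeps the staleness check deterministic in tests; a real deployment would more likely rely on the registry's own metadata.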

Key Features

  1. Ease of use: simple and efficient tools for predictive data analytics
  2. Reusability: all the functionalities can be re-used in various contexts
  3. Model selection: easy to compare, validate, fine-tune and choose the model that works best with each data set
  4. Data processing: readily available feature extraction, scaling, transforming and normalization tools
  5. Extensibility: add your own functions or extend the existing capabilities
  6. Model storage: out-of-the-box support for MLflow, with support for other ML model lifecycle management tools

Use Cases

  1. Deployment failure detection
  2. System failure detection for node failures or crashes
  3. Fraud detection
  4. Network intrusion detection
  5. Forecasting on time series data

Getting Started

For set-up information and running your first pipeline using numalogic, please see our getting started guide.

Installation

Numalogic requires Python 3.8 or higher.

Prerequisites

Numalogic needs PyTorch and PyTorch Lightning to work. Because these packages are platform dependent, they are not included in the numalogic package itself and must be installed first.

Numalogic supports PyTorch versions 2.0.0 and above.
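As a quick sanity check before installing, the version requirement can be verified with a small script. This is a sketch using only the standard library; the helper names (`parse_version`, `meets_minimum`, `installed_torch_version`) are made up for illustration, and the comparison only looks at the leading numeric part of the version string, so local suffixes such as `+cu118` and pre-release tags are ignored.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '2.1.0+cu118' -> (2, 1, 0)."""
    match = re.match(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    parts = tuple(int(part) for part in match.group(1).split("."))
    # Pad to three components so that '2.1' compares like '2.1.0'.
    return parts + (0,) * max(0, 3 - len(parts))


def meets_minimum(installed: str, minimum: str = "2.0.0") -> bool:
    """True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_torch_version() -> Optional[str]:
    """Return the installed torch version, or None if torch is not installed."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_torch_version()
    if version is None:
        print("PyTorch is not installed")
    elif meets_minimum(version):
        print(f"PyTorch {version} satisfies the 2.0.0 minimum")
    else:
        print(f"PyTorch {version} is older than 2.0.0")
```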

numalogic can be installed using pip.

pip install numalogic

If using mlflow for model registry, install using:

pip install numalogic[mlflow]

Build locally

  1. Install Poetry:
    curl -sSL https://install.python-poetry.org | python3 -
    
  2. To activate virtual env:
    poetry shell
    
  3. To install dependencies:
    poetry install --with dev,torch
    
    If extra dependencies are needed:
    poetry install --all-extras
    
  4. To run unit tests:
    make test
    
  5. To format code style using black and ruff:
    make lint
    
  6. To set up pre-commit hooks:
    pre-commit install
    

Contributing

We welcome contributions to the numalogic project, including but not limited to the following areas:

  • Adding new time series anomaly detection models
  • Making it easier to add users' custom models
  • Support for additional model registry frameworks

For contribution guidelines, please refer here.

Resources
