DeepImage-an-Image-to-Image-technology

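A powerful and diverse collection of image generation and translation techniques

DeepImage is a comprehensive image generation and translation project that includes several advanced algorithms such as pix2pixHD, pix2pix and CycleGAN. The project provides image generation demos, theoretical research material and practical guides, covering generative adversarial network (GAN) techniques from the basics to the state of the art. DeepImage offers researchers and developers a complete platform for learning and experimentation, helping them explore the possibilities of image generation and translation.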

Chinese Version | English Version

This repository contains the pix2pixHD algorithm (proposed by NVIDIA) and, more importantly, the general image generation theory and practical research behind it.

It also includes TensorFlow 2 (PyTorch | PaddlePaddle) implementations of image generation models such as pix2pix, CycleGAN, UGATIT, DCGAN, SinGAN, VAE, ALAE, mGANprior and StarGAN-v2, which can be used to study Generative Adversarial Networks (GANs) systematically.


Content of this resource

  1. What is DeepImage?
  2. Fake Image Generation and Image-to-Image Demo
  3. DeepImage Algorithm: Normal to Pornography Image
  4. NSFW: Pornography to Normal Image, Pornographic Image Detection
  5. GAN Image Generation Theoretical Research
  6. GAN Image Generation Practice Research
  7. DeepImage to DeepFakes
  8. Future

What is DeepImage?

According to DeepImage_official, DeepImage uses a slightly modified version of the pix2pixHD GAN architecture. pix2pixHD is a general-purpose Image2Image technology proposed by NVIDIA. DeepImage itself is clearly a misapplication of artificial intelligence technology, but the Image2Image technology it relies on is useful to researchers and developers working in other fields such as fashion, film and visual effects.


Fake Image Generation Demo

This section provides a fake image generation demo that you can use as you wish. The images are generated by StyleGAN and are free of copyright issues. Note: each time you refresh the page a new fake image is generated, so remember to save any image you want to keep.

Image-to-Image Demo

This section provides an Image-to-Image demo that turns black-and-white stick figures into colorful faces, cats, shoes and handbags. DeepImage software mainly uses Image-to-Image technology, which in theory can convert an input image into any image you want. You can try Image-to-Image technology in your browser by clicking the demo links below.

Try Image-to-faces Demo

Try Image-to-Image Demo

An example of using this demo is as follows:

In the box on the left, draw a cat as you imagine it, then click the process button to output a model-generated cat.


:underage: DeepImage Algorithm

DeepImage is pornographic software and is off-limits to minors. If you are not interested in DeepImage itself, you can skip this section and go to the general Image-to-Image theory and practice in the following chapters.

DeepImage_software_itself content:

  1. Official DeepImage Algorithm (based on PyTorch)
  2. DeepImage software usage process and evaluation of advantages and disadvantages.

:+1: NSFW

Recognition and conversion of five types of images [porn, hentai, sexy, natural, drawings]. Correct application of image-to-image technology.

NSFW(Not Safe/Suitable For Work) is a large-scale image dataset containing five categories of images [porn, hentai, sexy, natural, drawings]. Here, CycleGAN is used to convert different types of images, such as porn->natural.

  1. Click to try pornographic image detection Demo
  2. Click Start NSFW Research

Image Generation Theoretical Research

This section covers AI/deep learning theory research (especially computer vision) related to DeepImage. If you enjoy reading the latest papers, this section is for you.

  1. Click here for a systematic introduction to GANs
  2. Click here for a systematic list of image-to-image papers

1. Pix2Pix

Result

Image-to-Image Translation with Conditional Adversarial Networks, proposed by researchers at the University of California, Berkeley, uses conditional adversarial networks as a general-purpose solution to image-to-image translation problems.
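As a rough illustration of how a conditional adversarial objective can be combined with a pixel-wise term, the sketch below computes a generator loss as "fool the discriminator" plus a weighted L1 distance to the target image. The function names are hypothetical, images are simplified to flat lists of floats, and the default weight of 100 follows the value reported in the pix2pix paper; it is not taken from this repository's code.

```python
import math

def bce(prob, target):
    """Binary cross-entropy for one probability in [0, 1] and a 0/1 target."""
    eps = 1e-12  # avoids log(0)
    return -(target * math.log(prob + eps) + (1 - target) * math.log(1 - prob + eps))

def l1_distance(generated, target):
    """Mean absolute difference between two equally sized flat pixel lists."""
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(target)

def generator_loss(d_prob_fake, generated, target, lambda_l1=100.0):
    """Adversarial term (discriminator should say 'real') plus weighted L1 term."""
    adversarial = bce(d_prob_fake, 1.0)
    return adversarial + lambda_l1 * l1_distance(generated, target)

def discriminator_loss(d_prob_real, d_prob_fake):
    """Discriminator should output 1 for real pairs and 0 for generated pairs."""
    return bce(d_prob_real, 1.0) + bce(d_prob_fake, 0.0)
```

The L1 term keeps the output close to the ground truth, while the adversarial term pushes it towards realistic detail.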

2. Pix2PixHD

DeepImage mainly uses this Image-to-Image(Pix2PixHD) technology.

Result

Pix2PixHD generates high-resolution images from a semantic label map. The semantic label map is a color image in which different color blocks represent different kinds of objects, such as pedestrians, cars, traffic signs and buildings. Pix2PixHD takes a semantic label map as input and produces a high-resolution, realistic image. Most earlier techniques could only produce rough, low-resolution images that did not look real. This research produces images at a resolution of 2048 × 1024 (2k by 1k), which is close to full HD photos.
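To make the idea of a color-coded semantic map concrete, here is a minimal sketch that converts such a map into class IDs and then into a one-hot encoding, a common input form for label-conditioned generators. The palette and function names are hypothetical and for illustration only; each real dataset defines its own colors and classes.

```python
# Hypothetical palette: RGB color -> class id (illustrative values only).
PALETTE = {
    (128, 64, 128): 0,  # road
    (220, 20, 60): 1,   # pedestrian
    (0, 0, 142): 2,     # car
    (70, 70, 70): 3,    # building
}

def color_map_to_labels(image, palette=PALETTE, unknown=-1):
    """image: rows of (R, G, B) pixels. Returns rows of class ids."""
    return [[palette.get(tuple(px), unknown) for px in row] for row in image]

def one_hot(labels, num_classes):
    """Turns each class id into a 0/1 vector of length num_classes."""
    return [[[1 if lab == c else 0 for c in range(num_classes)] for lab in row]
            for row in labels]

semantic_map = [
    [(128, 64, 128), (0, 0, 142)],
    [(70, 70, 70), (220, 20, 60)],
]
labels = color_map_to_labels(semantic_map)
encoded = one_hot(labels, num_classes=len(PALETTE))
```

Pixels whose color is not in the palette are mapped to the `unknown` id and produce an all-zero vector.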

3. CycleGAN

Result

CycleGAN uses a cycle consistency loss to enable training without the need for paired data. In other words, it can translate from one domain to another without a one-to-one mapping between the source and target domain. This opens up the possibility to do a lot of interesting tasks like photo-enhancement, image colorization, style transfer, etc. All you need is the source and the target dataset.
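The cycle consistency idea can be sketched in a few lines: translating an image to the other domain and back should return the original. In the sketch below, `g` and `f` stand in for the two generators (X to Y and Y to X), images are flat lists of floats, and the default weight of 10 follows the value used in the CycleGAN paper; all names are illustrative assumptions rather than this repository's API.

```python
def l1(a, b):
    """Mean absolute difference between two equally sized flat lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, y, g, f, lam=10.0):
    """g maps X -> Y, f maps Y -> X.

    Penalizes f(g(x)) differing from x and g(f(y)) differing from y.
    """
    forward = l1(f(g(x)), x)   # x -> Y -> back to X
    backward = l1(g(f(y)), y)  # y -> X -> back to Y
    return lam * (forward + backward)

# Toy "generators": doubling and halving are exact inverses.
double = lambda img: [2 * v for v in img]
halve = lambda img: [v / 2 for v in img]
perfect = cycle_consistency_loss([1.0, 2.0], [2.0, 4.0], double, halve)
```

Because the loss only compares each image with its own reconstruction, no paired samples from the two domains are needed.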

4. UGATIT

Result

UGATIT is a novel method for unsupervised image-to-image translation that incorporates a new attention module and a new learnable normalization function in an end-to-end manner. UGATIT handles both translations that require holistic changes and translations that require large shape changes. It can be seen as an enhanced version of CycleGAN and a more efficient general image translation framework.
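The learnable normalization function in the UGATIT paper is AdaLIN, which blends instance normalization and layer normalization with a learned ratio. The sketch below is a simplified illustration under stated assumptions: the feature map is a list of channels (each a flat list of floats), and `gamma` and `beta` are plain scalars, whereas in the paper they are predicted per channel from attention features.

```python
import math

def _mean_var(values):
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)

def _normalize(values, mean, var, eps=1e-5):
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def adalin(channels, gamma, beta, rho):
    """Blend instance norm (per channel) and layer norm (all channels).

    rho = 1 gives pure instance norm, rho = 0 gives pure layer norm.
    """
    rho = min(1.0, max(0.0, rho))  # keep the blend ratio in [0, 1]
    layer_mean, layer_var = _mean_var([v for ch in channels for v in ch])
    out = []
    for ch in channels:
        inst_mean, inst_var = _mean_var(ch)
        inst = _normalize(ch, inst_mean, inst_var)
        layer = _normalize(ch, layer_mean, layer_var)
        out.append([gamma * (rho * a + (1 - rho) * b) + beta
                    for a, b in zip(inst, layer)])
    return out
```

During training the network learns `rho`, so it can choose per layer how much to rely on each kind of normalization.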

5. StyleGAN

Result

Source A + Source B (Style) = ?

StyleGAN can not only generate the fake images source A and source B, but also combine the content of source A and source B at different style levels, as shown in the following table.

| Style level (from source B) | Source A | Source B |
| --- | --- | --- |
| Coarse level | All colors (eyes, hair, lighting) and fine facial details come from source A | High-level attributes are inherited from source B, such as pose, general hair style, face shape and glasses |
| Medium level | Pose, general face shape and glasses come from source A | Mid-level facial features are inherited from source B, such as hair style and open/closed eyes |
| Fine level | The main facial content comes from source A | Fine-level features are inherited from source B, such as color scheme and microstructure |
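The mixing described above can be sketched as choosing, for each synthesis layer, whether its style vector comes from source A or source B. The layer grouping below assumes the 18-layer, 1024 × 1024 generator described in the StyleGAN paper (two layers per resolution from 4 × 4 to 1024 × 1024); the function name and the use of plain Python values as style vectors are illustrative assumptions.

```python
# Layer indices per style level, assuming an 18-layer 1024x1024 generator.
COARSE = range(0, 4)   # resolutions 4x4 to 8x8
MIDDLE = range(4, 8)   # resolutions 16x16 to 32x32
FINE = range(8, 18)    # resolutions 64x64 to 1024x1024

def mix_styles(w_a, w_b, layers_from_b, num_layers=18):
    """Returns one style per layer: w_b on the chosen layers, w_a elsewhere."""
    from_b = set(layers_from_b)
    return [w_b if i in from_b else w_a for i in range(num_layers)]

# Take pose and face shape from B, everything else from A.
coarse_mix = mix_styles("A", "B", COARSE)
```

Passing `MIDDLE` or `FINE` instead of `COARSE` corresponds to the other two rows of the table.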

StyleGAN2

StyleGAN2 removes the image artifacts produced by StyleGAN and yields higher-quality images with better details, without increasing StyleGAN's computational cost. It sets a new state of the art (SOTA) for unconditional image modeling.

6. Image Inpainting

Result

In the interface shown in the Image_Inpainting(NVIDIA_2018).mp4 video, you only need to brush over the unwanted content in an image. Even if the brushed region has a very irregular shape, NVIDIA's model can "restore" the image by filling the blank region with very realistic content. It can be described as one-click photo retouching that leaves no editing traces. The work comes from Guilin Liu et al. at NVIDIA, who published a deep learning method that can edit images or reconstruct corrupted images, even when the images have holes or missing pixels. It was the state-of-the-art approach as of 2018.

7. SinGAN

ICCV2019 Best paper - Marr prize

Result

We introduce SinGAN, an unconditional generative model that can be learned from a single natural image. Our model is trained to capture the internal distribution of patches within the image, and is then able to generate high quality, diverse samples that carry the same visual content as the image. SinGAN contains a pyramid of fully convolutional GANs, each responsible for learning the patch distribution at a different scale of the image. This allows generating new samples of arbitrary size and aspect ratio, that have significant variability, yet maintain both the global structure and the fine textures of the training image. In contrast to previous single image GAN schemes, our approach is not limited to texture images, and is not conditional (i.e. it generates samples from noise). User studies confirm that the generated samples are commonly confused to be real images. We illustrate the utility of SinGAN in a wide range of image manipulation tasks.
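To give a feel for the multi-scale pyramid, the sketch below lists the image sizes at which each GAN in such a pyramid would operate, from coarsest to finest. The scale factor and minimum size are illustrative assumptions and the function name is hypothetical; this is not the schedule from the SinGAN code.

```python
def pyramid_sizes(height, width, scale_factor=0.75, min_size=25):
    """Returns (height, width) per scale, coarsest first, full size last."""
    if not 0 < scale_factor < 1:
        raise ValueError("scale_factor must be between 0 and 1")
    sizes = [(height, width)]
    while True:
        h, w = sizes[-1]
        next_h, next_w = int(h * scale_factor), int(w * scale_factor)
        if min(next_h, next_w) < min_size:
            break  # stop before the image gets too small to learn patches from
        sizes.append((next_h, next_w))
    return sizes[::-1]

scales = pyramid_sizes(100, 100, scale_factor=0.5, min_size=25)
```

Generation would start at the first (coarsest) size and refine upwards, one GAN per entry.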

8. ALAE

Result

Although autoencoder networks have been studied extensively, the questions of whether they have the same generative power as GANs, or learn disentangled representations, have not been fully addressed. We introduce an autoencoder that tackles these issues jointly, which we call Adversarial Latent Autoencoder (ALAE). It is a general architecture that can leverage recent improvements on GAN training procedures.

9. mGANprior

Result

Despite the success of Generative Adversarial Networks (GANs) in image synthesis, applying trained GAN models to real image processing remains challenging. Previous methods typically invert a target image back to the latent space either by back-propagation or by learning an additional encoder. However, the reconstructions from both of the methods are far from ideal. In this work, we propose a novel approach, called mGANprior, to incorporate the well-trained GANs as effective prior to a variety of image processing tasks.

10. StarGAN v2

Result

A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address either of the issues, having limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines.

11. DeepFaceDrawing

Result

Recent deep image-to-image translation techniques allow fast generation of face images from freehand sketches. However, existing solutions tend to overfit to sketches, thus requiring professional sketches or even edge maps as input. To address this issue, our key idea is to implicitly model the shape space of plausible face images and synthesize a face image in this space to approximate an input sketch.


Image Generation Practice Research

These models are implemented with TensorFlow 2.

This section covers AI/deep learning code practice (especially computer vision) related to DeepImage. If you like to experiment, this section is for you.

1. Pix2Pix

Use the Pix2Pix model (Conditional Adversarial Networks) to turn black-and-white stick figures into color images, flat building facade layouts into realistic buildings, and aerial photos into maps.

Click Start Experience 1

2. Pix2PixHD

Under development. In the meantime, you can use the official implementation.

3. CycleGAN

The CycleGAN neural network model is used to implement four functions: photo style transfer, photo enhancement, landscape season change, and object transfiguration.

Click Start Experience 3

4. DCGAN

DCGAN is used to generate images from random noise, for tasks such as face generation.

Click Start Experience 4
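A DCGAN-style generator typically grows a small feature map into an image through a stack of transposed convolutions. The sketch below only does the size arithmetic, using the standard output-size formula for a transposed convolution without dilation. The kernel size, stride, padding and starting resolution are common choices assumed for illustration, not values read from this repository.

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1, output_padding=0):
    """Spatial output size of a transposed convolution (dilation = 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding

def generator_resolutions(start=4, num_upsamples=4):
    """Feature-map sizes as the generator upsamples the projected noise."""
    sizes = [start]
    for _ in range(num_upsamples):
        sizes.append(conv_transpose_out(sizes[-1]))
    return sizes

resolutions = generator_resolutions()
```

With these settings each layer doubles the resolution, so four layers take a 4 × 4 map to a 64 × 64 image.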

5. Variational Autoencoder (VAE)

VAE is used to generate images from random noise, for tasks such as face generation.

Click Start Experience 5
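Two pieces are central to training a VAE: the reparameterization trick, which samples a latent vector in a way that keeps the encoder differentiable, and the KL divergence term that pulls the encoder's distribution towards a standard normal prior. The sketch below shows both for a diagonal Gaussian, using plain Python lists; the function names are illustrative and not taken from this repository.

```python
import math
import random

def reparameterize(mu, log_var, rng=random):
    """Samples z = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_divergence(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ) summed over latent dimensions."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

z = reparameterize([0.0, 0.0], [0.0, 0.0])  # a draw from the prior itself
```

At generation time the encoder is not needed: a vector drawn from the standard normal prior is passed straight to the decoder.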

6. Neural style transfer

Use VGG19 to perform image style transfer.
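A common way to use a pretrained network such as VGG19 for style transfer is the Gram-matrix formulation of Gatys et al., in which style is captured by correlations between feature channels. Whether this repository follows exactly that formulation is an assumption. The sketch below takes feature maps as lists of channels (each a flat list of activations) and uses hypothetical function names.

```python
def gram_matrix(features):
    """features: C channels, each a flat list of N activations.

    Entry (i, j) is the average product of channel i and channel j.
    """
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]

def style_loss(feat_generated, feat_style):
    """Mean squared difference between the two Gram matrices."""
    g_gen, g_style = gram_matrix(feat_generated), gram_matrix(feat_style)
    c = len(g_gen)
    return sum((g_gen[i][j] - g_style[i][j]) ** 2
               for i in range(c) for j in range(c)) / (c * c)

example_gram = gram_matrix([[1.0, 0.0], [0.0, 1.0]])
```

Because the Gram matrix discards spatial positions, matching it transfers textures and colors without copying the layout of the style image.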
