Text2Poster-ICASSP-22

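An AI-based intelligent poster layout generation system

Text2Poster-ICASSP-22 is an AI-based poster generation system that combines image retrieval with text layout. Given input text, it automatically selects a suitable background image and lays out the text on it, and it supports multilingual input. The system provides an API for quick integration and use. Text2Poster suits scenarios that need high-quality posters generated quickly, such as event publicity and marketing promotion.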

The inference code of the ICASSP-2022 paper "Text2Poster: Laying Out Stylized Texts on Retrieved Images".

Paper Link: https://arxiv.org/abs/2301.02363

Star History

If you like this project, please give it a star! It would be a great encouragement for me and help me to continue improving it.

Quick Start from API

Run the following code to get started quickly:

import json
import time

import requests

# Timestamp used to give each saved poster a unique file name.
timestamp = time.strftime("%Y%m%d%H%M%S", time.localtime())

input_text_elements = {
    "sentences": [
        ["CHILDREN'S DAY", 90],  # [text, font_size]
        ["Children are The Future of Nation", 50],  # [text, font_size]
    ],
    # Sentence used to retrieve background images.
    "background_query": "Children's Day!",
}

# The text elements are sent as a JSON string in the query parameters.
api_url = "http://bl.mmd.ac.cn:8889/text2poster"
response = requests.get(
    api_url,
    params={"input_text_elements": json.dumps(input_text_elements)},
)
if response.status_code == 200:
    output_path = "poster-{}.jpg".format(timestamp)
    # Write the returned bytes to disk as the poster image.
    with open(output_path, "wb") as f:
        f.write(response.content)
    print("Save poster to:", output_path)
else:
    print(response.text)
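
The request above sends a small structure: a list of [text, font_size] pairs plus a background query. The following is a minimal sketch of a helper that checks and serializes that structure before it is sent. The name build_payload and all of its validation rules are illustrative assumptions; they are not part of the Text2Poster API.

```python
import json


def build_payload(sentences, background_query):
    """Validate and serialize Text2Poster input text elements.

    Hypothetical helper: the checks below are assumptions about what a
    sensible request looks like, not rules documented by the API.
    """
    if not isinstance(background_query, str) or not background_query.strip():
        raise ValueError("background_query must be a non-empty string")
    cleaned = []
    for text, font_size in sentences:  # each entry is [text, font_size]
        if not isinstance(text, str) or not text.strip():
            raise ValueError("each sentence needs non-empty text")
        if not isinstance(font_size, int) or font_size <= 0:
            raise ValueError("font_size must be a positive integer")
        cleaned.append([text, font_size])
    if not cleaned:
        raise ValueError("at least one sentence is required")
    # ensure_ascii=False keeps Chinese or Japanese text readable in the JSON.
    return json.dumps(
        {"sentences": cleaned, "background_query": background_query},
        ensure_ascii=False,
    )


payload = build_payload(
    [["CHILDREN'S DAY", 90], ["Children are The Future of Nation", 50]],
    "Children's Day!",
)
print(payload)
```

The returned string can be passed to requests.get as params={"input_text_elements": payload}, in the same way as the quick-start code.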

News

  • [2023.1.24] Update "http://1.13.255.9" to "http://bl.mmd.ac.cn".
  • [2023.1.17] We provide an API for Text2Poster, so you can get started quickly without downloading any resources.
  • [2023.1.16] We added a machine translation API that translates all input texts into Chinese so that the BriVL model can handle any language. You can now retrieve background images with queries in any language.
  • [2023.1.15] We provide an Unsplash image download script in ./background_retriever/unsplash_image_downloader.py. Use it to download the background image files from the retrieved image URLs.
  • [2023.1.14] We provide our image retrieval source code and data in ./background_retriever for the convenience of users outside mainland China. You can retrieve background images locally, which requires about 3GB of GPU memory.
  • [2023.01.10] We moved the background image retrieval website to http://1.13.255.9:8889. The original website buling.wudaoai.cn was retired on 2023.01.09.

Generated Posters:

[Figure: two generated posters]

More Examples

  • input text elements 1

55, 40, and 30 are the font sizes.

{
    "sentences": [
        ["冬日初雪舞会", 55],
        ["雪花飞舞,像音乐和歌声围绕", 40],
        ["与朋友相聚,享受欢乐时光,我们不见不散", 30]
    ],
    "background_query": "冬日初雪舞会"
}
  • output posters

[Figure: two generated posters]

  • input text elements 2

80 and 55 are the font sizes.

{
    "sentences": [
        ["ICASSP 2022", 80],
        ["May 22 - 27, 2022, Singapore", 55]
    ],
    "background_query": "Singapore"
}

  • output posters

[Figure: two generated posters]

  • input text elements 3

90 and 50 are the font sizes.

{
    "sentences": [
        ["桜が咲く", 90],
        ["出会いは素晴らしい春に", 50]
    ],
    "background_query": "春の美しい桜"
}

  • output posters

[Figure: two generated posters]

Start from Source Code

Install

We recommend using Anaconda to run Text2Poster. Run the following command to install the dependent libraries:

bash install_package.sh

You can also install the dependent libraries manually:

# using the tsinghua mirror to speed up the install.
conda install pytorch=1.10.0 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
conda install torchvision=0.11.0 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
pip install opencv_contrib_python
pip install transformers==3.2.0
pip install argparse
pip install freetype-py
pip install requests
pip install jsonlines
pip install tqdm
pip install pyyaml
pip install easydict
pip install timm

Download

We provide the following resources to start Text2Poster:

  • Weights of layout refine model: ./checkpoint/0.20484_Cascading_128_uniform_big.pth;
  • Weights of layout distribution prediction model: ./checkpoint/27.80619_distribCNN_BigPosition_epoch_76_scale_20.pth;

[Not required] Although we provide an API for background image retrieval, if you want to retrieve background images from the source code, you need to download the following resources:

  • Weights of text encoder of BriVL: brivl-textencoder-weights.pth -> ./background_retriever/weights/;
  • Unsplash images features (extracted by BriVL): wenlan_unsplash_feats.npy -> ./background_retriever/background_feats/;
  • URL of background images: ./background_retriever/background_feats/unsplash_image_url.jsonl.
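
Before running anything, it can help to confirm that the downloaded files sit where the lists above say they should. The sketch below checks those paths relative to the repository root. The helper name missing_files and the split into REQUIRED and OPTIONAL are assumptions made for illustration; the paths themselves are taken from the lists above.

```python
from pathlib import Path

# Paths taken from the lists above. The background_retriever files are only
# needed when retrieving background images from the source code.
REQUIRED = [
    "checkpoint/0.20484_Cascading_128_uniform_big.pth",
    "checkpoint/27.80619_distribCNN_BigPosition_epoch_76_scale_20.pth",
]
OPTIONAL = [
    "background_retriever/weights/brivl-textencoder-weights.pth",
    "background_retriever/background_feats/wenlan_unsplash_feats.npy",
    "background_retriever/background_feats/unsplash_image_url.jsonl",
]


def missing_files(root, relative_paths):
    """Return the relative paths that do not exist as files under root."""
    root = Path(root)
    return [p for p in relative_paths if not (root / p).is_file()]


# Report anything missing, assuming the script runs from the repository root.
for label, paths in (("required", REQUIRED), ("optional", OPTIONAL)):
    for path in missing_files(".", paths):
        print("missing {} file: {}".format(label, path))
```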

Running

We provide two examples. Run the following command to run Text2Poster:

bash run.sh

Some parameters:

  • input_text_file: The input text elements. It contains: 1) the sentences (phrases) and their font sizes, and 2) the query used to retrieve background images.
  • output_folder: The folder in which to save the output posters and some process figures.
  • background_folder: The folder in which local background images are saved. If images are not saved locally, they will be downloaded from the remote source.
  • top_n: Arrange the text elements on the top N retrieved images.
  • save_process: Whether to save the process figures (e.g., the saliency map).
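
As a rough sketch, an input text file could be written as shown below. This assumes that input_text_file is a JSON file with the same structure as the input text elements in the examples above; the README does not spell out the file format, so check the example files shipped with the repository first. The file name is hypothetical.

```python
import json
from pathlib import Path

text_elements = {
    "sentences": [
        ["ICASSP 2022", 80],  # [text, font_size]
        ["May 22 - 27, 2022, Singapore", 55],
    ],
    "background_query": "Singapore",
}

# Hypothetical file name; its path would be passed as input_text_file.
input_path = Path("my_input_text_elements.json")
input_path.write_text(
    # ensure_ascii=False keeps non-English text readable in the file.
    json.dumps(text_elements, ensure_ascii=False, indent=4),
    encoding="utf-8",
)

# Read the file back to confirm that it round-trips.
loaded = json.loads(input_path.read_text(encoding="utf-8"))
print(loaded["background_query"], len(loaded["sentences"]))
```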

We also provide the following examples:

  • background image retrieval (from API)
python background_retrieval.py
  • background image retrieval (from source code)
cd background_retriever
python main.py
  • Layout distribution prediction
python layout_distribution_predict.py
  • Layout refinement
python layout_refine.py
  • Download images from Unsplash
python ./background_retriever/unsplash_image_downloader.py
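
To give a feel for what retrieving the top N background images involves, here is a toy sketch that ranks precomputed image features against a query feature. It assumes cosine similarity as the ranking score and uses made-up 3-dimensional vectors; the function name top_n_indices is hypothetical, and the actual logic in ./background_retriever may differ.

```python
import numpy as np


def top_n_indices(query_feat, image_feats, n=5):
    """Rank images by cosine similarity to the query; return the top n.

    Assumption: cosine similarity stands in for whatever score the real
    retriever uses.
    """
    q = query_feat / np.linalg.norm(query_feat)
    feats = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    scores = feats @ q  # one similarity score per image
    order = np.argsort(-scores)[:n]  # indices sorted from best to worst
    return order, scores[order]


# Toy data standing in for real features (4 images, 3 dimensions).
image_feats = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
])
query_feat = np.array([1.0, 0.1, 0.0])

indices, scores = top_n_indices(query_feat, image_feats, n=2)
print(indices, scores)
```

In a real run, the returned indices would be used to look up the matching image URLs, and the image features would come from a file such as wenlan_unsplash_feats.npy.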

Some Output During Process

We also output some intermediate processing files in ./example/outputs:

[Figure: -SdD0KbD7N0, saliency_map_with-smooth]

  • Right image: The original background image.
  • Left image: Saliency map (blue) with smooth region map (red).

[Figure: layout_distribution, saliency_map_with-smooth]

  • Right image: The prediction of layout distribution map.
  • Left image: Saliency map (blue) with predicted layout distribution map (red).

[Figure: initial_layout, refined_layout]

  • Right image: Initial layout map.
  • Left image: Refined layout map.

Blue region: The saliency map.

Green region: The predicted layout distribution map.

Red region: The predicted layout map.

Tips

Something about our background image retrieval

Requirements

python==3.7
pytorch=1.10.0
torchvision=0.11.0
transformers==3.2.0
freetype-py
opencv_contrib_python
requests
jsonlines
tqdm
argparse
pyyaml
easydict
timm
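
The pinned versions above can be compared with the current environment using a small script. This is a sketch only: it assumes the conda package pytorch is visible to Python's package metadata under the name torch, it lists just a few of the packages above, and the helper name check_versions is hypothetical.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7 needs the importlib_metadata backport
    import importlib_metadata as metadata

# Pinned versions from the list above; None means any version is accepted.
# Assumption: the conda package "pytorch" is reported as "torch".
PINNED = {
    "torch": "1.10.0",
    "torchvision": "0.11.0",
    "transformers": "3.2.0",
    "requests": None,
    "tqdm": None,
}


def check_versions(pinned):
    """Map each package name to (installed_version_or_None, is_ok)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Builds may append a local suffix such as "+cu113"; ignore it.
        ok = installed is not None and (
            wanted is None or installed.split("+")[0] == wanted
        )
        report[name] = (installed, ok)
    return report


for name, (installed, ok) in check_versions(PINNED).items():
    print("{:<14} {:<12} {}".format(name, str(installed), "ok" if ok else "CHECK"))
```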

Citation

If you find this paper and repo useful, please cite us in your work:

@inproceedings{DBLP:conf/icassp/JinXSL22,
  author    = {Chuhao Jin and
               Hongteng Xu and
               Ruihua Song and
               Zhiwu Lu},
  title     = {Text2Poster: Laying Out Stylized Texts on Retrieved Images},
  booktitle = {{IEEE} International Conference on Acoustics, Speech and Signal Processing,
               {ICASSP} 2022, Virtual and Singapore, 23-27 May 2022},
  pages     = {4823--4827},
  publisher = {{IEEE}},
  year      = {2022},
  url       = {https://doi.org/10.1109/ICASSP43922.2022.9747465},
  doi       = {10.1109/ICASSP43922.2022.9747465},
  timestamp = {Tue, 07 Jun 2022 17:34:56 +0200},
  biburl    = {https://dblp.org/rec/conf/icassp/JinXSL22.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Contact

Email: jinchuhao@ruc.edu.cn
