NextChat icon indicating copy to clipboard operation
NextChat copied to clipboard

[Feature] 支持多模态对话功能:对话的问题和回复都支持文本、图片、音频

Open easeaico opened this issue 2 years ago • 44 comments

你想要什么功能或者有什么建议? 支持多模态对话功能:对话的问题和回复都支持文本、图片、音频。 随着官方 ChatGPT 多模态的推出,期待未来 ChatGPT-Next-Web 有计划支持多模态的对话输入输出。

有没有可以参考的同类竞品? 官方 ChatGPT 多模态的功能

easeaico avatar Oct 29 '23 08:10 easeaico

Bot detected the issue body's language is not English, translate it automatically.


Title: [Feature] Support multi-modal dialogue function: dialogue questions and replies support text, pictures, and audio

**What features do you want or have any suggestions? ** Support multi-modal dialogue function: dialogue questions and replies support text, pictures, and audio. With the launch of official ChatGPT multi-modality, we look forward to ChatGPT-Next-Web planning to support multi-modal conversation input and output in the future.

**Are there any similar competing products that we can refer to? ** Official ChatGPT multi-modal functionality

Issues-translate-bot avatar Oct 29 '23 08:10 Issues-translate-bot

希望支持多模态对话功能:对话的问题和回复都支持文本、图片、音频。

chenminmin4 avatar Nov 01 '23 03:11 chenminmin4

Bot detected the issue body's language is not English, translate it automatically.


Hope to support multi-modal dialogue function: dialogue questions and replies support text, pictures, and audio.

Issues-translate-bot avatar Nov 01 '23 03:11 Issues-translate-bot

希望支持生成图片,语音,

super999 avatar Nov 07 '23 00:11 super999

Bot detected the issue body's language is not English, translate it automatically.


Hope to support generating pictures, voices,

Issues-translate-bot avatar Nov 07 '23 00:11 Issues-translate-bot

GPT-4 Turbo with vision https://openai.com/blog/new-models-and-developer-products-announced-at-devday

gutenye avatar Nov 07 '23 00:11 gutenye

希望支持多模态对话功能:对话的问题和回复都支持文本、图片、音频。同时支持文件上传,服务器缓存文件功能。

xiaosatai avatar Nov 07 '23 00:11 xiaosatai

Bot detected the issue body's language is not English, translate it automatically.


Hope to support multi-modal dialogue function: dialogue questions and replies support text, pictures, and audio. It also supports file upload and server caching file functions.

Issues-translate-bot avatar Nov 07 '23 00:11 Issues-translate-bot

github许愿池

mountainguan avatar Nov 07 '23 03:11 mountainguan

Bot detected the issue body's language is not English, translate it automatically.


github wishing pool

Issues-translate-bot avatar Nov 07 '23 03:11 Issues-translate-bot

@Yidadaa would you appreciate any work on this (e.g. PoC) that you can leverage? Or are you in the middle of a major refactoring of the related app architecture?

DirkSchlossmacher avatar Nov 09 '23 11:11 DirkSchlossmacher

多模态太重要了, 会带来无限可能.

Avey777 avatar Nov 13 '23 10:11 Avey777

Bot detected the issue body's language is not English, translate it automatically.


Multimodality is so important and will bring endless possibilities.

Issues-translate-bot avatar Nov 13 '23 10:11 Issues-translate-bot

Multi-modal support is desired: support for external file uploads and server caching, and support for conversations and replies in text, images, audio, and video.

ikexue avatar Nov 16 '23 03:11 ikexue

gpt-4-vision-preview 无法显示完整对话,估计是 openai 的 bug。需要传 max_token: 4096

可以参考这里:feat: 支持 gpt-4-vision-preview

xcatliu avatar Nov 17 '23 03:11 xcatliu

Bot detected the issue body's language is not English, translate it automatically.


gpt-4-vision-preview cannot display the complete dialogue, which is probably an openai bug. Need to pass max_token: 4096

You can refer here:

https://github.com/xcatliu/Chatgpt-sxt/commit/64e893BC09BCFA5B62BAD461A488CBDCF1 #Diff-C92AE8BA73287976525D66897 EC86C7DA0A555C871123A0DEABA2F6R170

Issues-translate-bot avatar Nov 17 '23 03:11 Issues-translate-bot

希望支持语音对话的功能。

yinbc avatar Nov 21 '23 06:11 yinbc

Bot detected the issue body's language is not English, translate it automatically.


Hope to support voice conversation function.

Issues-translate-bot avatar Nov 21 '23 06:11 Issues-translate-bot

Same.

zhuozhiyongde avatar Nov 26 '23 13:11 zhuozhiyongde

希望增加dell-a,还有对图像的支持。

jqjhl avatar Nov 26 '23 18:11 jqjhl

Bot detected the issue body's language is not English, translate it automatically.


I hope to add dell-a and support for images.

Issues-translate-bot avatar Nov 26 '23 18:11 Issues-translate-bot

希望增加dell-a,还有对图像的支持。

Model DALL·E must use own storage service for stored a image, because when you generating a image, the image will disappear in 30 minute ~ 2 hours (approx)

H0llyW00dzZ avatar Nov 26 '23 21:11 H0llyW00dzZ

希望增加dell-a,还有对图像的支持。

Model DALL·E must use own storage service for stored a image, because when you generating a image, the image will disappear in 30 minute ~ 2 hours (approx)

Isn't "own storage" what the app already has built-in: an Upstash Redis DB integration - now for chat messages backup, but could be extended for persisting images, at least if downscaled

DirkSchlossmacher avatar Nov 27 '23 07:11 DirkSchlossmacher

希望增加dell-a,还有对图像的支持。

Model DALL·E must use own storage service for stored a image, because when you generating a image, the image will disappear in 30 minute ~ 2 hours (approx)

Isn't "own storage" what the app already has built-in: an Upstash Redis DB integration - now for chat messages backup, but could be extended for persisting images, at least if downscaled

for image

In scenarios where an image is generated by DALL-E, the response is a JSON containing only the image URL, along with the date, time, and revised prompts (specifically for DALL-E 3). Since the image URL has a limited download duration, it would be beneficial if the image URL from DALL-E models could be automatically downloaded and then uploaded to a storage solution that allows for image display in markdown format.

H0llyW00dzZ avatar Nov 27 '23 07:11 H0llyW00dzZ

同类竞品: OpenCat

gutenye avatar Nov 27 '23 08:11 gutenye

Bot detected the issue body's language is not English, translate it automatically.


Competing products: OpenCat

Issues-translate-bot avatar Nov 27 '23 08:11 Issues-translate-bot

希望增加DALL·E模型用于生成图片 对话支持多模态,支持图片 音频 文档

lph66152137 avatar Nov 27 '23 09:11 lph66152137

Bot detected the issue body's language is not English, translate it automatically.


It is hoped that the DALL·E model can be added to generate pictures. The dialogue supports multi-modality and supports pictures, audio documents.

Issues-translate-bot avatar Nov 27 '23 09:11 Issues-translate-bot

已实现dall-e-3画图、gpt4-vision-preview识图、whisper语音转文字、tts文字转语音 https://github.com/vual/ChatGPT-Next-Web-Pro

vual avatar Nov 30 '23 08:11 vual

Bot detected the issue body's language is not English, translate it automatically.


My side supports the gpt4-vision-preview image recognition function, you can check it out. https://github.com/vual/ChatGPT-Next-Web-Pro

Issues-translate-bot avatar Nov 30 '23 08:11 Issues-translate-bot